A tutorial on creating a contract prediction model

PUBLISHED
2021.02.03 (Wed)
CATEGORY
Tutorials
READING TIME
11 min 41 sec

This article has been translated from Japanese into English using DeepL.

This article explains how to use VARISTA to build a model to predict telemarketing success.

About the Dataset

In this tutorial, we will use a dataset from a Portuguese bank's telemarketing campaign from May 2008 to June 2013, which is available from the UCI Machine Learning Repository.
You may have seen this dataset before as it is often used in tutorials.

This dataset records whether each customer contacted in the telemarketing campaign actually signed up for a term deposit.
It contains 41,188 customer records with 21 columns (20 features plus the target column "y"), as shown in Table 1 below.
Table 1: Dataset description

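Download the dataset from the Bank Marketing Data Set page of the UCI Machine Learning Repository.
On that page, go to the Data Folder and download the bank-additional.zip file.
The archive contains bank-additional-full.csv, the file used in this tutorial.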
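
Before uploading, it can help to confirm that the download looks as expected. The following is a minimal sketch using only Python's standard library. It assumes the CSV is semicolon-delimited with a header row, as the UCI copy of bank-additional-full.csv is. The helper name `summarize_bank_csv` and the inline sample rows are made up for this example.

```python
import csv
import io


def summarize_bank_csv(handle, delimiter=";"):
    """Return (row_count, column_names) for a delimited file with a header row."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)                # first line holds the column names
    row_count = sum(1 for _ in reader)   # remaining lines are data rows
    return row_count, header


# Tiny inline sample in the same layout, so the sketch runs without the download.
sample = io.StringIO(
    '"age";"job";"marital";"y"\n'
    '56;"housemaid";"married";"no"\n'
    '41;"blue-collar";"married";"yes"\n'
)
rows, columns = summarize_bank_csv(sample)
print(rows, columns)
```

For the real file, pass `open("bank-additional-full.csv", newline="")` instead of the sample. Based on the figures quoted in this article, it should report 41,188 rows and 21 columns.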

Uploading data to VARISTA

Create a new project and upload the data you have downloaded.
Once the data has been analyzed, set the prediction column to "y".

Understanding the data

Use VARISTA's Visualize function to review your data.
Select the data you have uploaded and choose Visualize.
The Visualize view shows the age distribution, occupation, marital status, and other information.
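
The same kind of distribution can be approximated outside VARISTA with a simple frequency count. This is only a sketch: the function name `value_distribution` and the three sample records are hypothetical, and the column names follow the dataset's header.

```python
from collections import Counter


def value_distribution(records, column):
    """Count how often each value of `column` occurs, most common first."""
    return Counter(record[column] for record in records).most_common()


# Hypothetical records for illustration; real rows come from the CSV.
records = [
    {"age": "56", "job": "housemaid", "marital": "married"},
    {"age": "41", "job": "blue-collar", "marital": "married"},
    {"age": "25", "job": "services", "marital": "single"},
]
marital_counts = value_distribution(records, "marital")
print(marital_counts)
```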

Selecting Correlation shows the correlation between the target variable and each of the features.
The housing and loan features do not seem to have much effect on whether a contract is signed.
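
How VARISTA computes its correlation values is not described here. As a rough illustration of the idea, the sketch below computes a Pearson correlation between a yes/no feature and the yes/no target, skipping any other values such as "unknown". The function names and the four sample records are assumptions made for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def yes_no_correlation(records, feature, target="y"):
    """Correlate a yes/no feature with the yes/no target, ignoring other values."""
    usable = [r for r in records if r[feature] in ("yes", "no")]
    xs = [1.0 if r[feature] == "yes" else 0.0 for r in usable]
    ys = [1.0 if r[target] == "yes" else 0.0 for r in usable]
    return pearson(xs, ys)


# Hypothetical records in which housing is unrelated to the outcome.
records = [
    {"housing": "yes", "y": "yes"},
    {"housing": "no", "y": "yes"},
    {"housing": "yes", "y": "no"},
    {"housing": "no", "y": "no"},
    {"housing": "unknown", "y": "no"},
]
housing_corr = yes_no_correlation(records, "housing")
print(housing_corr)
```

A value near zero, as in this made-up sample, corresponds to a feature that carries little linear signal about the target.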

Looking at duration, longer calls appear to be associated with a higher likelihood of a contract.
![blog tutorial visualize 04](//images.ctfassets.net/8qlu80sl3ynp/37ExVyI6RRmXFxe3lk7EFu/cd9add21ffee7ba1bd8660bdc67542c4/blog_tutorial_visualize_04.png)
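
The relationship between call duration and the outcome can be checked numerically by grouping calls into duration buckets and comparing the share of contracts in each. This is a sketch under assumptions: the bucket edges (in seconds) and the sample records are invented for illustration, and `contract_rate_by_duration` is a hypothetical helper.

```python
from bisect import bisect_right


def contract_rate_by_duration(records, edges=(0, 120, 300, 600)):
    """Share of "yes" outcomes per duration bucket; each edge is a lower bound."""
    totals = [0] * len(edges)
    signed = [0] * len(edges)
    for record in records:
        # Index of the last edge that is <= duration (clamped to the first bucket).
        idx = max(bisect_right(edges, int(record["duration"])) - 1, 0)
        totals[idx] += 1
        if record["y"] == "yes":
            signed[idx] += 1
    # Empty buckets are left out rather than reported as 0.
    return {edges[i]: signed[i] / totals[i] for i in range(len(edges)) if totals[i]}


# Hypothetical calls: short ones fail, longer ones succeed more often.
records = [
    {"duration": "60", "y": "no"},
    {"duration": "90", "y": "no"},
    {"duration": "200", "y": "yes"},
    {"duration": "250", "y": "no"},
    {"duration": "700", "y": "yes"},
]
rates = contract_rate_by_duration(records)
print(rates)
```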

As you can see, VARISTA's visualize function allows you to immediately visualize and check the distribution and correlation of your data.

Creating a Predictive Model

Next, we create a predictive model with AutoML, which is included in VARISTA.
This is a binary classification problem with two classes, "contract" and "no contract".
VARISTA automatically determines whether to create a regression or a classification (binary or multi-class) model.
Select a model from the left menu and click on the "Create AI Model" button.
Make sure that bank-additional-full.csv is selected and the column to be predicted is set to "y", then click the "Start Learning" button.
VARISTA will automatically create the model using the AutoML function.
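
The rule VARISTA uses to choose between regression and classification is not documented in this article. To illustrate the general idea only, here is a simple heuristic that inspects the target column. The function `infer_task_type` and its `max_classes` threshold are assumptions for this sketch, not VARISTA's actual logic.

```python
def infer_task_type(target_values, max_classes=20):
    """Guess the task type from the target column (illustrative heuristic only)."""
    distinct = set(target_values)
    if not distinct:
        raise ValueError("target column is empty")
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in distinct
    )
    # Many distinct numeric values suggest a continuous target.
    if all_numeric and len(distinct) > max_classes:
        return "regression"
    if len(distinct) == 2:
        return "binary classification"
    return "multi-class classification"


task = infer_task_type(["yes", "no", "no", "yes"])
print(task)
```

For the "y" column of this dataset, which holds only "yes" and "no", such a heuristic would select binary classification.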

After a while, the training will be completed.
When the training is complete, the model details will be displayed.
Let's check each panel.
VARISTA's own score for the model is shown as 65.
The panel shows the overall score together with the share of correct predictions among customers predicted to sign a contract and among customers predicted not to sign one.
VARISTA performs cross-validation when it generates the model, and the results are shown here.
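
The article does not say which cross-validation scheme VARISTA uses. As a general illustration, the sketch below produces plain k-fold splits, in which every sample is used for validation exactly once. The function name, the choice of k=5, and the fixed seed are assumptions for this example.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, valid_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, valid in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, valid


splits = list(k_fold_indices(10, k=5))
for train, valid in splits:
    print(len(train), len(valid))
```

A model would be trained on each `train` part and scored on the matching `valid` part, and the k scores averaged.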

The feature with the greatest impact on whether a customer signs up for a term deposit is duration.

The share of data held out for validation and the confusion matrix are also displayed.
This model uses 20% of the total data as test data (the percentage can be changed in the training settings).
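
A 20% hold-out split of this kind can be sketched as follows. This is a plain random split written for illustration; whether VARISTA shuffles or stratifies the data is not stated in the article, and the function name and seed are assumptions.

```python
import random


def train_test_split_indices(n_samples, test_ratio=0.2, seed=0):
    """Shuffle row indices and hold out `test_ratio` of them as test data."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_ratio)
    return indices[n_test:], indices[:n_test]


# 41,188 is the row count given for this dataset.
train_idx, test_idx = train_test_split_indices(41188, test_ratio=0.2)
print(len(train_idx), len(test_idx))
```

With 41,188 rows, a 20% split sets aside roughly 8,200 rows for testing.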

The confusion matrix shows 218 cases in which the model predicted "no contract" for a customer who actually signed one (false negatives). This is about 23.5% of the customers in the test data who actually signed a contract, which is a relatively large share.
The likely reason is that the data is imbalanced: far fewer customers signed a contract than did not.
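
The 23.5% figure can be read as a false negative rate: missed contracts divided by all customers who actually signed. The sketch below shows that calculation. The count of 218 false negatives comes from the confusion matrix described above, while the true positive count of 710 is an assumption chosen only so that the result matches the reported percentage.

```python
def false_negative_rate(true_positives, false_negatives):
    """Share of actual positives that the model predicted as negative."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("no actual positives in the data")
    return false_negatives / actual_positives


false_negatives = 218  # from the confusion matrix described in the article
true_positives = 710   # ASSUMED value, picked to be consistent with about 23.5%
rate = false_negative_rate(true_positives, false_negatives)
print(f"{rate:.1%}")
```

When one class is much rarer than the other, overall accuracy can look good while this rate stays high, so it is worth checking separately.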

The above is the process of creating a model using VARISTA.
We encourage you to download the dataset and try it out with VARISTA.


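VARISTA is a new platform that lets you develop and manage machine learning models efficiently without writing code.
If you have data, you can get started right away, so please feel free to contact us.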
Made by VARISTA Team.
© COLLESTA, Inc. 2021. All rights reserved.