2021.02.26 (Fri)

A template stores the settings that are used when a model is trained.
VARISTA provides nine built-in templates: three for classification, three for multiclass classification, and three for regression.

In each project, you can create a new template or duplicate an existing one. To use a template in another project, duplicate it into that project.
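As a rough mental model, a template can be pictured as a plain record of settings that is copied when it is duplicated. The sketch below is illustrative only: the `Template` class, its field names, its default values, and the `duplicate` helper are all hypothetical and are not VARISTA's actual data model or API. The fields simply mirror settings that this page describes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Template:
    # Hypothetical fields and defaults, named after settings described on this page.
    name: str
    project: str
    objective: str = "classification"  # or "regression"
    split_size: float = 0.2
    shuffle: bool = True
    random_seed: int = 42
    learning_type: str = "single"      # "single", "auto_selection", or "ensemble"
    algorithm: str = "XGBoost"


def duplicate(template, destination_project, new_name=None):
    """Return an independent copy of the template placed in another project."""
    return replace(
        template,
        project=destination_project,
        name=new_name or f"{template.name} (copy)",
    )


base = Template(name="XGBoost Classifier", project="project-a")
copy = duplicate(base, "project-b")
print(copy)
```

Because the copy is a separate record, editing it later would not affect the original template.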

Learning with Templates

VARISTA uses templates for model training.

![VARISTA Documents Template Management 0](//images.ctfassets.net/8qlu80sl3ynp/5dASYaCWzBy6g0WJ282wSk/c6b5f461166a2d4f263578af130044fb/VARISTA_Documents_Template_Management_0.png)

Example of creating a classification model

  • Basic Ensemble Classifier
  • XGBoost Classifier

Example of creating a regression model

  • Autopilot for Regression
  • XGBoost Regressor
  • Basic Ensemble Regressor

Managing Templates

To create or edit a template, open the template management screen from within any project as follows:

Modeling › Templates (tab)

![VARISTA Documents Template Management 1](//images.ctfassets.net/8qlu80sl3ynp/GSpFkUIx7hXmq2loGxkIW/1cd4c52b826af073cc905581e827be46/VARISTA_Documents_Template_Management_1.png)

Create a new template

On the template management page, select New to create a new template.
![VARISTA Documents Template Management 2](//images.ctfassets.net/8qlu80sl3ynp/2hEYUKRxS0YtvxlVvTlAlD/7013b7c3da4290a6099579e9d376d5c7/VARISTA_Documents_Template_Management_2.png)

Basic Settings

  • Objective: Classification / Regression
  • Data preprocessing:
    • Handling of missing values: Delete / Simple imputation (imputation by mean or mode)
      Delete: Columns that contain missing values are deleted.
      Simple imputation: Numeric columns are filled with the mean, and categorical columns are filled with the mode.
    • Conversion of categorical columns: Label Encoding / One-Hot Encoding
      Categorical columns are converted using the selected method.
  • Split settings for validation data:
    • Split Size: (Number) Specifies the ratio used to split the data into training data and test data.
    • Shuffle: (Boolean) Specifies whether to shuffle the data.
    • Random seed: (Number) The random state (seed) used to shuffle the data.
      The maximum value in VARISTA is 999999999.
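
The basic settings above can be illustrated with a small standard-library sketch. It is not VARISTA's implementation: all function names are hypothetical, missing values are assumed to be represented as `None`, and Split Size is assumed to be the fraction of rows held out for testing (the page only says it is a ratio).

```python
import random
from statistics import mean, mode


def impute_column(values):
    """Fill None entries: mean for numeric columns, mode for categorical ones."""
    present = [v for v in values if v is not None]
    numeric = all(isinstance(v, (int, float)) for v in present)
    fill = mean(present) if numeric else mode(present)
    return [fill if v is None else v for v in values]


def label_encode(values):
    """Map each category to an integer code, in sorted category order."""
    codes = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def one_hot_encode(values):
    """Return one 0/1 indicator list per row, with one position per category."""
    cats = sorted(set(values))
    return [[1 if v == cat else 0 for cat in cats] for v in values]


def split_rows(rows, split_size=0.2, shuffle=True, random_seed=42):
    """Return (train, test); split_size is assumed to be the held-out fraction."""
    rows = list(rows)
    if shuffle:
        # A fixed seed makes the shuffle, and therefore the split, reproducible.
        random.Random(random_seed).shuffle(rows)
    n_test = round(len(rows) * split_size)
    return rows[n_test:], rows[:n_test]


colors = impute_column(["red", None, "red", "blue"])
print(label_encode(colors))
print(one_hot_encode(colors))
print(split_rows(range(10), split_size=0.2, random_seed=7))
```

Label encoding produces one integer column, while one-hot encoding produces one column per category, so the latter grows wider as the number of categories increases.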

Model Settings

![VARISTA Documents Template Management 3](//images.ctfassets.net/8qlu80sl3ynp/6xaqcisYu0d8EwgEF0GObp/d8bd51a33e2455b54af5df6ab12b6a50/VARISTA_Documents_Template_Management_3.png)

  • Learning Type:
    • Single: Trains the model with a single algorithm.
    • Auto Selection: Trains models using VARISTA's AutoML feature.
      Up to 32 models can be used for ensemble training, but depending on the data size, a memory error may occur during training.
    • Ensemble: Trains the model with ensemble learning, which combines multiple models.
      The currently supported ensemble learning methods are Stacking and Voting.
      For Boosting and Bagging, use an algorithm such as XGBoost with Single.
      Up to 32 models can be used for ensemble training, but depending on the data size, a memory error may occur during training.
  • Algorithm: Specifies the algorithm used to create the model. The available algorithms are as follows:
    • XGBoost
    • LightGBM
    • CatBoost
    • Linear
    • Ridge
    • Ridge CV (CV: Cross Validation)
    • AdaBoost
    • Extra Tree
    • Gradient Boosting
    • Random Forest
    • Hist Gradient Boosting
  • Parameters: Specifies the hyperparameter tuning settings.
    If Auto Tune is not specified, you can set a value for every parameter. If Auto Tune is specified, hyperparameters are searched automatically, and you specify the range to search.
    • Auto Tune: Specifies the algorithm for automatic hyperparameter search.
      • Grid Search Optimization: Searches for hyperparameters using grid search.
      • Randomized Search Optimization: Searches for hyperparameters using randomized search.
      • TPE Optimization with Hyperopt[1]: Searches for hyperparameters using Hyperopt.
      • TPE Optimization with Optuna[2]: Searches for hyperparameters using Optuna.
    • AutoTune rounds: Specifies the number of trials for automatic hyperparameter search.
    • AutoTune CV Settings: Sets the cross-validation settings for automatic hyperparameter search.
      n_splits: (Number) Number of cross-validation splits (1-10)
      Shuffle: (Boolean) Whether to shuffle the data during cross-validation (True/False)
      random_state: (Number) The random_state used when shuffling the data (Maximum value: 9999999999)
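
To make the difference between grid search and randomized search concrete, here is a minimal standard-library sketch that scores each candidate with k-fold cross-validation. It is an illustration under stated assumptions, not VARISTA's tuning code: the function names, the `rounds` and `seed` arguments, and the toy one-feature ridge model with a single `alpha` hyperparameter are all hypothetical. The `n_splits`, `shuffle`, and `random_state` arguments echo the CV settings listed above. TPE-based search (Hyperopt, Optuna) is not shown.

```python
import itertools
import random


def kfold_indices(n_rows, n_splits=5, shuffle=True, random_state=0):
    """Yield (train_idx, valid_idx) pairs; needs n_splits >= 2 so training rows exist."""
    idx = list(range(n_rows))
    if shuffle:
        random.Random(random_state).shuffle(idx)
    folds = [idx[i::n_splits] for i in range(n_splits)]
    for k in range(n_splits):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, folds[k]


def cv_score(params, xs, ys, **cv):
    """Mean validation MSE of a one-feature ridge fit without intercept (lower is better)."""
    errors = []
    for train, valid in kfold_indices(len(xs), **cv):
        sxy = sum(xs[i] * ys[i] for i in train)
        sxx = sum(xs[i] ** 2 for i in train)
        slope = sxy / (sxx + params["alpha"])
        errors.append(sum((ys[i] - slope * xs[i]) ** 2 for i in valid) / len(valid))
    return sum(errors) / len(errors)


def grid_search(space, xs, ys, **cv):
    """Evaluate every combination of values in the search space."""
    names = list(space)
    combos = [dict(zip(names, vals)) for vals in itertools.product(*space.values())]
    return min(combos, key=lambda p: cv_score(p, xs, ys, **cv))


def randomized_search(space, xs, ys, rounds=5, seed=0, **cv):
    """Evaluate only `rounds` randomly sampled combinations."""
    rng = random.Random(seed)
    combos = [{n: rng.choice(vals) for n, vals in space.items()} for _ in range(rounds)]
    return min(combos, key=lambda p: cv_score(p, xs, ys, **cv))


xs = [float(x) for x in range(1, 21)]
ys = [2.0 * x for x in xs]
space = {"alpha": [0.0, 1.0, 10.0, 100.0]}
cv = {"n_splits": 5, "shuffle": True, "random_state": 0}
print(grid_search(space, xs, ys, **cv))
print(randomized_search(space, xs, ys, rounds=3, seed=1, **cv))
```

Grid search costs one cross-validated evaluation per combination, so it grows quickly with the number of parameters. Randomized search caps the cost at a fixed number of trials, at the price of possibly missing the best combination.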

Duplicate template

Select the meatball menu displayed to the right of any template in the template list, and select Duplicate.
![VARISTA Documents Template Management 4](//images.ctfassets.net/8qlu80sl3ynp/6gDcaVLWVLcd7flxiUBtFw/34cb9a98f3b457d8248f297283b22c78/VARISTA_Documents_Template_Management_4.png)
You can duplicate it to any project by selecting the destination project.
![VARISTA Documents Template Management 5](//images.ctfassets.net/8qlu80sl3ynp/5tTZjkMEI8RGILfep1YIpR/da17c139d2a04aac5e4e20d84b09ce5d/VARISTA_Documents_Template_Management_5.png)

Edit template

To edit a template, select it from the template list, change the values, and select Save to save the changes.

Delete template

To delete a template, open its meatball menu in the template list and select Delete, or open the template details and select Delete Template at the bottom of the screen.
Note that a deleted template cannot be restored.

  1. Hyperopt
    https://github.com/hyperopt/hyperopt

  2. Optuna
    An open source framework for automatic hyperparameter optimization, developed by Preferred Networks, Inc.
    https://www.preferred.jp/ja/projects/optuna/

Made by VARISTA Team.
© COLLESTA, Inc. 2021. All rights reserved.