What is ensemble learning?
It is a method of combining multiple individually trained models in order to improve generalization performance.
The three main sources of error in trained models are noise, bias, and variance.
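Ensemble learning aims to reduce the bias and variance components (noise is inherent in the data and cannot be removed by modeling) and thereby improve generalization performance.
This article explains the following methods: bagging, boosting, and stacking.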
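As a reference for how the three error sources relate, the standard decomposition of expected squared error is written out below. The notation is an assumption introduced here, not taken from the article: the data is modeled as y = f(x) + ε with zero-mean noise of variance σ², \hat{f} is the trained model, and the expectation is over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```

The noise term does not depend on the model, which is why combining models can only act on the first two terms.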
Bagging
The name Bagging is a contraction of Bootstrap AGGregatING.
Bootstrapping creates multiple datasets by randomly sampling from the original dataset, with replacement. A separate model is then trained on each of these datasets.
1. Randomly draw instances from the original dataset, with replacement, to create n bootstrap samples that differ slightly from each other.
2. Train one model on each bootstrap sample, in parallel, to obtain n models. The data that the bootstrap did not select is used as test data for the voting in step 3. This unselected data is called Out-Of-Bag (OOB). When each bootstrap sample is the same size as the original dataset, the OOB data is about 36.8% of it, because the probability that a given instance is never drawn approaches 1/e.
3. Run inference on the OOB data with each of the resulting n models, and take a majority vote over the outputs.
4. Fuse the models together to create a single learner.
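The four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the toy 1-D dataset, the threshold "stump" used as the base model, and all function names are hypothetical choices made for this example. The models are fitted in a simple loop here, but each fit is independent of the others, which is what makes parallel training possible.

```python
import random
from collections import Counter

rng = random.Random(0)

# Toy 1-D dataset (hypothetical): label is 1 when x > 5, with 10% of labels flipped.
data = []
for _ in range(100):
    x = rng.uniform(0, 10)
    y = 1 if x > 5 else 0
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

def stump_predict(stump, x):
    """A stump is (threshold, polarity); polarity -1 flips which side predicts 1."""
    threshold, polarity = stump
    return 1 if polarity * (x - threshold) >= 0 else 0

def fit_stump(points):
    """Pick the threshold and polarity with the fewest training errors."""
    best = None
    for threshold, _ in points:
        for polarity in (1, -1):
            stump = (threshold, polarity)
            errors = sum(stump_predict(stump, x) != y for x, y in points)
            if best is None or errors < best[0]:
                best = (errors, stump)
    return best[1]

def majority_vote(models, x):
    # Ties go to the label seen first among the voters.
    return Counter(stump_predict(m, x) for m in models).most_common(1)[0][0]

n = len(data)
n_models = 25
models, oob_sets = [], []
for _ in range(n_models):
    # Step 1: bootstrap sample, drawn with replacement, same size as the dataset.
    idx = [rng.randrange(n) for _ in range(n)]
    # Step 2: train one model per bootstrap sample and remember its OOB indices.
    models.append(fit_stump([data[i] for i in idx]))
    oob_sets.append(set(range(n)) - set(idx))

oob_fraction = sum(len(s) for s in oob_sets) / (n_models * n)

# Step 3: each instance is scored only by the models that never saw it.
correct = scored = 0
for i, (x, y) in enumerate(data):
    voters = [m for m, oob in zip(models, oob_sets) if i in oob]
    if voters:
        scored += 1
        correct += majority_vote(voters, x) == y
oob_accuracy = correct / scored

# Step 4: the fused learner is the majority vote over all models.
def bagged_predict(x):
    return majority_vote(models, x)

print(f"OOB fraction: {oob_fraction:.3f}, OOB accuracy: {oob_accuracy:.3f}")
```

The printed OOB fraction should land close to the 1/e figure, and the OOB accuracy gives an estimate of generalization performance without setting aside a separate test set.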
Because the models in bagging can be trained in parallel, it tends to be computationally fast. It is also less prone to overfitting, but in some cases its accuracy is lower than that of boosting, which is discussed later.
A representative bagging algorithm is Random Forest.
Boosting
Whereas bagging varies the training data through bootstrap sampling, boosting varies it gradually by weighting the data, so that each new learner is trained on a reweighted version of the dataset.
1. Extract data from the original dataset and train a learner.
2. Make predictions using that learner.
3. Train the next learner, giving priority to the data that was predicted incorrectly.
This process is repeated, and the final learner is created by combining the learners obtained along the way, for example as a weighted average.
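The loop described above can be sketched with an AdaBoost-style update. The article does not name a specific boosting algorithm, so AdaBoost is an assumption chosen for illustration, as are the toy dataset, the threshold stump base learner, and the function names. The data is labeled +1 inside an interval and -1 outside, which no single threshold can separate.

```python
import math

# Toy 1-D dataset (hypothetical): label +1 inside the interval (3, 7), -1 outside.
xs = [i * 0.25 for i in range(41)]
ys = [1 if 3 < x < 7 else -1 for x in xs]

def stump_predict(stump, x):
    """A stump is (threshold, polarity); it predicts polarity at or above the threshold."""
    threshold, polarity = stump
    return polarity if x >= threshold else -polarity

def fit_weighted_stump(weights):
    """Return (weighted error, stump) for the stump with the lowest weighted error."""
    best = None
    for threshold in xs:
        for polarity in (1, -1):
            stump = (threshold, polarity)
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(stump, x) != y)
            if best is None or err < best[0]:
                best = (err, stump)
    return best

def boost(n_rounds):
    weights = [1 / len(xs)] * len(xs)  # start with uniform weights
    ensemble = []
    for _ in range(n_rounds):
        # Steps 1 and 2: train a learner on the weighted data and score it.
        err, stump = fit_weighted_stump(weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # the learner's say in the final vote
        ensemble.append((alpha, stump))
        # Step 3: raise the weight of misclassified points, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def ensemble_predict(ensemble, x):
    # Final learner: sign of the weighted sum of the individual predictions.
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

def accuracy(ensemble):
    return sum(ensemble_predict(ensemble, x) == y for x, y in zip(xs, ys)) / len(xs)

print("1 round :", round(accuracy(boost(1)), 3))
print("3 rounds:", round(accuracy(boost(3)), 3))
```

Each round depends on the weights produced by the previous one, so the rounds cannot run in parallel the way bagging's models can.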
Because boosting trains its learners sequentially, it requires a longer training time but may perform better than bagging. However, overfitting may occur if the number of learners is increased too much.
Stacking
As the name suggests, this is a method of stacking learners.
The process is divided into two levels.
In Level 1, multiple algorithms are trained on the dataset to create several models.
Any algorithm can be used here (Random Forest, XGBoost, LightGBM, etc.).
The next step is to make predictions using the learners you have created.
In Level 2, the predicted values of each learner are used as "features" to train a final learner.
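The two levels can be sketched as follows. Everything concrete here is an assumption made for illustration: the toy 2-D dataset, the two Level 1 learners (a threshold stump on the first feature and a k-nearest-neighbor vote on the second), and the Level 2 learner, which is a simple lookup table from Level 1 predictions to the most common true label. The sketch also trains Level 2 on a separate holdout split so that it does not see predictions made on Level 1's own training data. Library implementations often use cross-validated predictions for this instead.

```python
import random
from collections import Counter, defaultdict

rng = random.Random(0)

def make_data(n):
    """Toy 2-D data (hypothetical): label is 1 only when both features exceed 3."""
    rows = []
    for _ in range(n):
        x1, x2 = rng.uniform(0, 10), rng.uniform(0, 10)
        rows.append((x1, x2, 1 if x1 > 3 and x2 > 3 else 0))
    return rows

level1_train, level2_train, test = make_data(300), make_data(300), make_data(300)

# Level 1, learner A: a threshold stump that only looks at x1.
def fit_stump(rows):
    best = None
    for t, _, _ in rows:
        errors = sum((1 if x1 >= t else 0) != y for x1, _, y in rows)
        if best is None or errors < best[0]:
            best = (errors, t)
    return best[1]

threshold = fit_stump(level1_train)

def learner_a(x1, x2):
    return 1 if x1 >= threshold else 0

# Level 1, learner B: a k-nearest-neighbor vote that only looks at x2.
def learner_b(x1, x2, k=15):
    nearest = sorted(level1_train, key=lambda r: abs(r[1] - x2))[:k]
    return 1 if 2 * sum(r[2] for r in nearest) > k else 0

def level1_features(row):
    """The Level 1 predictions become the feature vector for Level 2."""
    x1, x2, _ = row
    return (learner_a(x1, x2), learner_b(x1, x2))

# Level 2: learn, for each combination of Level 1 predictions, the most common label.
votes = defaultdict(Counter)
for row in level2_train:
    votes[level1_features(row)][row[2]] += 1
meta = {feats: counts.most_common(1)[0][0] for feats, counts in votes.items()}

def stacked_predict(row):
    feats = level1_features(row)
    return meta.get(feats, feats[0])  # fall back to learner A for unseen combinations

def accuracy(predict, rows):
    return sum(predict(r) == r[2] for r in rows) / len(rows)

acc_a = accuracy(lambda r: learner_a(r[0], r[1]), test)
acc_b = accuracy(lambda r: learner_b(r[0], r[1]), test)
acc_stacked = accuracy(stacked_predict, test)
print(f"A: {acc_a:.3f}  B: {acc_b:.3f}  stacked: {acc_stacked:.3f}")
```

Each Level 1 learner sees only part of the picture, so neither does well alone. The Level 2 learner picks up how to combine their outputs, which is the point of stacking.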
Ensemble Learning in VARISTA
Each of the ensemble learning methods can also be performed in VARISTA.
For more information, please see Learning Template.