Ensemble means "a group" or "a collection". Ensemble learning is a machine learning technique that combines a group of weak learning models to reach a final decision. This not only improves predictions but also helps reduce bias and variance.
“A less common strategy for manipulating the search space is to manipulate the input attribute set. Feature subset-based ensemble methods are those that manipulate the input feature set in order to create the ensemble members. The idea is simply to give each classifier a different projection of the training set.”
― Lior Rokach, Ensemble Learning: Pattern Classification Using Ensemble Methods
What are the various types of methods used in Ensemble Learning?
Simple: Max Voting, Averaging, Weighted Averaging
Advanced: Stacking, Bagging & Boosting
We'll walk through each of these types with small, simple illustrations:
Simple Ensemble Methods
1. Max Voting
The final result is the mode (the most frequent value) of all predictions.
| Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Result |
| ------- | ------- | ------- | ------- | ------- | ------ |
| 5       | 4       | 5       | 4       | 4       | 4      |
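A minimal sketch of max voting using the example values above; in practice, scikit-learn's VotingClassifier with voting='hard' does the same thing over fitted models.

```python
from statistics import mode

# Predictions from the five models above for a single example
predictions = [5, 4, 5, 4, 4]

# Max voting: the most frequent prediction (the mode) wins
result = mode(predictions)
print(result)  # 4
```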
2. Average Voting
The final result is the simple average (arithmetic mean) of all predictions.
| Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Result |
| ------- | ------- | ------- | ------- | ------- | ------ |
| 5       | 4       | 5       | 4       | 4       | 4.4    |
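The same example as a tiny sketch; for fitted regression models, scikit-learn's VotingRegressor provides the equivalent behaviour.

```python
# Predictions from the five models above for a single example
predictions = [5, 4, 5, 4, 4]

# Averaging: the simple mean of all predictions
result = sum(predictions) / len(predictions)
print(result)  # 4.4
```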
3. Weighted Averaging Method
Each model's prediction is multiplied by a weight that reflects its importance, and the weighted predictions are then summed.

Formula: Result = (w1 × P1) + (w2 × P2) + (w3 × P3) + (w4 × P4) + (w5 × P5)

where w = weight and P = prediction value.
|             | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Result |
| ----------- | ------- | ------- | ------- | ------- | ------- | ------ |
| Weights     | 0.2     | 0.18    | 0.25    | 0.22    | 0.21    |        |
| Predictions | 5       | 4       | 5       | 4       | 4       | 4.69   |
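A quick sketch reproducing the table above; the weights here are the illustrative values from the example, not tuned numbers.

```python
import numpy as np

# Weights and predictions from the table above
weights = [0.2, 0.18, 0.25, 0.22, 0.21]
predictions = [5, 4, 5, 4, 4]

# Weighted averaging: sum of weight * prediction
result = np.dot(weights, predictions)
print(round(result, 2))  # 4.69
```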
Advanced Ensemble Methods
1. Stacking Method
As the name suggests, models are stacked on top of each other: a set of base (weak) learners makes predictions, and those predictions are fed as input features to a meta-learner, which produces the final prediction.
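A minimal scikit-learn sketch of stacking; the Iris dataset and the particular base learners are just illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Base (weak) learners whose predictions become features for the meta-learner
base_learners = [
    ("dt", DecisionTreeClassifier(max_depth=3)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# Logistic regression sits on top of the stack as the meta-learner
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_train, y_train)
print(stack.score(X_test, y_test))
```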
2. Bootstrap Aggregating Method (Bagging)
In this method, a number of weak learning models are trained in parallel, each on its own bootstrap sample of the original dataset (a random subset drawn with replacement, so the subsets may overlap). For classification, the label that receives the most votes is selected as the final prediction; a code sketch follows the list below.

Bagging Algorithms:
Bagging meta-estimator
Random Forest
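A small scikit-learn sketch of both bagging algorithms listed above; the Iris dataset is just a stand-in.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Bagging meta-estimator: each base tree is trained on its own bootstrap sample
bagging = BaggingClassifier(n_estimators=50, random_state=42)

# Random Forest: bagging of decision trees plus random feature selection at each split
forest = RandomForestClassifier(n_estimators=100, random_state=42)

for name, model in [("Bagging", bagging), ("Random Forest", forest)]:
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {score:.3f}")
```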
3. Boosting Method
Weak learners are trained on the full dataset sequentially. Examples that the first model gets wrong are given higher weights before the next model is trained, so each subsequent model puts more emphasis on those hard cases. The process repeats until the ensemble's bias is sufficiently low, and the final prediction is a weighted vote over all the learners' results; a small sketch follows the list below.

Boosting Algorithms:
AdaBoost
GBM
XGBoost
LightGBM
CatBoost
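A small scikit-learn sketch of two of the boosting algorithms above (AdaBoost and GBM); the breast-cancer dataset is just a placeholder. XGBoost, LightGBM and CatBoost live in their own libraries but expose a similar fit/predict interface.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# AdaBoost: re-weights misclassified samples before training each new weak learner
ada = AdaBoostClassifier(n_estimators=50, random_state=42)

# GBM: each new learner fits the residual errors of the current ensemble
gbm = GradientBoostingClassifier(n_estimators=100, random_state=42)

for name, model in [("AdaBoost", ada), ("GBM", gbm)]:
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {score:.3f}")
```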
Key Takeaways:
Ensemble Learning combines many weak learners to produce a stronger final prediction.
In Stacking, the predictions of base (weak) learners are combined by a meta-learner.
In Bagging, weak learners are trained in parallel on bootstrap samples (random subsets drawn with replacement) of the data.
In Boosting, weak learners are trained sequentially on the same data, with higher weights given to the examples previous learners got wrong.
. . .
Connect with me on LinkedIn.
I'm open to entry-level Data Scientist/Data Analyst roles. Please DM me on LinkedIn for my resume if you have any openings coming up 🤗 🙏