datarekha
Machine Learning | Medium | Asked at Google, Amazon, Meta, Microsoft, Stripe

What is the difference between bagging and boosting, and what error component does each primarily reduce?

The short answer

Bagging trains many models independently, in parallel, on bootstrap samples and averages their predictions, primarily reducing variance. Boosting trains models sequentially, each correcting the errors of the ensemble built so far, primarily reducing bias.

How to think about it

For squared-error loss, the bias-variance decomposition splits the expected generalisation error into three terms:

Error = Bias² + Variance + Irreducible noise

Bagging and boosting attack different terms.
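
The same decomposition can be written out term by term. This is a sketch under the usual textbook assumptions, none of which are stated in the text above: the target is y = f(x) + ε with zero-mean noise of variance σ², the fitted model is \hat{f}, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{Irreducible noise}}
```

Bias measures how far the average fitted model is from the truth, variance measures how much the fitted model changes from one training set to another, and the noise term cannot be reduced by any model.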

Bagging (Bootstrap Aggregating)

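Each base learner is trained on its own bootstrap sample (a random draw with replacement) of the training set. Predictions are combined by majority vote (classification) or averaging (regression). Because the base learners do not depend on one another, they can be trained in parallel. Their errors are only partly correlated (every bootstrap sample comes from the same training set, so they are not fully independent), and averaging cancels the uncorrelated part, which reduces variance while leaving bias roughly unchanged. This is why bagging works best with low-bias, high-variance learners such as unpruned decision trees.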
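
The procedure above can be sketched by hand in a few lines. This is a minimal illustration, not how scikit-learn implements bagging: the helper names `fit_bagged_trees` and `predict_majority`, the synthetic dataset, and the choice of 25 trees are all assumptions made for the example, and the vote assumes binary 0/1 labels.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_bagged_trees(X, y, n_estimators=25, seed=0):
    """Fit one unpruned tree per bootstrap sample (rows drawn with replacement)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)  # bootstrap sample of row indices
        tree = DecisionTreeClassifier(random_state=int(rng.integers(1_000_000)))
        models.append(tree.fit(X[idx], y[idx]))
    return models


def predict_majority(models, X):
    """Combine binary 0/1 predictions by majority vote."""
    votes = np.stack([m.predict(X) for m in models])  # (n_estimators, n_samples)
    return (votes.mean(axis=0) >= 0.5).astype(int)


# Synthetic binary classification data (hypothetical, for illustration only)
X, y = make_classification(n_samples=600, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = fit_bagged_trees(X_train, y_train)
single_acc = (models[0].predict(X_test) == y_test).mean()
bagged_acc = (predict_majority(models, X_test) == y_test).mean()
print(f"single tree: {single_acc:.3f}  bagged: {bagged_acc:.3f}")
```

An odd number of trees avoids ties in the vote. On most datasets the bagged accuracy is at least as good as that of a single tree, but the size of the gap depends on the data.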

Boosting

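Models are trained sequentially. Each new learner focuses on what the ensemble so far still gets wrong: gradient boosting fits the residuals (more generally, the negative gradient of the loss), while AdaBoost re-weights the examples that were misclassified. Step by step, the ensemble corrects its systematic errors, reducing bias. The base learners are usually weak, high-bias models such as shallow trees. Examples: AdaBoost, Gradient Boosting, XGBoost.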
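
The residual-fitting idea can be sketched for regression with squared-error loss. This is a simplified illustration rather than the algorithm used by any particular library: the toy sine data, the depth-2 trees, the 50 rounds, and the learning rate of 0.1 are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy 1-D regression problem (hypothetical data)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=300)

learning_rate = 0.1
n_rounds = 50

prediction = np.full_like(y, y.mean())  # round 0: a constant model
trees = []
train_mse = [np.mean((y - prediction) ** 2)]

for _ in range(n_rounds):
    residuals = y - prediction  # what the ensemble still gets wrong
    tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, residuals)
    prediction += learning_rate * tree.predict(X)  # take a small corrective step
    trees.append(tree)
    train_mse.append(np.mean((y - prediction) ** 2))

print(f"train MSE: {train_mse[0]:.3f} -> {train_mse[-1]:.3f}")
```

Each shallow tree is too simple to fit the sine curve on its own, but the sum of many small corrections can, which is the bias reduction described above. The training error shown here says nothing about overfitting; in practice the number of rounds and the learning rate are tuned on held-out data.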

[Figure: two panels. Bagging (parallel): Data 1, Data 2, Data 3 -> Model 1, Model 2, Model 3 -> Average / Vote. Boosting (sequential): Model 1 -> Residuals -> Model 2 -> Model 3.]
Bagging trains models in parallel on bootstrap samples; boosting trains sequentially on residuals.
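from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

# Bagging: deep (low-bias, high-variance) trees, each fit on its own bootstrap sample
bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(max_depth=None),  # named base_estimator before scikit-learn 1.2
    n_estimators=100,
    bootstrap=True,  # sample rows with replacement
    n_jobs=-1,       # fit the trees in parallel on all cores
)

# Boosting: shallow (high-bias) trees added one after another
boost = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)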
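
As a usage sketch, the two ensembles can be compared with cross-validation. The synthetic dataset, the reduced `n_estimators=50`, and the fixed `random_state` values are assumptions made to keep the example small and repeatable. The base tree is passed positionally so the snippet does not depend on the keyword name.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (hypothetical, for illustration only)
X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)

models = {
    "bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0),
    "boosting": GradientBoostingClassifier(
        n_estimators=50, learning_rate=0.1, max_depth=3, random_state=0
    ),
}

# Mean 5-fold cross-validated accuracy for each ensemble
scores = {name: cross_val_score(model, X, y, cv=5).mean() for name, model in models.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Which ensemble scores higher depends on the dataset and the hyperparameters, so a comparison like this is a starting point rather than a verdict.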