What is the difference between bagging and boosting, and what error component does each primarily reduce?
Bagging trains many independent models on bootstrap samples in parallel and averages their predictions, primarily reducing variance. Boosting trains models sequentially, each correcting the errors of its predecessor, primarily reducing bias.
How to think about it
For squared loss, the bias-variance decomposition splits the expected generalisation error into three terms:
Error = Bias² + Variance + Irreducible noise
Bagging and boosting attack different terms.
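The decomposition can be checked numerically. The sketch below is only an illustration under assumed settings: the toy predictor (a shrunken sample mean), the helper name decompose, and the values of mu, sigma and n are hypothetical choices, not part of the original answer. It simulates many training sets, estimates each term, and compares their sum with the measured squared error.

```python
import random
import statistics

def decompose(shrink, mu=2.0, sigma=1.0, n=10, n_trials=20000, seed=0):
    """Estimate error, bias^2 and variance of the predictor shrink * mean(train)."""
    rng = random.Random(seed)
    preds, sq_errors = [], []
    for _ in range(n_trials):
        train = [rng.gauss(mu, sigma) for _ in range(n)]  # fresh training set
        pred = shrink * statistics.fmean(train)
        y_new = rng.gauss(mu, sigma)  # unseen noisy target
        preds.append(pred)
        sq_errors.append((y_new - pred) ** 2)
    return {
        "error": statistics.fmean(sq_errors),
        "bias_sq": (statistics.fmean(preds) - mu) ** 2,
        "variance": statistics.pvariance(preds),
        "noise": sigma ** 2,  # known here only because we chose sigma ourselves
    }

# shrink=1.0 is the plain sample mean; shrink=0.5 trades variance for bias
results = {shrink: decompose(shrink) for shrink in (1.0, 0.5)}
for shrink, r in results.items():
    total = r["bias_sq"] + r["variance"] + r["noise"]
    print(f"shrink={shrink}: error={r['error']:.3f}  bias^2+var+noise={total:.3f}")
```

Shrinking the estimate lowers its variance but introduces bias, which is the trade-off that bagging and boosting approach from opposite sides.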
Bagging (Bootstrap Aggregating)
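Each base learner is trained on its own bootstrap sample (a random draw with replacement, typically the same size as the training set). Predictions are combined by majority vote (classification) or averaging (regression). Because the learners do not depend on one another, they can be trained in parallel. Their errors are only partially correlated, since bootstrap samples overlap, so averaging cancels much of the sample-specific fluctuation: for M learners with variance σ² and pairwise correlation ρ, the average has variance ρσ² + (1 − ρ)σ²/M. Bias stays roughly unchanged, which is why bagging suits low-bias, high-variance learners such as fully grown decision trees.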
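The variance reduction can be seen in a small simulation. This is a minimal sketch using only the standard library, not a reference implementation: the 1-nearest-neighbour base learner, the target function true_f, and every parameter value are assumptions chosen for illustration. It measures how much the prediction at one fixed point varies across freshly drawn training sets, with and without bagging.

```python
import random
import statistics

def true_f(x):
    return x * x  # hypothetical target function

def make_training_set(rng, n=30, noise=0.5):
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, true_f(x) + rng.gauss(0, noise)) for x in xs]

def predict_1nn(train, x0):
    # 1-nearest-neighbour regression: low bias, high variance
    return min(train, key=lambda p: abs(p[0] - x0))[1]

def predict_bagged(train, x0, rng, n_models=25):
    preds = []
    for _ in range(n_models):
        boot = rng.choices(train, k=len(train))  # bootstrap: draw with replacement
        preds.append(predict_1nn(boot, x0))
    return statistics.fmean(preds)  # aggregate by averaging

def compare_variance(seed=0, n_trials=200, x0=0.3):
    rng = random.Random(seed)
    single, bagged = [], []
    for _ in range(n_trials):
        train = make_training_set(rng)  # a new training set each trial
        single.append(predict_1nn(train, x0))
        bagged.append(predict_bagged(train, x0, rng))
    return statistics.pvariance(single), statistics.pvariance(bagged)

var_single, var_bagged = compare_variance()
print(f"variance of a single 1-NN prediction: {var_single:.3f}")
print(f"variance of the bagged prediction:    {var_bagged:.3f}")
```

The bagged prediction should fluctuate noticeably less from one training set to the next, while its average stays close to that of the single learner.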
Boosting
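Models are trained sequentially. Each new learner is fit to what the current ensemble still gets wrong: AdaBoost re-weights the examples it misclassified, while gradient boosting fits the residuals (more generally, the negative gradient of the loss). Adding many weak, high-bias learners such as shallow trees, each scaled by a small learning rate, gradually corrects systematic errors and so reduces bias. Examples: AdaBoost, Gradient Boosting, XGBoost.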
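The sequential residual-fitting idea can be sketched from scratch. The code below is a simplified illustration of gradient boosting with squared loss on 1-D data, not how any particular library implements it: the stump learner, the helper names, and the sine-wave data are all assumptions made for the example.

```python
import math

def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2  # candidate split between neighbouring x values
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def stump_predict(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    base = sum(ys) / len(ys)  # start from a constant prediction
    preds = [base] * len(xs)
    stumps, history = [], []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # what is still wrong
        stump = fit_stump(xs, residuals)                # weak learner on residuals
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
        history.append(mse(ys, preds))
    return base, stumps, history

def predict(base, stumps, x, learning_rate=0.1):
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)

xs = [i / 10 for i in range(-30, 31)]
ys = [math.sin(x) for x in xs]
base, stumps, history = boost(xs, ys)
print(f"training MSE after round 1:  {history[0]:.4f}")
print(f"training MSE after round 50: {history[-1]:.4f}")
```

A single stump cannot represent a sine curve (high bias), but the training error falls as rounds accumulate, which is the bias reduction described above.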
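from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

# Bagging: fully grown trees (low bias, high variance), averaged to cut variance
bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(max_depth=None),  # named base_estimator before scikit-learn 1.2
    n_estimators=100,
    bootstrap=True,  # sample training rows with replacement
    n_jobs=-1,       # the trees are independent, so fit them in parallel
)

# Boosting: shallow trees (high bias, low variance), added sequentially to cut bias
boost = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)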
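The snippet above only constructs the two ensembles. A possible way to compare them is sketched below. The synthetic dataset, its size, the reduced n_estimators, and the choice of 5-fold cross-validation are assumptions for illustration. The base tree is passed positionally so the call does not depend on whether the keyword is estimator or base_estimator in the installed scikit-learn version.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only so the example is self-contained
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

models = {
    "bagging": BaggingClassifier(
        DecisionTreeClassifier(max_depth=None),  # base learner, passed positionally
        n_estimators=50,
        bootstrap=True,
        random_state=0,
    ),
    "boosting": GradientBoostingClassifier(
        n_estimators=50, learning_rate=0.1, max_depth=3, random_state=0
    ),
}

# Mean accuracy over 5 cross-validation folds for each ensemble
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in models.items()}
for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

Which ensemble scores higher depends on the data and the hyperparameters, so the result on this toy dataset should not be read as a general ranking.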