Bagging vs boosting — how do they differ, and when does each help?

Bagging trains many independent models in parallel on bootstrap samples and averages them, which mainly reduces variance; boosting trains models sequentially so each corrects its predecessor's errors, which mainly reduces bias. Use bagging (e.g. random forests) when your base learner is high-variance and overfits; use boosting (e.g. gradient boosting) when you need to squeeze out bias and maximize accuracy, accepting more tuning and overfitting risk.

Random forest vs gradient boosting — which would you choose and why?

Random forest builds deep trees independently in parallel and averages them, making it robust, low-tuning, and resistant to overfitting; gradient boosting builds shallow trees sequentially to correct residual errors, usually achieving higher accuracy when carefully tuned. Choose random forest for a fast, stable baseline on noisy data, and gradient boosting when squeezing out maximum accuracy on tabular data is worth the tuning effort.

What is the difference between bagging and boosting, and what error component does each primarily reduce?

Bagging trains many independent models on bootstrap samples in parallel and averages their predictions, primarily reducing variance. Boosting trains models sequentially, each correcting the errors of its predecessor, primarily reducing bias.

When would you choose a random forest over gradient boosting (XGBoost/LightGBM), and vice versa?

Random forests train their trees in parallel, need little tuning, tolerate noisy features, and do not overfit more as trees are added, which makes them a strong default baseline. Gradient boosting typically achieves higher accuracy on structured/tabular data, but needs careful tuning of learning rate, tree depth, and the number of boosting rounds (often set by early stopping) to avoid overfitting.

Bagging, boosting & stacking — Machine Learning

If you’ve wondered why random forests and XGBoost dominate tabular ML, here’s the unifying answer: ensembles. A committee of models, combined well, usually beats its individual members, and most winning Kaggle solutions are ensembles of some kind. This lesson is the theory that ties the tree methods together.

Why a committee wins

The intuition is the wisdom of crowds: if you average many models that each make different, uncorrelated errors, the errors partly cancel and the consensus tends to be more accurate than a typical member. The crucial word is uncorrelated: ten copies of the same model add nothing. Diversity is the whole game. Ensembles work precisely to the degree their members are wrong in different ways.
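To make the role of correlation concrete, here is a small simulation sketch. It assumes each model's error is a unit-variance Gaussian draw, which is an illustrative simplification rather than anything taken from a real model. For n members with error variance σ² and pairwise correlation ρ, the variance of the averaged error is ρσ² + (1 − ρ)σ²/n, so the two extremes below should land near 1/10 and 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n_models, n_trials = 10, 100_000

# Case 1: ten models whose errors are independent (correlation 0).
independent = rng.normal(size=(n_trials, n_models))
var_independent = independent.mean(axis=1).var()

# Case 2: ten copies of one model, so every member makes the same error
# (correlation 1).
shared = rng.normal(size=(n_trials, 1))
identical = np.repeat(shared, n_models, axis=1)
var_identical = identical.mean(axis=1).var()

print(f"variance of averaged error, independent members: {var_independent:.3f}")
print(f"variance of averaged error, identical members:   {var_identical:.3f}")
```

Real ensemble members sit somewhere between these extremes, which is why the methods in this lesson all spend effort on making members disagree.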

Bagging — parallel, cuts variance

Bagging (bootstrap aggregating) trains many models in parallel, each on a different bootstrap sample (a random resample of the data, with replacement), then averages their predictions (or takes a majority vote for classification). Because each model sees a slightly different dataset, they make different errors, and averaging cancels much of the noise, sharply reducing variance. Each bootstrap sample draws as many rows as the original dataset, with replacement, so some rows repeat and others are left out entirely. On average, about 63% of the distinct rows appear in any one sample.

Each model in a bag trains on a different bootstrap sample — duplicates and omissions are what make the models disagree.
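A bootstrap sample is easy to draw by hand. The sketch below assumes a hypothetical dataset of 10,000 rows and only samples row indices, since that is all that is needed to see the duplicates and omissions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows = 10_000  # hypothetical dataset size, chosen for illustration

# One bootstrap sample: draw n_rows row indices with replacement.
idx = rng.integers(0, n_rows, size=n_rows)

# Fraction of distinct rows that made it into the sample at least once.
unique_fraction = np.unique(idx).size / n_rows
print(f"rows that appear at least once: {unique_fraction:.3f}")
print(f"rows left out (out-of-bag):     {1 - unique_fraction:.3f}")
```

The in-sample fraction approaches 1 − 1/e ≈ 0.632 as the dataset grows. The left-out rows are commonly called out-of-bag rows and can serve as a built-in validation set for that model.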

A random forest is exactly this: bag decision trees, and also randomize the features each split considers, which decorrelates the trees even more.
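The step from one tree to bagged trees to a random forest can be sketched with scikit-learn. The synthetic dataset and the choice of 100 estimators are assumptions made for illustration, and the exact scores will vary with the data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1200, n_features=20, n_informative=8, random_state=0)

candidates = {
    # A single fully grown tree: low bias, high variance.
    "single tree": DecisionTreeClassifier(random_state=0),
    # BaggingClassifier's default base learner is a decision tree.
    "bagged trees": BaggingClassifier(n_estimators=100, random_state=0),
    # Random forest: bagging plus a random feature subset at every split.
    "random forest": RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0),
}

scores = {name: cross_val_score(model, X, y, cv=5).mean() for name, model in candidates.items()}
for name, score in scores.items():
    print(f"{name:13} {score:.3f}")
```

On data like this, the bagged trees typically score well above the single tree, and the forest's per-split feature sampling often adds a little more by decorrelating the trees.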

Boosting — sequential, cuts bias

Boosting flips the idea: train models one after another, each new one focused on the examples the ensemble got wrong so far. Instead of averaging independent models, it builds an additive sequence that keeps correcting its own mistakes — which reduces bias and produces the extremely accurate models you saw in XGBoost.
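The additive, self-correcting loop can be written out in a few lines. This is a minimal sketch of gradient boosting for regression with squared error, where "what the ensemble got wrong" is simply the residual. It is not how XGBoost is implemented internally, and the toy sine data, learning rate, and tree depth are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=400)

learning_rate, n_rounds = 0.1, 100
prediction = np.full_like(y, y.mean())  # round 0: a constant model
mse_start = np.mean((y - prediction) ** 2)
trees = []

for _ in range(n_rounds):
    residual = y - prediction  # what the ensemble still gets wrong
    tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, residual)
    prediction += learning_rate * tree.predict(X)  # small corrective step
    trees.append(tree)

mse_end = np.mean((y - prediction) ** 2)
print(f"training MSE: {mse_start:.3f} -> {mse_end:.3f}")
```

Each shallow tree is a weak, high-bias learner on its own, and the sum of many small corrections is what drives the bias down. Training error keeps falling with more rounds, which is why boosted models need a held-out set or early stopping to decide when to stop.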

Three ways to combine models: bag in parallel, boost in sequence, or stack with a meta-learner.

Stacking & voting — blend different families

The third family combines different model types. Voting just averages (or majority-votes) their predictions. Stacking goes further: it trains a small meta-model on the base models’ predictions, learning how to weight them. Because a tree, a linear model, and a k-NN make very different errors, blending them often beats any one of them, which is why multi-level stacking is a staple of top Kaggle solutions.

from sklearn.ensemble import VotingClassifier, HistGradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1200, n_features=20, n_informative=8, random_state=0)

models = {
    "logistic": make_pipeline(StandardScaler(), LogisticRegression(max_iter=500)),
    "forest":   RandomForestClassifier(n_estimators=200, random_state=0),
    "boosting": HistGradientBoostingClassifier(random_state=0),
}
for name, m in models.items():
    print(f"{name:9} {cross_val_score(m, X, y, cv=5).mean():.3f}")

# A soft-voting ensemble averages the predicted class probabilities of the three diverse models:
ens = VotingClassifier(list(models.items()), voting="soft")
print(f"{'ENSEMBLE':9} {cross_val_score(ens, X, y, cv=5).mean():.3f}  <- often matches or beats the best single model")

In one breath

A committee of models beats any single one — if their errors are uncorrelated; diversity is the whole game (ten copies of one model add nothing).
Bagging trains models in parallel on bootstrap samples and averages them → cuts variance (a random forest = bagged trees + feature randomization).
Boosting trains models sequentially, each fixing the last’s mistakes → cuts bias (XGBoost and friends).
Stacking / voting blend different families (tree + linear + k-NN): voting averages, stacking trains a meta-model to learn the weights.
The ladder: one XGBoost is a great baseline; a small voting/stacking ensemble of diverse models wins the last percent — blend across families, seeds, and feature subsets.

Quick check


Q1. What is the key requirement for an ensemble to outperform its members?

Q2. What's the difference between bagging and boosting?

Q3. What does stacking add over simple voting?

That completes the supervised core. Next, evaluation done rigorously — feature selection and model selection with nested CV.
