Machine Learning | Medium | Asked at: Google, Meta, Amazon, Microsoft, Spotify

When should you use grid search vs random search vs Bayesian optimisation for hyperparameter tuning?

For: Data Scientist, ML Engineer, AI / LLM Engineer

The short answer

Grid search exhaustively tries every combination in a predefined grid, which is only practical for a small number of hyperparameters (roughly 2–3 at most). Random search samples combinations at random from ranges or distributions you specify and usually finds good values with less compute, especially when only a few hyperparameters actually matter. Bayesian optimisation fits a surrogate model of the objective and uses it to choose each next trial, which gives the best sample efficiency when evaluations are expensive.

How to think about it

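The right choice depends on how many hyperparameters matter, how cheap each evaluation is, and whether the binding constraint is total compute or wall-clock time: grid and random search trials are independent and parallelise trivially, whereas Bayesian optimisation is largely sequential because each proposal depends on earlier results.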

Grid search

Tests every combination on a discrete grid. With 5 hyperparameters each having 5 values, that is 5^5 = 3,125 combinations, and with 5-fold CV each combination costs 5 model fits (15,625 fits in total). This combinatorial explosion makes grid search impractical beyond 2–3 hyperparameters, and much of the budget is spent varying hyperparameters that have little effect on the score.
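The arithmetic above can be checked with a short sketch. The hyperparameter names and values below are hypothetical placeholders, and the fold count of 5 is an assumption; only the 5 x 5 x 5 x 5 x 5 shape comes from the text.

```python
import itertools
import math

# Hypothetical grid: 5 hyperparameters with 5 candidate values each.
grid = {
    "learning_rate": [0.001, 0.003, 0.01, 0.03, 0.1],
    "max_depth": [2, 4, 6, 8, 10],
    "subsample": [0.5, 0.6, 0.7, 0.8, 0.9],
    "min_child_weight": [1, 2, 4, 8, 16],
    "reg_lambda": [0.0, 0.1, 1.0, 10.0, 100.0],
}
cv_folds = 5  # assumption: 5-fold cross-validation

# Number of grid points is the product of the per-hyperparameter counts.
n_combinations = math.prod(len(values) for values in grid.values())

# Every grid point is fitted once per CV fold.
n_fits = n_combinations * cv_folds

# Enumerating the grid lazily confirms the count without storing it.
n_enumerated = sum(1 for _ in itertools.product(*grid.values()))

print(n_combinations, n_fits, n_enumerated)
```

Adding a sixth hyperparameter with 5 values would multiply both counts by 5 again, which is why the grid size, not the model, quickly becomes the bottleneck.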

Random search

Samples hyperparameter combinations at random from specified ranges or distributions (uniform, log-uniform, and so on). A key insight from Bergstra & Bengio (2012): when only a few hyperparameters matter, say 2 of 20, a grid spends most of its budget varying the irrelevant ones and tries only a handful of distinct values for the important ones, whereas every random trial draws a fresh value for each important hyperparameter. In practice, random search often matches or beats grid search with far fewer evaluations.
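A simplified illustration of that argument, reduced to one important and one irrelevant hyperparameter with a budget of 25 trials (both numbers are assumptions chosen for illustration): the sketch counts how many distinct values of the important hyperparameter each strategy actually tries.

```python
import itertools
import random

budget = 25  # total number of trials available to each strategy

# Grid search: a 5 x 5 grid over (important, irrelevant), both scaled to [0, 1].
axis = [0.0, 0.25, 0.5, 0.75, 1.0]
grid_trials = list(itertools.product(axis, axis))

# Random search: draw both hyperparameters independently for every trial.
rng = random.Random(0)
random_trials = [(rng.random(), rng.random()) for _ in range(budget)]

# Count distinct values tried for the hyperparameter that actually matters.
grid_distinct = len({important for important, _ in grid_trials})
random_distinct = len({important for important, _ in random_trials})

print(f"grid:   {len(grid_trials)} trials, {grid_distinct} distinct important values")
print(f"random: {len(random_trials)} trials, {random_distinct} distinct important values")
```

With the same budget, the grid probes the important axis at only 5 points, while random search probes it at a different point on every trial.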

Bayesian optimisation

Maintains a surrogate model (typically a Gaussian Process or a Tree-structured Parzen Estimator) of the mapping from hyperparameters to CV score. At each step an acquisition function (e.g., expected improvement) selects the next point by trading off exploration (regions where the surrogate is uncertain) against exploitation (regions it predicts to score well). This makes it the most sample-efficient of the three methods, which is critical when each evaluation takes hours (deep learning, large ensembles).
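The loop described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not how any particular library implements it: the objective is a toy stand-in for a CV score over log10(C), the Gaussian Process uses a fixed RBF kernel with length scale 1 and no kernel tuning, and candidates come from a fixed 1-D grid.

```python
import math
import numpy as np

def objective(log_c):
    """Toy stand-in for a CV score (assumption); its peak is at log_c = 1."""
    return -(log_c - 1.0) ** 2

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two 1-D arrays."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, jitter=1e-6):
    """Posterior mean and std of a zero-mean, unit-variance GP at x_new."""
    k_oo = rbf(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    k_no = rbf(x_new, x_obs)
    mean = k_no @ np.linalg.solve(k_oo, y_obs)
    var = 1.0 - np.sum(k_no * np.linalg.solve(k_oo, k_no.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed score (maximisation)."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

candidates = np.linspace(-3.0, 3.0, 121)  # search space for log10(C)
x_obs = np.array([-3.0, 0.0, 3.0])        # initial design
y_obs = objective(x_obs)

for _ in range(7):
    # Standardise scores so they match the GP's zero-mean, unit-variance prior.
    y_scaled = (y_obs - y_obs.mean()) / y_obs.std()
    mean, std = gp_posterior(x_obs, y_scaled, candidates)
    ei = expected_improvement(mean, std, y_scaled.max())
    x_next = candidates[np.argmax(ei)]     # most promising candidate
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))

best_log_c = x_obs[np.argmax(y_obs)]
print(f"evaluations: {len(x_obs)}, best log10(C): {best_log_c:.2f}")
```

Each iteration spends one real evaluation where the surrogate expects the largest gain, which is what makes the approach attractive when a single evaluation is costly. Production libraries add kernel hyperparameter fitting, multi-dimensional and categorical spaces, and better acquisition optimisers.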

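# Example estimator: an RBF-kernel SVM (any scikit-learn estimator with matching parameters works)
from sklearn.svm import SVC
estimator = SVC()

# Grid search (small grids only): 4 x 2 = 8 combinations, 40 fits with cv=5
from sklearn.model_selection import GridSearchCV
gs = GridSearchCV(estimator, {"C": [0.01, 0.1, 1, 10], "gamma": ["scale", "auto"]}, cv=5)

# Random search (budget-constrained): n_iter fixes the number of sampled configurations
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import loguniform
rs = RandomizedSearchCV(estimator, {"C": loguniform(1e-3, 1e3)}, n_iter=60, cv=5, random_state=42)

# Bayesian optimisation (expensive evaluations); requires the scikit-optimize package
from skopt import BayesSearchCV
from skopt.space import Real
bs = BayesSearchCV(estimator, {"C": Real(1e-3, 1e3, prior="log-uniform")}, n_iter=40, cv=5, random_state=42)

# Each search object is then run with .fit(X, y) on the training data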
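
To show one of these search objects end to end, the following sketch fits a random search on synthetic data. The dataset (make_classification), the SVC estimator, and the reduced n_iter and cv values are assumptions chosen so that it runs quickly; they are not recommendations.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Synthetic binary classification data (assumption: stands in for a real training set).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

search = RandomizedSearchCV(
    SVC(),
    {"C": loguniform(1e-3, 1e3)},  # sample C on a log scale
    n_iter=10,                     # number of sampled configurations
    cv=3,                          # 3-fold cross-validation per configuration
    random_state=42,
)
search.fit(X, y)

best_c = search.best_params_["C"]
best_score = search.best_score_    # mean CV accuracy of the best configuration
print(f"best C = {best_c:.4g}, mean CV accuracy = {best_score:.3f}")
```

GridSearchCV and BayesSearchCV expose the same fit, best_params_, and best_score_ interface, so the search strategy can be swapped without changing the surrounding code.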

When to use each

Method	Best when
Grid search	≤ 2 hyperparameters, small discrete ranges
Random search	3+ hyperparameters, compute budget matters
Bayesian optimisation	Each evaluation is expensive (minutes to hours per model fit)
