datarekha
Machine Learning | Medium | Asked at Netflix, Airbnb, Two Sigma

What problem does ElasticNet solve that neither Lasso nor Ridge can handle alone?

The short answer

When predictors are highly correlated, Lasso tends to pick one of them more or less arbitrarily and discard the others, so the selected feature set is unstable. Ridge retains all correlated features but cannot set any coefficient exactly to zero. ElasticNet combines both penalties to get solutions that are sparse and stable: the L2 term makes correlated features share weight, so a group is kept or dropped together, while the L1 term still zeros out features that do not help.

How to think about it

The Lasso failure mode under collinearity:

If features x₁ and x₂ are nearly identical, the Lasso objective has many (near-)equivalent minimizers: any split of the combined weight between x₁ and x₂ gives almost the same fit and the same L1 penalty. In practice, Lasso selects one arbitrarily (often whichever is updated first in coordinate descent) and discards the other. Fit the model on a slightly different sample and you may get a different set of selected features, so the selection is unstable.
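The tie between solutions can be checked numerically. The sketch below is a minimal illustration, assuming synthetic data with two exactly duplicated columns and an arbitrary penalty strength `lam`; the helper `lasso_objective` is a hypothetical name, not a library function.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
X = np.column_stack([x, x])                    # x1 and x2 are exact copies
y = 3.0 * x + rng.normal(scale=0.1, size=n)    # synthetic target (assumption)

def lasso_objective(beta, lam=1.0):
    """Squared-error loss plus an L1 penalty of strength lam."""
    residual = y - X @ beta
    return residual @ residual + lam * np.abs(beta).sum()

# Three ways to split the same total weight of 3.0 between the two copies.
splits = [np.array([3.0, 0.0]), np.array([1.5, 1.5]), np.array([0.0, 3.0])]
objectives = [lasso_objective(b) for b in splits]
for b, obj in zip(splits, objectives):
    print(b, obj)
```

All three splits give the same objective value up to rounding, so nothing in the Lasso objective itself prefers one over another; the optimizer's update order decides.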

ElasticNet loss:

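L(β) = ||y - Xβ||₂² + λ[α||β||₁ + (1-α)||β||₂²]

Here λ ≥ 0 is the overall penalty strength and α ∈ [0,1] blends the two penalties. Watch the naming clash with sklearn: α here is sklearn's l1_ratio, and λ here is sklearn's alpha. sklearn also scales the terms slightly differently, minimizing 1/(2n)·||y - Xβ||₂² + λα||β||₁ + 0.5·λ(1-α)||β||₂², so numeric values of λ do not transfer directly between the two forms.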

  • α = 1 → pure Lasso
  • α = 0 → pure Ridge
  • α ∈ (0,1) → ElasticNet
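The three cases above can be verified with a few lines of arithmetic. This is a minimal sketch of the penalty term only, in the notation of the loss above (λ as `lam`, α as `alpha`); `elastic_net_penalty` is a hypothetical helper and the coefficient vector is made up for illustration.

```python
import numpy as np

def elastic_net_penalty(beta, lam, alpha):
    """lam * [alpha * ||beta||_1 + (1 - alpha) * ||beta||_2^2]."""
    beta = np.asarray(beta, dtype=float)
    l1 = np.abs(beta).sum()
    l2_squared = (beta ** 2).sum()
    return lam * (alpha * l1 + (1.0 - alpha) * l2_squared)

beta = [2.0, -1.0, 0.0]   # illustrative coefficients, not from a fitted model

lasso_pen = elastic_net_penalty(beta, lam=1.0, alpha=1.0)   # 3.0 = |2| + |-1| + |0|
ridge_pen = elastic_net_penalty(beta, lam=1.0, alpha=0.0)   # 5.0 = 4 + 1 + 0
mixed_pen = elastic_net_penalty(beta, lam=1.0, alpha=0.5)   # 4.0 = 0.5*3 + 0.5*5
print(lasso_pen, ridge_pen, mixed_pen)
```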

The L2 component couples correlated features, encouraging their coefficients to be close in magnitude (the “grouping effect”). The L1 component still drives some toward zero.
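A short argument shows where the grouping effect comes from. It is worked out only for the special case of two identical columns, x₁ = x₂, which is an assumption made here to keep the algebra exact. Take any coefficients with β₁ ≠ β₂ and replace both by their average:

```latex
\text{Let } x_1 = x_2,\quad \beta_1 \neq \beta_2,\quad \bar\beta = \tfrac{1}{2}(\beta_1 + \beta_2).

\begin{aligned}
\text{fit:}\quad & x_1\bar\beta + x_2\bar\beta = x_1\beta_1 + x_2\beta_2 \\
\text{L1:}\quad  & 2\,|\bar\beta| = |\beta_1 + \beta_2| \le |\beta_1| + |\beta_2| \\
\text{L2:}\quad  & 2\,\bar\beta^{2} = \beta_1^{2} + \beta_2^{2} - \tfrac{1}{2}(\beta_1 - \beta_2)^{2} < \beta_1^{2} + \beta_2^{2}
\end{aligned}
```

Averaging leaves the fit unchanged, never increases the L1 penalty, and strictly decreases the L2 penalty. So whenever λ > 0 and α < 1, any solution with unequal coefficients can be improved, and the minimizer must have β₁ = β₂. With α = 1 only the weak L1 inequality remains, so unequal splits can tie, which is exactly the Lasso non-uniqueness.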

Practical selection of hyperparameters:

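from sklearn.linear_model import ElasticNetCV

# Cross-validated grid search over alpha (overall penalty strength)
# and l1_ratio (L1/L2 mix) simultaneously.
# Assumes X_train and y_train already exist and that features are
# standardized, since the penalty is sensitive to feature scale.
enet = ElasticNetCV(
    l1_ratio=[0.1, 0.5, 0.7, 0.9, 0.95, 1.0],  # 1.0 is pure Lasso
    alphas=[0.001, 0.01, 0.1, 1.0],
    cv=5,
)
enet.fit(X_train, y_train)
print(f"Best l1_ratio: {enet.l1_ratio_}, alpha: {enet.alpha_}")
print(f"Nonzero features: {(enet.coef_ != 0).sum()}")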

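Grouping effect, intuition: if two features are identical (perfectly positively correlated and on the same scale), ElasticNet with α < 1 assigns them equal coefficients, either both nonzero or both zero. Lasso has no unique solution in this case; coordinate descent typically puts the entire combined weight on one feature, roughly double what ElasticNet gives each, and zeros out the other.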
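The grouping claim above can be reproduced with a small from-scratch coordinate descent. This is a sketch under stated assumptions, not sklearn's implementation: it minimizes the loss in this article's form, uses synthetic duplicated features, and the names `soft_threshold` and `coordinate_descent` and the value `lam=50.0` are chosen here for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t, clipping at zero."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def coordinate_descent(X, y, lam, alpha, n_sweeps=500):
    """Minimize ||y - X b||^2 + lam * (alpha * ||b||_1 + (1 - alpha) * ||b||_2^2)."""
    n_features = X.shape[1]
    beta = np.zeros(n_features)          # start from all-zero coefficients
    col_sq = (X ** 2).sum(axis=0)        # x_j . x_j for each column
    for _ in range(n_sweeps):
        for j in range(n_features):
            # Residual with feature j's current contribution removed.
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j
            beta[j] = soft_threshold(rho, lam * alpha / 2.0) / (
                col_sq[j] + lam * (1.0 - alpha)
            )
    return beta

rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, x])                      # two identical features
y = 3.0 * x + rng.normal(scale=0.1, size=200)    # synthetic target (assumption)

enet_beta = coordinate_descent(X, y, lam=50.0, alpha=0.5)
lasso_beta = coordinate_descent(X, y, lam=50.0, alpha=1.0)
print("ElasticNet:", enet_beta)   # two equal coefficients
print("Lasso:     ", lasso_beta)  # weight on the first feature, second ~0
```

With α = 0.5 the two coefficients converge to the same value. With α = 1 the first feature, updated first, absorbs the whole weight and the second sits exactly on the soft-threshold boundary, so it stays at zero up to rounding. Reversing the column order would hand the weight to the other copy.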
