What are overfitting and underfitting, and how do you fix each?
Overfitting occurs when a model memorizes training noise and fails to generalize; underfitting occurs when the model is too simple to capture the true signal. Fixes differ: overfitting requires regularization, more data, or reduced complexity; underfitting requires a more expressive model or better features.
How to think about it
Underfitting: training error is high, and validation error is similarly high, because the model lacks the capacity to represent the target function. A linear model fit to sinusoidal data is the canonical example.
Overfitting: training error is very low but validation/test error is high. The model has captured noise specific to the training set rather than the underlying distribution.
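As a rough illustration of both failure modes, the sketch below fits polynomials of increasing degree to noisy sinusoidal data and reports training and validation error. The target function, noise level, sample sizes, and the degrees tried are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable


def make_data(n, noise=0.2):
    """Sample n noisy points from y = sin(x) on [0, 2*pi] (assumed toy target)."""
    x = np.linspace(0.0, 2.0 * np.pi, n)
    y = np.sin(x) + rng.normal(0.0, noise, size=n)
    return x, y


def mse(model, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((model(x) - y) ** 2))


x_train, y_train = make_data(30)
x_val, y_val = make_data(200)

train_mse, val_mse = {}, {}
for degree in (1, 3, 12):
    # Chebyshev.fit solves a least-squares problem in a well-conditioned basis.
    model = np.polynomial.Chebyshev.fit(x_train, y_train, degree)
    train_mse[degree] = mse(model, x_train, y_train)
    val_mse[degree] = mse(model, x_val, y_val)
    print(f"degree={degree:2d}  train={train_mse[degree]:.3f}  val={val_mse[degree]:.3f}")
```

Degree 1 should show high error on both sets, which is the underfitting pattern. Training error can only fall as the degree rises, so a training error well below validation error at degree 12 would be the overfitting signature; how pronounced it is depends on the noise and the sample size.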
Comparing training and validation error, both their levels and the gap between them, is the primary diagnostic:
- High train error + high val error → underfit
- Low train error + high val error → overfit
- Low train error + low val error → good generalization
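The three cases above can be written as a small decision rule. This is only a sketch: the function name `diagnose` and the thresholds `high` and `gap` are hypothetical, and usable cutoffs depend on the loss scale and the irreducible noise of the task.

```python
def diagnose(train_err, val_err, high=0.2, gap=0.1):
    """Classify a fit from its training and validation error.

    `high` and `gap` are hypothetical thresholds chosen for illustration.
    """
    if train_err >= high:
        return "underfit"        # the model cannot even fit the training set
    if val_err - train_err >= gap:
        return "overfit"         # fits the training data but does not generalize
    return "good generalization"


print(diagnose(0.35, 0.37))  # high train error, high val error
print(diagnose(0.02, 0.30))  # low train error, high val error
print(diagnose(0.03, 0.05))  # low train error, low val error
```

Checking training error first mirrors the order of the list: a model that cannot fit its own training data is underfitting regardless of the gap.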
Fixes for overfitting:
- Regularization: L1 (Lasso), L2 (Ridge), dropout in neural nets
- Early stopping (monitor val loss, stop once it plateaus or starts rising)
- Reduce model complexity (fewer layers, lower polynomial degree)
- Get more training data or apply data augmentation
- Ensemble methods that average high-variance models (e.g., bagging)
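To make the regularization item concrete, here is a minimal sketch of L2 (ridge) regression in closed form on polynomial features. The toy data, the polynomial degree, and the penalty values are assumptions, and for simplicity the intercept is penalized along with the other weights, which practical implementations usually avoid.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed): 20 noisy samples of a sinusoid, expanded to
# degree-9 polynomial features so the unregularized fit is flexible.
x = np.linspace(-1.0, 1.0, 20)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, size=x.size)
X = np.vander(x, N=10, increasing=True)  # columns: 1, x, x^2, ..., x^9


def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


norms = []
for lam in (1e-3, 1e-1, 10.0):
    w = ridge_fit(X, y, lam)
    norms.append(float(np.linalg.norm(w)))
    print(f"lambda={lam:g}  ||w||={norms[-1]:.3f}")
```

A larger penalty shrinks the weight vector, which limits how sharply the fitted curve can bend to chase individual noisy points. In practice the penalty strength would be chosen by validation error rather than fixed in advance.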
Fixes for underfitting:
- Increase model capacity (deeper network, higher-degree polynomial)
- Add informative features / feature engineering
- Reduce regularization strength
- Train longer, or lower the learning rate if training loss is unstable or has stalled
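The feature-engineering item can be illustrated with the sinusoidal example: a linear model on the raw input underfits, while the same linear model with one extra periodic feature fits well. The data and the choice of `sin(x)` as the added feature are assumptions made for this sketch; in a real problem the informative feature is rarely known this precisely.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed): a sinusoid with mild noise.
x = np.linspace(0.0, 2.0 * np.pi, 50)
y = np.sin(x) + rng.normal(0.0, 0.1, size=x.size)


def fit_and_score(features, y):
    """Ordinary least squares on the given feature matrix; returns training MSE."""
    w, *_ = np.linalg.lstsq(features, y, rcond=None)
    return float(np.mean((features @ w - y) ** 2))


ones = np.ones_like(x)
linear_features = np.column_stack([ones, x])             # intercept + x
richer_features = np.column_stack([ones, x, np.sin(x)])  # adds a periodic feature

mse_linear = fit_and_score(linear_features, y)
mse_richer = fit_and_score(richer_features, y)
print(f"linear: {mse_linear:.3f}   with sin(x) feature: {mse_richer:.3f}")
```

The model class stays linear in its weights; only the inputs change. Lower training error alone does not prove better generalization, so the richer model should still be checked against validation data.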