What is the difference between parametric and non-parametric models?
Parametric models have a fixed number of parameters determined before training regardless of dataset size; non-parametric models let the number of effective parameters grow with data. Parametric models are faster and more data-efficient under correct assumptions; non-parametric models are more flexible but require larger datasets and more memory.
How to think about it
The distinction is whether the model’s complexity is fixed before training or grows with data.
Parametric models: the model is fully described by a fixed-size parameter vector θ. Once trained, the training data can be discarded; predictions depend only on θ. Training typically means minimizing a loss over θ, either in closed form or with gradient-based optimization.
Examples:
- Linear / logistic regression: d + 1 parameters for d features.
- Neural networks: weights fixed by architecture; data can be discarded after training.
- Naive Bayes: fixed set of class-conditional statistics.
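As a rough illustration of the fixed-size idea, the sketch below fits a one-feature linear regression in closed form on a small and a large dataset. The helper names `fit_line` and `predict` and the toy data drawn from y = 2x + 1 are assumptions made for this example, not part of any library.

```python
# Minimal sketch: simple linear regression y ≈ w*x + b fit in closed form.
# fit_line and predict are illustrative names, not library functions.

def fit_line(xs, ys):
    """Return theta = (w, b) from ordinary least squares on 1-D inputs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x
    return (w, b)

def predict(theta, x):
    """Prediction depends only on theta, not on the training data."""
    w, b = theta
    return w * x + b

# Two training sets of very different size, both drawn from y = 2x + 1.
small_xs = [0.0, 1.0, 2.0]
large_xs = [float(i) for i in range(1000)]
theta_small = fit_line(small_xs, [2 * x + 1 for x in small_xs])
theta_large = fit_line(large_xs, [2 * x + 1 for x in large_xs])

# d = 1 feature, so both models have d + 1 = 2 parameters.
print(len(theta_small), len(theta_large))

# The training data can now be discarded; theta alone is enough to predict.
del small_xs, large_xs
print(predict(theta_large, 10.0))
```

Both fits return a two-element θ, whether they saw 3 points or 1000, and prediction still works after the training lists are deleted.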
Non-parametric models: the number of parameters is not fixed in advance; it typically grows with n (the number of training examples). The name does not mean the model has no parameters.
Examples:
- k-Nearest Neighbors: the entire training set is the model; memory = O(n).
- Kernel SVM: the number of support vectors grows with the size and complexity of the data.
- Gaussian processes: covariance matrix is n × n.
- Decision trees with no depth limit: number of leaf nodes grows with data.
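To make the "training set is the model" point concrete, here is a minimal sketch of a 1-D k-NN classifier in plain Python. The `TinyKNN` class and the `make_data` toy generator are hypothetical names invented for this illustration; a real project would more likely use a library implementation.

```python
# Minimal sketch of 1-D k-NN classification; all names are illustrative only.
from collections import Counter

class TinyKNN:
    def __init__(self, k=3):
        self.k = k
        self.points = []  # the stored training set *is* the model

    def fit(self, xs, labels):
        # "Training" just memorizes the data, so memory is O(n).
        self.points = list(zip(xs, labels))
        return self

    def predict(self, x):
        # Every prediction looks at all n stored points
        # (sorting makes this O(n log n) in this simple version).
        nearest = sorted(self.points, key=lambda p: abs(p[0] - x))[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]

def make_data(n):
    # Hypothetical toy data: label is 1 when x >= 0.5, else 0.
    xs = [i / n for i in range(n)]
    return xs, [int(x >= 0.5) for x in xs]

small = TinyKNN().fit(*make_data(10))
large = TinyKNN().fit(*make_data(1000))

# The size of what the model stores tracks n.
print(len(small.points), len(large.points))
print(large.predict(0.9), large.predict(0.1))
```

Unlike a parametric fit, nothing here can be thrown away after `fit`: deleting `points` deletes the model.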
| Property | Parametric | Non-parametric |
|---|---|---|
| Complexity | Fixed | Grows with n |
| Memory after training | O(params) | O(n) |
| Assumptions | Functional form assumed | Minimal |
| Small data | Often better | Risk of fitting noise |
| Large data | May underfit | Can model complex patterns |
| Inference speed | Fast (matrix multiply) | Can be slow (O(n) lookup for k-NN) |
Semi-parametric models blend both: a Cox proportional hazards model has a parametric hazard ratio but a non-parametric baseline hazard.
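For reference, the Cox model's hazard is commonly written as below. The notation is the usual textbook convention rather than something taken from the text above: h_0(t) is the baseline hazard, x the covariate vector, and β the coefficient vector.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^\top x\right)
```

Here β has a fixed length equal to the number of covariates (the parametric part), while h_0(t) is left unspecified and estimated from the data (the non-parametric part).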
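```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Synthetic placeholder data (an assumption; substitute your own X_train, y_train)
X_train, y_train = make_classification(n_samples=200, n_features=4, random_state=0)

# Parametric: logistic regression stores only weights
lr = LogisticRegression().fit(X_train, y_train)
print(lr.coef_.shape)  # (1, n_features): fixed size, independent of n_train

# Non-parametric: k-NN stores all training points
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print(knn._fit_X.shape)  # (n_train, n_features): grows with data (_fit_X is a private attribute)
```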