Machine Learning | Medium | Asked at Google, Amazon, Apple

What is the difference between parametric and non-parametric models?

For: Data Scientist, ML Engineer, AI / LLM Engineer

The short answer

Parametric models have a fixed number of parameters determined before training regardless of dataset size; non-parametric models let the number of effective parameters grow with data. Parametric models are faster and more data-efficient under correct assumptions; non-parametric models are more flexible but require larger datasets and more memory.

How to think about it

The distinction is whether the model’s complexity is fixed before training or grows with data.

Parametric models: the model is fully described by a fixed-size parameter vector θ. Once trained, the data can be discarded; predictions depend only on θ. Training typically means optimizing an objective over θ, either in closed form (as in ordinary least squares) or with gradient-based methods.

Examples:

Linear / logistic regression: d+1 parameters for d features.
Neural networks: the number of weights is fixed by the architecture, not by the dataset size; data can be discarded after training.
Naive Bayes: fixed set of class-conditional statistics.
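
To make the "fixed-size θ" idea concrete, here is a minimal pure-Python sketch of one-feature least-squares regression. The names fit_line and predict_line are hypothetical helpers written for this illustration, not part of any library. With d = 1 feature the model has d + 1 = 2 parameters, whether it is fit on 3 points or 1000.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = w * x + b for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return (w, b)  # theta: always 2 numbers, regardless of n


def predict_line(theta, x):
    """Prediction uses only theta; no training data is consulted."""
    w, b = theta
    return w * x + b


# Two training sets of very different sizes, both drawn from y = 2x + 1
small_x = [0.0, 1.0, 2.0]
small_y = [1.0, 3.0, 5.0]
large_x = [float(i) for i in range(1000)]
large_y = [2.0 * x + 1.0 for x in large_x]

theta_small = fit_line(small_x, small_y)
theta_large = fit_line(large_x, large_y)

# The data can now be discarded; the fitted model is just theta.
del small_x, small_y, large_x, large_y

print(len(theta_small), len(theta_large))  # same parameter count for both fits
print(predict_line(theta_large, 10.0))
```

The design point is that predict_line never sees the training data, which is what lets a parametric model throw it away after fitting.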

Non-parametric models: the number of parameters is not fixed in advance; the effective model size typically grows with n (the number of training examples). The name does not mean the model has no parameters.

Examples:

k-Nearest Neighbors: the entire training set is the model; memory = O(n).
Kernel SVM: the number of support vectors, which must be kept for prediction, tends to grow with the size and complexity of the data.
Gaussian processes: covariance matrix is n x n.
Decision trees with no depth limit: number of leaf nodes grows with data.
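
The k-NN case can be sketched in a few lines of pure Python to show why "the training set is the model". TinyKNN is a hypothetical class written for this illustration (it is not scikit-learn's implementation): fitting only stores the data, and every prediction scans all n stored points.

```python
import math
from collections import Counter


class TinyKNN:
    """Hypothetical minimal k-NN classifier; 'fitting' just stores the data."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, points, labels):
        # No parameters are estimated: the model is the training set itself.
        self.points = list(points)
        self.labels = list(labels)
        return self

    def predict(self, query):
        # Distance from the query to every stored point: O(n) work per query.
        dists = sorted(
            (math.dist(p, query), label)
            for p, label in zip(self.points, self.labels)
        )
        votes = Counter(label for _, label in dists[: self.k])
        return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels = ["a", "a", "a", "b", "b", "b"]
model = TinyKNN(k=3).fit(points, labels)

print(len(model.points))          # stored points == number of training examples
print(model.predict((0.3, 0.3)))  # query near the first cluster
print(model.predict((4.5, 4.8)))  # query near the second cluster
```

Adding more training examples makes model.points longer and each predict call slower, which is the O(n) memory and lookup cost noted above.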

Property	Parametric	Non-parametric
Complexity	Fixed	Grows with n
Memory after training	O(params)	O(n)
Assumptions	Functional form assumed	Minimal
Small data	Often better (if assumptions hold)	Risk of overfitting to noise
Large data	May underfit	Can model complex patterns
Inference speed	Fast (matrix multiply)	Can be slow (O(n) lookup for k-NN)

Semi-parametric models blend both: a Cox proportional hazards model has a parametric hazard ratio but a non-parametric baseline hazard.

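from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary data stands in for X_train, y_train (an assumption for this demo)
X_train, y_train = make_classification(n_samples=500, n_features=10, random_state=0)

# Parametric: logistic regression stores only weights
lr = LogisticRegression().fit(X_train, y_train)
print(lr.coef_.shape)  # (1, n_features) for binary labels: fixed size

# Non-parametric: k-NN stores all training points
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
# _fit_X is a private attribute and may change between scikit-learn versions
print(knn._fit_X.shape)  # (n_train, n_features): grows with data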
