datarekha
Machine Learning · Medium · Asked at Google, Amazon, Apple

What is the difference between parametric and non-parametric models?

The short answer

Parametric models have a fixed number of parameters determined before training regardless of dataset size; non-parametric models let the number of effective parameters grow with data. Parametric models are faster and more data-efficient under correct assumptions; non-parametric models are more flexible but require larger datasets and more memory.

How to think about it

The distinction is whether the model’s complexity is fixed before training or grows with data.

Parametric models: the model is fully described by a fixed-size parameter vector θ. Once trained, the data can be discarded; predictions depend only on θ. Training means estimating θ, either in closed form (as in ordinary least squares) or by gradient-based optimization of a loss.

Examples:

  • Linear / logistic regression: d+1 parameters for d features.
  • Neural networks: the number of weights is fixed by the architecture, not by the dataset; data can be discarded after training.
  • Naive Bayes: fixed set of class-conditional statistics.
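
To make the "data can be discarded" point concrete, here is a minimal sketch using ordinary least squares in NumPy. The synthetic data, the seed, and the helper names fit_linear and predict_linear are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def fit_linear(X, y):
    """Return theta (d + 1 values, intercept first) via least squares."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    theta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return theta

def predict_linear(theta, X):
    """Predict from theta alone; no training data is needed."""
    return theta[0] + X @ theta[1:]

rng = np.random.default_rng(0)
true_theta = np.array([1.0, 2.0, -3.0, 0.5])  # made-up intercept + 3 slopes

thetas = {}
for n in (50, 5_000):
    X = rng.normal(size=(n, 3))
    y = true_theta[0] + X @ true_theta[1:] + rng.normal(scale=0.1, size=n)
    thetas[n] = fit_linear(X, y)
    del X, y  # the training data is gone; theta is the whole model

# Both fits have d + 1 = 4 parameters, whatever n was.
print({n: t.shape for n, t in thetas.items()})

# Prediction on new points uses only theta.
X_new = rng.normal(size=(2, 3))
print(predict_linear(thetas[5_000], X_new))
```

More data improves the estimate of θ, but it never changes how many numbers the model stores.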

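Non-parametric models: the number of parameters is not fixed in advance; it typically scales with n (the number of training examples). "Non-parametric" does not mean "no parameters": the model usually has to keep the training data, or a structure that grows with it, in order to predict.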

Examples:

  • k-Nearest Neighbors: the entire training set is the model; memory = O(n).
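  • Kernel SVM: the number of support vectors that must be stored grows with the data and with the complexity of the decision boundary.
  • Gaussian processes: the covariance matrix is n × n, and predictions are conditioned on all n training points.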
  • Decision trees with no depth limit: number of leaf nodes grows with data.
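
The last bullet can be checked directly. The following is a rough sketch, assuming scikit-learn is available and using a synthetic noisy dataset (the sizes, the noise level flip_y=0.1, and the seed are arbitrary choices for illustration). It fits an unrestricted tree at three dataset sizes and records the leaf count.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

leaf_counts = {}
for n in (200, 2_000, 20_000):
    # flip_y adds label noise, so an unrestricted tree keeps splitting to fit it
    X, y = make_classification(
        n_samples=n, n_features=10, flip_y=0.1, random_state=0
    )
    tree = DecisionTreeClassifier(max_depth=None, random_state=0).fit(X, y)
    leaf_counts[n] = tree.get_n_leaves()

# Leaf count per dataset size; each leaf is effectively one more parameter.
print(leaf_counts)
```

Setting max_depth or min_samples_leaf caps this growth, which is why a constrained tree behaves more like a fixed-capacity model.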

| Property | Parametric | Non-parametric |
|---|---|---|
| Complexity | Fixed | Grows with n |
| Memory after training | O(params) | O(n) |
| Assumptions | Functional form assumed | Minimal |
| Small data | Often better | Risk of fitting noise |
| Large data | May underfit | Can model complex patterns |
| Inference speed | Fast (matrix multiply) | Can be slow (O(n) lookup for k-NN) |
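
The inference-speed row is easiest to see in a brute-force k-NN written by hand. This is a minimal sketch in NumPy; the function name knn_predict and the two-blob toy data are assumptions made for illustration, and library implementations may use tree indexes instead of a full scan.

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=5):
    """Brute-force k-NN majority vote for one query point."""
    # One distance per stored training point: O(n * d) work per query.
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]        # indices of the k closest points
    votes = np.bincount(y_train[nearest])  # label counts among the neighbours
    return int(np.argmax(votes))

rng = np.random.default_rng(0)
# Two made-up Gaussian blobs: class 0 around (-2, -2), class 1 around (+2, +2).
X_train = np.vstack([
    rng.normal(-2, 1, size=(100, 2)),
    rng.normal(+2, 1, size=(100, 2)),
])
y_train = np.array([0] * 100 + [1] * 100)

print(knn_predict(X_train, y_train, np.array([-2.0, -2.0])))
print(knn_predict(X_train, y_train, np.array([2.0, 2.0])))
```

There is no training step here at all: the function needs X_train and y_train at prediction time, and every query scans all n stored points.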

Semi-parametric models blend both: a Cox proportional hazards model describes covariate effects with a finite coefficient vector (the parametric hazard ratio) but leaves the baseline hazard non-parametric.

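from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary problem so the snippet runs on its own
X_train, y_train = make_classification(n_samples=1000, n_features=20, random_state=0)

# Parametric: logistic regression stores only weights
lr = LogisticRegression().fit(X_train, y_train)
# lr.coef_ has shape (1, n_features) for binary labels: fixed size, independent of n_train

# Non-parametric: k-NN stores all training points
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
# knn._fit_X (a private scikit-learn attribute) has shape (n_train, n_features): grows with data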
