Machine Learning Easy Asked at AmazonAsked at Google

How does k-nearest neighbours work, and why is it called a lazy learner?

For Data Scientist ML Engineer AI / LLM Engineer Data Analyst

The short answer

KNN stores the entire training set and defers all computation to prediction time: for a new point it finds the k closest training examples by distance, then returns the majority class (classification) or mean value (regression). It is called lazy because there is no training phase — the model is the data itself.

How to think about it

KNN is one of the cleanest examples of instance-based learning. The key interview points are why laziness is both a strength and a scalability liability, and how k controls the bias-variance tradeoff.

How it works

At prediction time for a query point x:

Compute the distance from x to every training point (Euclidean by default; Manhattan or Minkowski also common).
Identify the k training points with the smallest distances.
For classification: return the majority label among those k neighbours (ties broken by distance weighting or arbitrarily).
For regression: return the mean (or distance-weighted mean) of the k neighbours’ target values.

Nothing is computed during “training” — scikit-learn’s fit() simply stores X and y (or builds a k-d tree / ball tree for faster lookup).

Why “lazy”

Lazy (instance-based) learners postpone generalisation. There is no compact model — no weights, no decision boundary equation. The entire training set is the model. This makes training O(1) but prediction O(n · d) naively, or O(d log n) with a k-d tree.

Contrast with eager learners (logistic regression, SVMs, neural networks) that distil training data into parameters; they are slow to train but fast to predict.

Choosing k

Small k (k=1): low bias, high variance — sensitive to noise; memorises the training set.
Large k: high bias, lower variance — decision boundary becomes smoother, but may blur class boundaries.
Choose k by cross-validation; odd k avoids ties in binary classification.

from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score
import numpy as np

best_k = max(range(1, 21), key=lambda k:
    cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
)

knn = KNeighborsClassifier(n_neighbors=best_k, algorithm="auto")
knn.fit(X_train, y_train)

algorithm="auto" selects brute-force, k-d tree, or ball tree based on dimensionality and dataset size.

Learn it properly Curse of Dimensionality

How does k-nearest neighbours work, and why is it called a lazy learner?

How it works

Why “lazy”

Choosing k

Keep practising

Explore further