What's the difference between k-means and k-nearest neighbors? People confuse them.
K-means is an unsupervised clustering algorithm that partitions unlabeled data into k groups by iteratively updating centroids. KNN is a supervised algorithm that classifies or predicts a new point using the labels of its k closest training points. They share the letter k and the use of distances but solve completely different problems.
How to think about it
The crisp answer
They share little beyond the letter k and a reliance on distance. K-means is unsupervised: it groups unlabeled data into k clusters. KNN is supervised: it predicts the label of a new point from the labels of its k nearest neighbors in the training set.
Why the confusion
Both use a distance metric and both have a hyperparameter called k, but k means different things. In k-means, k is the number of clusters you’re partitioning into. In KNN, k is how many neighbors vote on a prediction. Interviewers often ask about KNN vs k-means as a quick clarifying question.
How each works
- K-means: initialize k centroids, assign each point to the nearest centroid, recompute centroids as cluster means, repeat until stable. There’s a training phase that produces centroids.
- KNN: no real training — it’s a lazy learner that stores the data. At prediction time it finds the k closest training points and takes a majority vote (classification) or average (regression).
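The two procedures above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names `kmeans` and `knn_predict`, the toy points, and the "low"/"high" labels are all made up for the example, and the k-means initialization (take the first k points) is a simplifying assumption; real libraries use smarter seeding.

```python
import math
from collections import Counter


def kmeans(points, k, iters=100):
    """Lloyd-style k-means sketch: returns (centroids, assignments)."""
    # Assumption of this sketch: seed centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    assignments = None
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for each point.
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # stable: no point changed cluster
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments


def knn_predict(train_points, train_labels, query, k):
    """KNN classification sketch: majority vote among the k closest points."""
    # No training step: all the work happens here, at query time.
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]  # ties go to the label seen first


# Hypothetical toy data: two well-separated blobs.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9), (0.9, 1.1), (7.8, 8.1)]

# k-means sees no labels; here k is the number of clusters.
centroids, assignments = kmeans(points, k=2)

# KNN needs labels; here k is the number of voting neighbors.
labels = ["low", "high", "low", "high", "low", "high"]
prediction = knn_predict(points, labels, (7.5, 7.5), k=3)
```

Note where the work happens: `kmeans` loops up front and returns centroids you can keep, while `knn_predict` does nothing until a query arrives and then scans every stored point.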
Concrete example
Segmenting customers into 4 groups with no labels → k-means. Predicting whether a new customer churns based on the 5 most similar past customers → KNN.
The common trap
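The trap is saying KNN “trains” a model. It doesn’t: it only stores the training data, so the cost falls at inference and grows with dataset size, which is its main weakness. K-means, for its part, doesn’t classify new points into known categories; it discovers structure in unlabeled data. Both need feature scaling because both rely on distances, which features with large numeric ranges would otherwise dominate. Follow-up: “Which is lazy and which is eager?” KNN is lazy (it defers computation to query time); k-means is eager (it does upfront work to learn centroids).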
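To see why feature scaling matters for any distance-based method, here is a small sketch with made-up customers described by (income, age). The data, the helper names `nearest_index` and `standardizer`, and the choice of standardization (subtract the mean, divide by the standard deviation) are assumptions for illustration; other scalers would make the same point.

```python
import math
from statistics import fmean, pstdev


def nearest_index(train, query):
    """Index of the training point closest to query (Euclidean distance)."""
    return min(range(len(train)), key=lambda i: math.dist(train[i], query))


def standardizer(train):
    """Build a function that rescales a point using the training statistics."""
    columns = list(zip(*train))
    means = [fmean(c) for c in columns]
    stds = [pstdev(c) for c in columns]  # assumes no constant feature (std > 0)
    return lambda p: tuple((x - m) / s for x, m, s in zip(p, means, stds))


# Hypothetical customers: (annual income in dollars, age in years).
customers = [(30000.0, 25.0), (30500.0, 60.0), (60000.0, 26.0)]
new_customer = (30400.0, 27.0)

# Unscaled: income differences swamp age, so the 60-year-old looks closest.
raw_nearest = nearest_index(customers, new_customer)

# Scaled: both features contribute, so the 25-year-old is closest.
scale = standardizer(customers)
scaled_nearest = nearest_index([scale(c) for c in customers], scale(new_customer))
```

With these numbers `raw_nearest` is 1 and `scaled_nearest` is 0. The same distortion affects k-means, since its assignment step uses the same kind of distance.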