Gaussian mixture models
Soft clustering — every point gets a probability of belonging to each cluster, not a hard label. How GMMs model data as a mix of Gaussians, fit with EM, and generalize k-means.
What you'll learn
- GMMs as a probabilistic mixture of Gaussian clusters
- Soft assignment vs k-means' hard assignment
- The EM algorithm intuition and when GMMs beat k-means
Before you start
K-means forces every point into exactly one cluster and assumes those clusters are round, equal-sized blobs. Gaussian mixture models (GMMs) relax both: they model the data as a blend of Gaussian “bells,” give each point a probability of belonging to each one, and allow elongated, differently shaped clusters. GMM is the probabilistic generalization of k-means.
Soft assignment — the key difference
K-means says “point 7 is in cluster B.” A GMM says “point 7 is 70% cluster B, 30% cluster A.” For points deep inside a cluster that distinction barely matters — but for points in the overlap between clusters, the soft probability is far more honest than a coin-flip hard label.
GMMs also fit covariance per cluster — so clusters can be stretched, rotated ellipses, not just circles. That alone lets them fit data k-means mangles.
Fitting with EM
You can’t directly compute the clusters because you don’t know which point came from which Gaussian. The Expectation-Maximization (EM) algorithm solves this by alternating, much like k-means does:
- E-step — given the current Gaussians, compute each point’s probability of belonging to each (soft assignment).
- M-step — given those probabilities, update each Gaussian’s mean, covariance, and weight.
Repeat until it converges. In fact, k-means is a special case of EM with hard assignments and fixed spherical covariance — GMM is the fuller, probabilistic version.
Quick check
Quick check
Next
That completes the clustering family. To visualize any of these clusters in 2D, project with t-SNE & UMAP.
Practice this in an interview
All questionsA GMM models data as a mixture of Gaussian distributions and assigns soft probabilities of cluster membership, fitting clusters that can be elliptical and different sizes via the EM algorithm. K-means does hard assignment to the nearest centroid and implicitly assumes spherical, equal-size clusters. Prefer a GMM when clusters overlap, have different shapes or covariances, or when you need probabilistic (soft) assignments.
EM fits a GMM by alternating two steps: the E-step computes each point's responsibility (posterior probability) under each Gaussian using current parameters, and the M-step updates the means, covariances, and mixing weights to maximize the expected log-likelihood given those responsibilities. It iterates until the likelihood converges. Because the objective is non-convex, EM only reaches a local optimum, so initialization and multiple restarts matter.
K-means requires specifying k upfront, assumes clusters are convex and roughly equal in size and density, is sensitive to outliers and feature scale, and can converge to local minima. It struggles with non-globular shapes such as rings or crescents, and it assigns every point to exactly one cluster with no notion of uncertainty.
K-means partitions n points into k clusters by alternating between two steps: assigning each point to its nearest centroid, then recomputing each centroid as the mean of its assigned points. It repeats until assignments stop changing, which guarantees convergence but not a globally optimal solution.