datarekha

DBSCAN & hierarchical clustering

When k-means fails on non-blob shapes, density-based and hierarchical clustering succeed. DBSCAN finds arbitrary shapes and outliers with no preset k; hierarchical builds a tree you cut at any level.

7 min read Intermediate Machine Learning Lesson 27 of 33

What you'll learn

  • How DBSCAN grows clusters by density (core points, eps, minPts) and flags noise
  • Why it handles arbitrary shapes and needs no preset k
  • Hierarchical clustering and the dendrogram you cut at any level

Before you start

k-means is fast and popular, but it has three rigid assumptions: clusters are round, similarly sized, and you know how many there are. Break any of those and it fails. DBSCAN and hierarchical clustering are the shape-flexible alternatives that handle the cases k-means can’t.

DBSCAN — clustering by density

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) has a beautifully different idea: a cluster is a dense region of points. It uses two parameters:

  • eps — the neighborhood radius.
  • minPts — how many neighbors a point needs (within eps) to be a core point.

The algorithm grows clusters by chaining core points together, and any point that’s neither a core point nor reachable from one is labeled noise. Watch it trace two crescents that break k-means:

That gives DBSCAN three powers k-means lacks:

  1. Arbitrary shapes — it follows curves, rings, and blobs, because it grows by local density, not distance to a center.
  2. No preset k — it discovers the number of clusters from the data.
  3. Built-in outlier detection — sparse points become noise instead of being forced into a cluster.

Hierarchical clustering — a tree of clusters

The other shape-flexible approach builds a hierarchy. Agglomerative clustering starts with every point as its own cluster and repeatedly merges the two closest clusters until one remains, recording the whole sequence as a dendrogram — a tree. You then “cut” the tree at a height to get however many clusters you want.

The advantage: you don’t commit to k up front — you see the structure at every scale and pick the cut afterward. The cost: it’s slower (roughly O(n²) or worse), so it’s best for smaller datasets, and the linkage choice (how you measure distance between clusters — single, complete, average, Ward) changes the result.

Quick check

Quick check

0/3
Q1What makes a point a 'core point' in DBSCAN?
Q2Your data has two interleaved crescents. Why does DBSCAN succeed where k-means fails?
Q3What is a dendrogram in hierarchical clustering?

Next

To see high-dimensional clusters before you cluster them, project with t-SNE & UMAP. And to find the points that aren’t in any cluster, see anomaly detection.

Sign in to track your progress

Completed lessons, your XP, level, and streak save to your account — it's free and takes a few seconds.

Practice this in an interview

All questions
How do hierarchical clustering and DBSCAN differ from k-means?

Hierarchical clustering builds a tree of nested merges or splits and does not require specifying k upfront, but it is O(n² log n) and cannot revise early decisions. DBSCAN finds arbitrarily shaped clusters by density reachability, naturally marks outliers as noise, and also needs no k — but its results are sensitive to the eps and min_samples hyperparameters.

When would you use DBSCAN instead of k-means, and what are its main limitations?

Use DBSCAN when clusters have arbitrary, non-spherical shapes, when the number of clusters is unknown, and when you need to detect outliers, since it groups by density and labels low-density points as noise. Its main limitations are sensitivity to the eps and minPts parameters and difficulty when clusters have very different densities. It also struggles in high dimensions where distance becomes unreliable.

What are the main limitations of k-means clustering?

K-means requires specifying k upfront, assumes clusters are convex and roughly equal in size and density, is sensitive to outliers and feature scale, and can converge to local minima. It struggles with non-globular shapes such as rings or crescents, and it assigns every point to exactly one cluster with no notion of uncertainty.

How does hierarchical clustering work, and how do you decide the number of clusters from a dendrogram?

Agglomerative hierarchical clustering starts with each point as its own cluster and repeatedly merges the two closest clusters using a linkage rule (single, complete, average, or Ward) until one cluster remains, producing a dendrogram. You choose the number of clusters by cutting the dendrogram at a height where merges jump sharply, which indicates joining dissimilar groups. Unlike k-means it needs no preset k but is computationally expensive.

Related lessons

Explore further

Skip to content