Machine Learning Medium Asked at AmazonAsked at Palantir

How do hierarchical clustering and DBSCAN differ from k-means?

For Data Scientist ML Engineer AI / LLM Engineer

The short answer

Hierarchical clustering builds a tree of nested merges or splits and does not require specifying k upfront, but it is O(n² log n) and cannot revise early decisions. DBSCAN finds arbitrarily shaped clusters by density reachability, naturally marks outliers as noise, and also needs no k — but its results are sensitive to the eps and min_samples hyperparameters.

How to think about it

K-means, hierarchical, and DBSCAN are three distinct philosophies. Pick the right one based on data shape, scale, and whether outlier detection matters.

Hierarchical clustering

Agglomerative (bottom-up): start with each point as its own cluster and greedily merge the pair that minimises a linkage criterion — complete (max distance), average, or Ward (minimises inertia increase). The result is a dendrogram that encodes every possible k simultaneously; you choose k by cutting the tree at a height.

Strengths: no k needed upfront; the dendrogram reveals nested structure; works with any distance metric.

Weaknesses: O(n² log n) time and O(n²) memory — painful above ~50 k points; early merges are irreversible, so one bad merge propagates.

DBSCAN

DBSCAN labels points as:

Core: has at least min_samples neighbours within radius eps.
Border: within eps of a core point but not itself core.
Noise: neither — these become the -1 label.

Clusters grow by density reachability: core points that are within eps of each other belong to the same cluster. Clusters can be any shape.

Strengths: discovers arbitrary shapes; built-in outlier detection; no k needed.

Weaknesses: two hyperparameters (eps, min_samples) that are hard to set; fails when clusters have varying density (HDBSCAN fixes this); performance degrades in high dimensions because distance concentration makes eps meaningless.

Quick comparison

Property	k-means	Hierarchical	DBSCAN
Needs k	Yes	No (cut later)	No
Cluster shape	Convex	Any (with right linkage)	Any
Outliers	Forced into cluster	Forced into cluster	Labeled as noise
Scale	O(nk)	O(n² log n)	O(n log n) with index

from sklearn.cluster import AgglomerativeClustering, DBSCAN

# Hierarchical — choose k=3 after inspecting dendrogram
hc = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels_hc = hc.fit_predict(X)

# DBSCAN — tune eps with a k-distance plot
db = DBSCAN(eps=0.5, min_samples=5)
labels_db = db.fit_predict(X)  # -1 = noise

How do hierarchical clustering and DBSCAN differ from k-means?

Hierarchical clustering

DBSCAN

Quick comparison

Keep practising

Explore further