datarekha
Machine Learning Medium Asked at AmazonAsked at Palantir

How do hierarchical clustering and DBSCAN differ from k-means?

The short answer

Hierarchical clustering builds a tree of nested merges or splits and does not require specifying k upfront, but it is O(n² log n) and cannot revise early decisions. DBSCAN finds arbitrarily shaped clusters by density reachability, naturally marks outliers as noise, and also needs no k — but its results are sensitive to the eps and min_samples hyperparameters.

How to think about it

K-means, hierarchical, and DBSCAN are three distinct philosophies. Pick the right one based on data shape, scale, and whether outlier detection matters.

Hierarchical clustering

Agglomerative (bottom-up): start with each point as its own cluster and greedily merge the pair that minimises a linkage criterion — complete (max distance), average, or Ward (minimises inertia increase). The result is a dendrogram that encodes every possible k simultaneously; you choose k by cutting the tree at a height.

Strengths: no k needed upfront; the dendrogram reveals nested structure; works with any distance metric.

Weaknesses: O(n² log n) time and O(n²) memory — painful above ~50 k points; early merges are irreversible, so one bad merge propagates.

DBSCAN

DBSCAN labels points as:

  • Core: has at least min_samples neighbours within radius eps.
  • Border: within eps of a core point but not itself core.
  • Noise: neither — these become the -1 label.

Clusters grow by density reachability: core points that are within eps of each other belong to the same cluster. Clusters can be any shape.

Strengths: discovers arbitrary shapes; built-in outlier detection; no k needed.

Weaknesses: two hyperparameters (eps, min_samples) that are hard to set; fails when clusters have varying density (HDBSCAN fixes this); performance degrades in high dimensions because distance concentration makes eps meaningless.

Quick comparison

Propertyk-meansHierarchicalDBSCAN
Needs kYesNo (cut later)No
Cluster shapeConvexAny (with right linkage)Any
OutliersForced into clusterForced into clusterLabeled as noise
ScaleO(nk)O(n² log n)O(n log n) with index
from sklearn.cluster import AgglomerativeClustering, DBSCAN

# Hierarchical — choose k=3 after inspecting dendrogram
hc = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels_hc = hc.fit_predict(X)

# DBSCAN — tune eps with a k-distance plot
db = DBSCAN(eps=0.5, min_samples=5)
labels_db = db.fit_predict(X)  # -1 = noise

Keep practising

All Machine Learning questions

Explore further

Skip to content