datarekha
Machine Learning · Medium · Asked at Google, DeepMind, Meta

What are t-SNE and UMAP, how do they differ from PCA, and what are their limitations for ML workflows?

The short answer

t-SNE and UMAP are nonlinear dimensionality reduction algorithms designed primarily for 2D/3D visualization of high-dimensional data. Unlike PCA, which keeps the directions of maximum global variance, they preserve local neighborhood structure, which tends to produce cleaner cluster separation in plots. Neither is a good default preprocessing step for training a supervised model: t-SNE is transductive (it cannot embed unseen points), UMAP can embed unseen points only approximately, and both produce layouts that change across runs and hyperparameter settings.

How to think about it

If PCA is a projector that preserves global variance, t-SNE and UMAP are neighborhood maps: they ask “which points are near each other?” and try to replicate that neighborhood in 2D.

t-SNE

t-SNE (t-distributed Stochastic Neighbor Embedding) models pairwise similarity in high-dimensional space with a Gaussian kernel, then tries to match those similarities in 2D using a Student-t kernel. The heavy-tailed t kernel pushes dissimilar clusters apart, producing visually clean separations.
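To see why the heavy tail pushes clusters apart, the sketch below asks how far apart two points must sit to reach a given similarity under each kernel. It is a simplified illustration: it uses unnormalized kernels with a fixed sigma of 1, whereas t-SNE normalizes similarities and tunes a per-point bandwidth from the perplexity. The function names are made up for this example.

```python
import numpy as np

def gaussian_distance(similarity, sigma=1.0):
    """Distance d at which exp(-d^2 / (2 * sigma^2)) equals `similarity`."""
    return sigma * np.sqrt(-2.0 * np.log(similarity))

def student_t_distance(similarity):
    """Distance d at which 1 / (1 + d^2) equals `similarity`."""
    return np.sqrt(1.0 / similarity - 1.0)

# From fairly similar pairs (0.5) down to very dissimilar pairs (0.001).
similarities = np.array([0.5, 0.1, 0.01, 0.001])
d_gauss = gaussian_distance(similarities)
d_t = student_t_distance(similarities)

for s, dg, dt in zip(similarities, d_gauss, d_t):
    print(f"similarity={s:<6} gaussian d={dg:.2f}  student-t d={dt:.2f}")
```

For low similarities the Student-t kernel needs a much larger distance than the Gaussian to express the same value, so dissimilar points end up further apart in the 2D layout.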

Characteristics:

  • Excellent for revealing cluster structure in visualization.
  • Stochastic: different random seeds produce different layouts, so fix the seed when a plot must be reproducible.
  • Does not preserve global distances: two clusters being far apart in the plot may or may not mean they are far apart in the original space.
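  • Exact t-SNE costs O(n²) in time and memory; the Barnes-Hut approximation (the default in scikit-learn) reduces this to O(n log n), but it is still slow for n > ~50,000 rows.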
  • The perplexity hyperparameter (typical range 5–50) controls effective neighborhood size and strongly influences the output.
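The seed and perplexity sensitivity listed above can be checked directly. This is a minimal sketch using scikit-learn's TSNE on a 300-sample slice of the bundled digits dataset; the dataset, the subset size, and the helper name `embed` are choices made for this illustration.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Small subset so the example runs quickly.
X, y = load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X[:300])

def embed(seed, perplexity=30):
    """Run t-SNE with a random initialization controlled by `seed`."""
    tsne = TSNE(n_components=2, perplexity=perplexity,
                init="random", random_state=seed)
    return tsne.fit_transform(X)

emb_seed0 = embed(seed=0)
emb_seed1 = embed(seed=1)                          # same data, different seed
emb_low_perplexity = embed(seed=0, perplexity=5)   # smaller neighborhoods

print("same layout across seeds:", np.allclose(emb_seed0, emb_seed1))
print("same layout across perplexities:", np.allclose(emb_seed0, emb_low_perplexity))
```

Plotting the three embeddings side by side, colored by `y[:300]`, is a quick way to judge which features of the layout are stable and which are artifacts of a single run.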

UMAP

UMAP (Uniform Manifold Approximation and Projection) constructs a weighted graph of nearest neighbors in high-dimensional space, then optimizes a low-dimensional embedding to preserve that graph structure.
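The first step, building the nearest-neighbor graph, can be sketched with scikit-learn. This is only an illustration of the graph's shape: UMAP itself uses an approximate neighbor search and converts distances into fuzzy membership weights, which this snippet does not do. The random data and variable names are assumptions for the example.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

# Synthetic high-dimensional data: 200 points in 50 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))

n_neighbors = 15  # plays the same role as UMAP's n_neighbors parameter

# Sparse matrix: row i stores the distances from point i to its 15 nearest neighbors.
knn = kneighbors_graph(X, n_neighbors=n_neighbors, mode="distance", include_self=False)

print("graph shape:", knn.shape)
print("stored edges:", knn.nnz)
```

Each point keeps only `n_neighbors` edges, so the graph stays sparse as the dataset grows, which is part of why this family of methods scales well.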

Advantages over t-SNE:

  • Much faster, scales to millions of points.
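  • Often preserves global structure better (relative cluster positions are more meaningful, though still not exact).
  • Supports transform() on new data, which partially addresses t-SNE's transductive limitation.
  • Fewer sensitive hyperparameters: mainly n_neighbors (neighborhood size) and min_dist (how tightly points pack together).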
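
A minimal usage example with the umap-learn package (X and X_new are assumed to be existing feature arrays):

import umap  # provided by the umap-learn package
from sklearn.preprocessing import StandardScaler

# Fit the scaler once so new data can be scaled the same way later.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

reducer = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1, random_state=42)
embedding = reducer.fit_transform(X_scaled)

# For new points (approximate): reuse the fitted scaler, then project.
X_new_scaled = scaler.transform(X_new)
new_embedding = reducer.transform(X_new_scaled)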

PCA vs t-SNE vs UMAP

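| Property              | PCA       | t-SNE | UMAP        |
|-----------------------|-----------|-------|-------------|
| Linear                | Yes       | No    | No          |
| Global structure      | Yes       | Weak  | Moderate    |
| Speed                 | Fast      | Slow  | Fast        |
| New-data transform    | Yes       | No    | Approximate |
| Interpretable axes    | Partially | No    | No          |
| Use in model pipeline | Yes       | No    | Caution     |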
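One way to put a number on local-neighborhood preservation is scikit-learn's `trustworthiness` score, which ranges from 0 to 1 and penalizes points that appear as neighbors in the embedding but were not neighbors in the original space. The sketch below compares PCA and t-SNE on a 500-sample slice of the digits dataset; the dataset, subset size, and `n_neighbors=10` are choices made for this illustration, and the exact scores will vary with them.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, trustworthiness
from sklearn.preprocessing import StandardScaler

X, _ = load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X[:500])

# Two-dimensional embeddings from a linear and a nonlinear method.
emb_pca = PCA(n_components=2, random_state=0).fit_transform(X)
emb_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Fraction-like score of how well 10-nearest-neighbor structure is retained.
trust_pca = trustworthiness(X, emb_pca, n_neighbors=10)
trust_tsne = trustworthiness(X, emb_tsne, n_neighbors=10)

print(f"PCA   trustworthiness: {trust_pca:.3f}")
print(f"t-SNE trustworthiness: {trust_tsne:.3f}")
```

A high score only says that local neighborhoods are preserved; it says nothing about whether distances between clusters are meaningful.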

When to use each

Use PCA for preprocessing before supervised models. Use t-SNE or UMAP for exploratory visualization: inspecting cluster quality, detecting label overlap, or sanity-checking embeddings. Avoid using t-SNE or UMAP output as feature inputs for a supervised classifier: the embedding changes with the data and the random seed, t-SNE cannot be applied to a held-out test set at all, and UMAP's transform() is only approximate.
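As a sketch of the PCA-as-preprocessing pattern, the pipeline below fits the scaler and PCA on the training split only and then applies the same fitted transform to the test split. The digits dataset, the 20 components, and logistic regression are arbitrary choices for this illustration.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scaling and PCA are learned from the training data only; the pipeline
# reuses the same fitted projection when scoring the held-out data.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=20),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.3f}")
```

This works because PCA is a fixed linear map once fitted, so test points go through exactly the same projection as training points. t-SNE offers no equivalent step.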
