Machine Learning · Medium · Asked at Google, DeepMind, Meta

What are t-SNE and UMAP, how do they differ from PCA, and what are their limitations for ML workflows?

For: Data Scientist, ML Engineer, AI / LLM Engineer

The short answer

t-SNE and UMAP are nonlinear dimensionality reduction algorithms designed primarily for 2D/3D visualization of high-dimensional data. Unlike PCA, they preserve local neighborhood structure rather than global variance, producing cleaner cluster separations in plots. Neither is a good default preprocessing step for training a supervised model: t-SNE is transductive (it has no mapping for unseen points), UMAP's mapping of new points is only approximate, and both produce layouts that change from run to run unless the random seed is fixed.

How to think about it

If PCA is a projector that preserves global variance, t-SNE and UMAP are neighborhood maps: they ask “which points are near each other?” and try to replicate that neighborhood in 2D.

t-SNE

t-SNE (t-distributed Stochastic Neighbor Embedding) models pairwise similarity in high-dimensional space with a Gaussian kernel, then tries to match those similarities in 2D using a Student-t kernel, by minimizing the KL divergence between the two similarity distributions. The heavy-tailed t kernel lets moderately dissimilar points sit far apart in the plot, which relieves crowding and produces visually clean separations between clusters.
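The matching step can be written out explicitly. What follows is the standard formulation from the original t-SNE paper (van der Maaten and Hinton, 2008), not something specific to this page; the symbols x_i (original points), y_i (2D points), σ_i (per-point Gaussian bandwidth) and n (number of points) are notation introduced here for illustration.

```latex
% High-dimensional similarities: Gaussian kernel, bandwidth \sigma_i per point
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}

% Low-dimensional similarities: Student-t kernel with one degree of freedom
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

% Objective minimized by gradient descent over the y_i
C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

Each σ_i is chosen so that the conditional distribution around x_i has the user-specified perplexity, which is why that single hyperparameter sets the effective neighborhood size.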

Characteristics:

Excellent for revealing cluster structure in visualization.
Stochastic: different random seeds produce different layouts, so fix the seed when a plot needs to be reproducible.
Does not preserve global distances: two clusters being far apart in the plot may or may not mean they are far apart in the original space.
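Exact t-SNE has quadratic time complexity O(n²); the Barnes-Hut approximation (the default in scikit-learn) reduces this to O(n log n), but runs still get slow beyond roughly 50,000 rows.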
The perplexity hyperparameter (typical range 5–50) controls effective neighborhood size and strongly influences the output.
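The seed and perplexity sensitivity listed above can be seen directly with scikit-learn's `TSNE`. This is a minimal sketch, not a benchmark: the digits dataset, the 300-row subset, and the helper name `embed` are illustrative choices that do not come from the original text.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Small subset (assumption: 300 rows) so the sketch runs in a few seconds.
X, y = load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X[:300])


def embed(perplexity, seed):
    """Hypothetical helper: run t-SNE once with the given settings."""
    tsne = TSNE(
        n_components=2,
        perplexity=perplexity,  # effective neighborhood size
        init="random",          # random start, so the seed matters
        random_state=seed,
    )
    return tsne.fit_transform(X)


emb_a = embed(perplexity=30, seed=0)
emb_b = embed(perplexity=30, seed=1)  # same settings, different seed
emb_c = embed(perplexity=5, seed=0)   # same seed, smaller neighborhoods

# The three layouts have the same shape but different coordinates.
print(emb_a.shape, np.allclose(emb_a, emb_b), np.allclose(emb_a, emb_c))
```

Plotting the three embeddings side by side, colored by `y[:300]`, is the usual way to judge whether a cluster pattern is robust or an artifact of one particular run.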

UMAP

UMAP (Uniform Manifold Approximation and Projection) constructs a weighted graph of nearest neighbors in high-dimensional space, then optimizes a low-dimensional embedding to preserve that graph structure.

Advantages over t-SNE:

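Much faster on large datasets; scales to millions of points.
Often better preservation of global structure (relative cluster positions tend to be more meaningful, though still not fully reliable).
Supports transform() on new data, which partially addresses t-SNE's transductive limitation.
Two main hyperparameters (n_neighbors and min_dist), whose defaults work reasonably well on many datasets.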

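import umap  # provided by the umap-learn package
from sklearn.preprocessing import StandardScaler

# X: training feature matrix, shape (n_samples, n_features)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
reducer = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1, random_state=42)
embedding = reducer.fit_transform(X_scaled)

# For new points (approximate): reuse the fitted scaler, do not refit it.
# X_new: unseen rows with the same columns as X
X_new_scaled = scaler.transform(X_new)
new_embedding = reducer.transform(X_new_scaled)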

PCA vs t-SNE vs UMAP

Property	PCA	t-SNE	UMAP
Linear	Yes	No	No
Global structure	Yes	Weak	Moderate
Speed	Fast	Slow	Fast
New-data transform	Yes	No	Approximate
Interpretable axes	Partially	No	No
Use in model pipeline	Yes	No	Caution
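The local-structure side of this comparison can be checked numerically rather than by eye. One option is scikit-learn's `trustworthiness` score, which measures how well each point's nearest neighbors in the embedding match its neighbors in the original space (1.0 is best). The sketch below is illustrative only: the digits subset and `n_neighbors=5` are assumptions, and UMAP is left out to keep the example dependent on scikit-learn alone.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, trustworthiness
from sklearn.preprocessing import StandardScaler

# Small subset (assumption: 300 rows) so t-SNE finishes quickly.
X, _ = load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X[:300])

# Two 2D embeddings of the same data: one linear, one neighborhood-based.
X_pca = PCA(n_components=2, random_state=0).fit_transform(X)
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Fraction-style score in [0, 1]: are embedded neighbors true neighbors?
trust_pca = trustworthiness(X, X_pca, n_neighbors=5)
trust_tsne = trustworthiness(X, X_tsne, n_neighbors=5)

print(f"PCA   trustworthiness: {trust_pca:.3f}")
print(f"t-SNE trustworthiness: {trust_tsne:.3f}")
```

On data with real cluster structure, the neighborhood-based method typically scores higher here. The score says nothing about global distances, which is where PCA keeps its advantage.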

When to use each

Use PCA for preprocessing before supervised models. Use t-SNE or UMAP for exploratory visualization: inspecting cluster quality, detecting label overlap, or sanity-checking embeddings. Do not use t-SNE embeddings as feature inputs for a supervised classifier: the embedding changes between runs and datasets, and t-SNE has no way to map a held-out test set into it. UMAP can transform new points, but only approximately, so treat it with caution in a pipeline and validate it on held-out data.
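A minimal sketch of the PCA recommendation, assuming the scikit-learn digits dataset, 20 components, and a logistic regression classifier (all illustrative choices, not from the original text). The point is that PCA is fitted on the training split only and then applied unchanged to the test split, which scikit-learn's `TSNE` cannot do because it has no `transform` method.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scaler and PCA are fitted inside the pipeline, on training data only.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=20),
    LogisticRegression(max_iter=2000),
)
model.fit(X_train, y_train)

# The same fitted projection is reused on the held-out split.
test_accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.3f}")

# scikit-learn's TSNE exposes fit and fit_transform but no transform,
# so it cannot be dropped into this pipeline as a feature step.
tsne_has_transform = hasattr(TSNE, "transform")
print("TSNE has transform():", tsne_has_transform)
```

Wrapping the reduction step in the pipeline also keeps cross-validation honest, since the projection is refitted on each training fold instead of seeing the validation rows.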
