datarekha
Machine Learning | Easy | Asked at Google, Amazon, Meta, Microsoft

What is the difference between supervised, unsupervised, and reinforcement learning?

The short answer

Supervised learning trains on labeled input-output pairs to predict a target. Unsupervised learning finds structure in unlabeled data. Reinforcement learning trains an agent to maximize cumulative reward through trial-and-error interaction with an environment.

How to think about it

The three paradigms differ in the signal that drives learning.

Supervised learning — every training example carries a ground-truth label y. The model learns a mapping f(x) → y by minimizing a loss between predictions and labels. Examples: spam detection (label = spam/not-spam), house-price regression (label = sale price).
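A minimal sketch of the supervised setup, assuming NumPy and a made-up house-price dataset (the numbers and the helper names `fit_linear` and `predict` are illustrative, not from any real source). It fits f(x) → y by least squares, which is one way of minimizing a squared-error loss between predictions and labels:

```python
import numpy as np

# Hypothetical toy data: house size (hundreds of sq ft) -> sale price (thousands).
# The labels y are the supervision signal; here they follow y = 20 * x exactly.
X = np.array([[10.0], [15.0], [20.0], [25.0]])
y = np.array([200.0, 300.0, 400.0, 500.0])

def add_bias(X):
    """Append a column of ones so the model can learn an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_linear(X, y):
    """Least-squares fit: minimizes squared error between predictions and labels."""
    w, *_ = np.linalg.lstsq(add_bias(X), y, rcond=None)
    return w  # [slope, intercept]

def predict(w, X):
    return add_bias(X) @ w

w = fit_linear(X, y)
mse = float(np.mean((predict(w, X) - y) ** 2))  # training loss
pred = predict(w, np.array([[30.0]]))           # prediction for an unseen input
```

The key point is that the loss is computed against known labels; without `y`, there would be nothing to fit against.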

Unsupervised learning — no labels exist. The algorithm discovers latent structure: clusters (k-means), lower-dimensional representations (PCA, autoencoders), or density estimates. Example: segmenting customers by purchase behavior without pre-defined groups.
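As a rough illustration of discovering structure without labels, here is a bare-bones k-means sketch, assuming NumPy. The customer data is invented, and initializing centroids from the first k points is a simplification (real implementations usually use random or k-means++ initialization):

```python
import numpy as np

def kmeans(X, k, n_iters=20):
    """Plain k-means: alternate between assigning points and updating centroids."""
    centroids = X[:k].copy()  # simplistic init: first k points (an assumption)
    for _ in range(n_iters):
        # Distance from every point to every centroid, shape (n_points, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Hypothetical customers: (orders per month, average basket size).
X = np.array([
    [1.0, 1.0], [1.5, 2.0], [1.0, 0.5],   # low-activity customers
    [8.0, 8.0], [9.0, 9.5], [8.5, 9.0],   # high-activity customers
])
labels, centroids = kmeans(X, k=2)
```

No group names are ever supplied; the two segments emerge from the geometry of the data alone, and the cluster ids themselves are arbitrary.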

Reinforcement learning (RL): an agent observes state s, takes action a, receives scalar reward r, and transitions to a new state s′. It learns a policy π(s) → a that maximizes expected cumulative (discounted) reward. No explicit target action is given; the reward signal is often delayed and sparse, so the agent must work out which earlier actions led to it. Examples: AlphaGo, robotics control, recommendation systems tuned for engagement.
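The state, action, reward loop can be sketched with tabular Q-learning, one of several RL algorithms. The five-state corridor environment, the `step` function, and all hyperparameters below are hypothetical choices for illustration. The only reward arrives on reaching the final state, which shows the delayed, sparse signal:

```python
import random

N_STATES = 5      # states 0..4; state 4 is the terminal goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GAMMA = 0.9       # discount factor

def step(state, action):
    """Hypothetical toy environment: reward 1.0 only on reaching the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy: explore sometimes, and break ties at random.
            if rng.random() < epsilon or Q[s][0] == Q[s][1]:
                a = rng.choice(ACTIONS)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s2, r, done = step(s, a)
            # Bootstrapped target: reward now plus discounted best future value.
            target = r if done else r + GAMMA * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

Q = q_learning()
# Greedy policy for the non-terminal states 0..3.
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
```

Nobody tells the agent that "right" is correct. The value of the final reward propagates backward through the Q-table over many episodes, and the greedy policy should end up moving right in every non-terminal state.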

Supervised: labeled data; X → Y; classification, regression
Unsupervised: unlabeled data; X → structure; clustering, density
Reinforcement: agent + reward; s, a → r; policy optimization
Three ML paradigms differentiated by the training signal

A common hybrid is self-supervised learning (used to pretrain LLMs): labels are derived automatically from the data itself (e.g., next-token prediction), so it scales to raw unlabeled data like unsupervised learning while still optimizing a supervised-style prediction loss.
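To make "labels derived from the data itself" concrete, the sketch below turns raw text into (context, target) pairs for next-token prediction. The `next_token_pairs` helper and the whitespace tokenization are simplifying assumptions; production LLMs use subword tokenizers and much longer contexts:

```python
def next_token_pairs(tokens, context_size):
    """Build (context, target) training pairs from a token sequence.

    No human annotation is involved: each target is simply the token
    that follows its context in the original text.
    """
    pairs = []
    for i in range(context_size, len(tokens)):
        context = tuple(tokens[i - context_size:i])
        pairs.append((context, tokens[i]))
    return pairs

tokens = "the cat sat on the mat".split()  # naive whitespace tokenization
pairs = next_token_pairs(tokens, context_size=2)
# First pair: context ("the", "cat") with target "sat".
```

Once the pairs exist, training proceeds exactly like supervised classification over the vocabulary.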
