datarekha
Statistics & Probability · Easy · Asked at Amazon, Meta, Google

When should you use mean vs median vs mode, and which is most robust to outliers?

The short answer

Mean is the most efficient choice for symmetric, outlier-free data; median is the go-to for skewed distributions or when outliers are real rather than errors; mode is the only sensible average for nominal/categorical data. Robustness is a formal concept: the median's breakdown point is 50%, meaning just under half the data can be corrupted before it can be pushed arbitrarily far, while the mean's breakdown point is essentially 0%.

How to think about it

State which measure, say why in terms of the data-generating process, then mention the breakdown-point framing — that signals you know the formal definition of robustness, not just the vague intuition.

When each measure applies

Mean uses every value in its computation, so it extracts maximum information when data are roughly symmetric with light tails. The sample mean is the maximum-likelihood estimator for the center of a Gaussian. A small company’s payroll before and after hiring one billionaire shows how it can mislead: a single extreme value pulls the mean far from the typical employee’s pay.
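The payroll effect is easy to check numerically. The sketch below uses made-up salary figures (they are assumptions for illustration, not real data) and only Python's standard library.

```python
from statistics import mean, median

# Hypothetical salaries (USD) at a ten-person company; the figures are made up.
salaries = [48_000, 52_000, 55_000, 58_000, 60_000,
            62_000, 65_000, 70_000, 75_000, 80_000]

# One new hire with an extreme income joins the payroll.
with_outlier = salaries + [1_000_000_000]

mean_before, mean_after = mean(salaries), mean(with_outlier)
median_before, median_after = median(salaries), median(with_outlier)

print(f"mean:   {mean_before:,.0f} -> {mean_after:,.0f}")
print(f"median: {median_before:,.0f} -> {median_after:,.0f}")
```

With these numbers the mean moves from 62,500 to roughly 91 million, while the median only moves from 61,000 to 62,000.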

Median is the 50th percentile; it ignores magnitude and cares only about rank. It is preferred for right-skewed distributions (income, house prices, response times), for Likert scales, and whenever outliers reflect the real world rather than measurement error — you still want to summarize the bulk of the distribution.
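A small sketch of what "cares only about rank" means in practice, using hypothetical response times (the values are invented for illustration). Stretching the largest observation leaves the median untouched, and a rank-preserving transform such as the logarithm commutes with the median but not with the mean.

```python
import math
from statistics import mean, median

# Hypothetical right-skewed response times in milliseconds. The count is odd,
# so the median is a single observed value rather than an average of two.
times_ms = [80, 95, 110, 120, 150, 210, 400, 900, 5_000]

# Stretching the largest value changes its magnitude but not its rank.
stretched = times_ms[:-1] + [500_000]

median_unchanged = median(times_ms) == median(stretched)
mean_unchanged = mean(times_ms) == mean(stretched)

# A monotone transform preserves ranks, so median(log x) == log(median x).
log_times = [math.log(t) for t in times_ms]
median_commutes = median(log_times) == math.log(median(times_ms))
mean_commutes = math.isclose(mean(log_times), math.log(mean(times_ms)))

print(median_unchanged, mean_unchanged)   # median holds, mean does not
print(median_commutes, mean_commutes)     # median commutes, mean does not
```

The exact equality for the log transform relies on the odd sample size; with an even count the median averages two middle values and the identity holds only approximately.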

Mode is the most frequent value. It is the only meaningful “average” for nominal categories (e.g., “most common browser”). For continuous data a mode is rarely stable, but kernel density modes — local maxima of a density estimate — are used to identify multimodal structure.
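For nominal data the mode is a one-liner in Python's standard library. The browser list below is invented for illustration; the last lines hint at why a raw mode is unstable for continuous measurements, where every distinct value ties for "most frequent".

```python
from collections import Counter
from statistics import mode, multimode

# Hypothetical browser labels from a handful of sessions.
browsers = ["Chrome", "Safari", "Chrome", "Firefox", "Chrome", "Safari", "Edge"]

most_common_browser = mode(browsers)   # single most frequent label
counts = Counter(browsers)             # full frequency table

# multimode returns every value tied for the highest count,
# in order of first appearance.
tied = multimode(["red", "blue", "red", "blue", "green"])

# With continuous measurements, repeats are rare, so every value is a "mode".
continuous_modes = multimode([2.31, 2.35, 2.40])

print(most_common_browser, dict(counts))
print(tied, continuous_modes)
```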

Breakdown point

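The breakdown point of an estimator is the smallest fraction of the data that, when replaced by arbitrary values, can push the estimate arbitrarily far from its original value. Mean: 1/n for a sample of size n, so ≈ 0 (a single sufficiently extreme value is enough). Median: 0.50 in the limit. A trimmed mean that discards a fraction α from each tail has breakdown point α, so trimmed means sit in between and are often a practical compromise when tails are heavy but you still want to use magnitude information.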
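The breakdown behaviour can be seen by contaminating a clean sample step by step. This is a minimal sketch under stated assumptions: `trimmed_mean` and `contaminate` are hypothetical helpers written for this example, the clean data are simply the integers 1 to 101, and 1e12 stands in for an "arbitrary" corrupted value.

```python
from statistics import mean, median

def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the points from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)   # points cut from each end
    return mean(ordered[k:len(ordered) - k])

def contaminate(values, n_bad, bad_value=1e12):
    """Replace the first n_bad observations with an arbitrary huge value."""
    return [bad_value] * n_bad + list(values[n_bad:])

clean = list(range(1, 102))   # 101 points: 1..101, median 51

results = {}
for n_bad in (0, 1, 10, 11, 50, 51):
    data = contaminate(clean, n_bad)
    results[n_bad] = (mean(data), trimmed_mean(data, 0.10), median(data))

for n_bad, (m, tm, med) in results.items():
    print(f"{n_bad:>2} bad: mean={m:.3g}  trimmed(10%)={tm:.3g}  median={med:.3g}")
```

In this setup the mean is ruined by a single bad point, the 10% trimmed mean survives 10 bad points out of 101 but not 11, and the median survives 50 but not 51.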

Quick decision rule

Data type                  Recommended
Symmetric, no outliers     Mean
Skewed or heavy-tailed     Median
Nominal / categorical      Mode
Ordinal with outliers      Median