When should you use mean vs median vs mode, and which is most robust to outliers?
The mean is the best choice for roughly symmetric data without outliers; the median is the go-to for skewed distributions or when the outliers are genuine observations rather than errors; the mode is the only sensible average for nominal (categorical) data. Robustness has a formal definition: the median's breakdown point is 50%, meaning nearly half the data can be corrupted before the median can be pushed arbitrarily far, while the mean's breakdown point is 1/n, which tends to 0% as the sample grows.
How to think about it
State which measure you would use, explain why in terms of the data-generating process, then mention the breakdown-point framing. That signals you know the formal definition of robustness, not just the vague intuition.
When each measure applies
The mean uses every value in its computation, so it makes full use of the information in the sample when the data are roughly symmetric with light tails. For Gaussian data it is the maximum-likelihood estimator of the center. The same property makes it fragile: average pay at a small company before and after hiring one person with an enormous salary shows how a single extreme point can shift the mean far from the typical employee's pay.
The median is the 50th percentile; it depends only on the rank order of the values, not on how far the extremes lie from the center. It is preferred for right-skewed distributions (income, house prices, response times), for ordinal data such as Likert scales, and whenever outliers reflect the real world rather than measurement error, because you still want to summarize the bulk of the distribution.
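To make the contrast between the two measures concrete, here is a minimal sketch of the salary illustration. All figures are invented for the example, including the new hire's pay; only the standard-library `statistics` module is used.

```python
from statistics import mean, median

# Hypothetical annual salaries (in dollars) at a ten-person company.
salaries = [48_000, 52_000, 55_000, 58_000, 60_000,
            62_000, 65_000, 70_000, 75_000, 80_000]

# Add one new hire with an extreme, made-up pay figure.
with_outlier = salaries + [50_000_000]

mean_before, mean_after = mean(salaries), mean(with_outlier)
median_before, median_after = median(salaries), median(with_outlier)

print(f"mean:   {mean_before:,.0f} -> {mean_after:,.0f}")
print(f"median: {median_before:,.0f} -> {median_after:,.0f}")
```

With these numbers the mean jumps from 62,500 to roughly 4.6 million, while the median moves only from 61,000 to 62,000, which is still a fair description of what a typical employee earns.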
The mode is the most frequent value. It is the only meaningful “average” for nominal categories (e.g., “most common browser”). For continuous data the sample mode is rarely stable because exact values seldom repeat, but kernel density modes (local maxima of a density estimate) are used to identify multimodal structure.
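For nominal data the mode is just a frequency count. The sketch below uses made-up browser labels and the standard-library `statistics.mode`, `statistics.multimode` (Python 3.8+), and `collections.Counter`; the variable names are illustrative only.

```python
from collections import Counter
from statistics import mode, multimode

# Hypothetical browser labels from a survey: nominal data with no ordering,
# so a mean or median would be meaningless here.
browsers = ["Chrome", "Safari", "Chrome", "Firefox", "Chrome", "Safari", "Edge"]

most_common = mode(browsers)   # the single most frequent label
counts = Counter(browsers)     # full frequency table, useful for reporting

# multimode returns every value tied for the highest count, in order of
# first appearance, which matters when the mode is not unique.
tied = multimode(["A", "B", "A", "B", "C"])

print(most_common, dict(counts), tied)
```

Reporting the count alongside the mode is usually worthwhile, since "most common" can mean 90% of the sample or barely more than the runner-up.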
Breakdown point
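The breakdown point of an estimator is the smallest fraction of the data that, when replaced by arbitrary values, can push the estimate arbitrarily far from its original value. Mean: 1/n, which tends to 0 (a single sufficiently extreme value is enough). Median: 0.50. A trimmed mean that discards a fraction α of the points from each tail has a breakdown point of about α, so it sits in between and is often a practical compromise when heavy tails exist but you still want to use the magnitudes of the central values.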
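The definition can be checked empirically by corrupting a growing number of points and watching when each estimator blows up. This is a rough sketch, not a formal proof: `trimmed_mean` and `contaminate` are hypothetical helpers written for this example, and the value 1e12 merely stands in for "arbitrarily large".

```python
from statistics import mean, median

def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the points from each tail.

    Hypothetical helper for illustration; `proportion` must be in [0, 0.5).
    """
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of points dropped per tail
    return mean(ordered[k:len(ordered) - k])

def contaminate(values, n_bad, bad_value=1e12):
    """Replace the first n_bad entries with an arbitrarily large value."""
    return [bad_value] * n_bad + list(values[n_bad:])

clean = list(range(1, 21))  # 20 well-behaved points: 1..20

print("n_bad  mean  trimmed(10%)  median")
for n_bad in (0, 1, 2, 3, 9, 10):
    data = contaminate(clean, n_bad)
    print(n_bad, mean(data), trimmed_mean(data, 0.10), median(data))
```

On this 20-point sample the mean is ruined by a single bad value, the 10% trimmed mean survives two bad values (10% of the data) and fails at three, and the median stays bounded with nine bad values and only fails once ten of the twenty points (50%) are corrupted.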
Quick decision rule
| Data type | Recommended measure |
|---|---|
| Symmetric, no outliers | Mean |
| Skewed or heavy-tailed | Median |
| Nominal / categorical | Mode |
| Ordinal (e.g., Likert scales) | Median |