Statistics & Probability | Easy | Asked at Amazon, Airbnb, Microsoft

What is the difference between standard error and standard deviation?

For: Data Scientist, Data Analyst, ML Engineer

The short answer

Standard deviation measures the spread of individual observations around their mean. Standard error measures the spread of an estimate, most often the sample mean, around the true value. For the mean, it equals the standard deviation divided by the square root of the sample size, so it shrinks as the sample grows while the standard deviation does not.

How to think about it

The confusion between these two is one of the most common statistical errors in industry. Nail the conceptual distinction, the formula, and where each belongs in a report.

Standard deviation (SD)

SD = sigma = sqrt( Var(X) )

SD describes variability in the data itself. It answers: “How spread out are individual observations?” If you measure the heights of 1,000 people, the SD of that sample tells you how far a typical person’s height deviates from the average. Adding more people to the sample does not materially change the SD — it converges to the population SD, but it does not shrink toward zero.

Standard error (SE)

SE = sigma / sqrt(n) ≈ s / sqrt(n)

where s is the sample SD (used when sigma is unknown).
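
As a minimal sketch of the two formulas above, the snippet below computes s and the plug-in SE from a single sample using only the Python standard library. The function name `sd_and_se` and the height values are made up for illustration.

```python
import math
import statistics


def sd_and_se(sample):
    """Return (sample SD, estimated standard error of the mean)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    s = statistics.stdev(sample)  # sample SD, n - 1 denominator
    se = s / math.sqrt(n)         # s / sqrt(n), the plug-in for sigma / sqrt(n)
    return s, se


heights_cm = [160.0, 170.0, 180.0, 170.0]  # hypothetical data
s, se = sd_and_se(heights_cm)
print(f"SD = {s:.2f} cm, SE = {se:.2f} cm")
```

With n = 4 the SE is exactly half the SD, since sqrt(4) = 2.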

SE describes variability of the sample mean as an estimator. It answers: “If we repeated this study many times, how much would the computed mean bounce around?” For n independent observations with variance sigma^2, Var(X̄) = sigma^2 / n, so the standard deviation of X̄ is sigma / sqrt(n). This holds for any n and does not require normality. The Central Limit Theorem adds that, for large n, X̄ is approximately distributed as N(mu, sigma^2 / n).
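
The "repeat the study many times" idea can be checked with a small simulation. This is a sketch under assumed values (mu = 170, sigma = 10, n = 25, normally distributed data), not a general proof; the function name `simulate_sample_means` is hypothetical.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, repeats, seed=0):
    """Draw `repeats` samples of size n and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repeats)
    ]


mu, sigma, n = 170.0, 10.0, 25  # assumed population values
means = simulate_sample_means(mu, sigma, n, repeats=4000)

empirical_se = statistics.stdev(means)  # spread of the simulated means
theoretical_se = sigma / math.sqrt(n)   # sigma / sqrt(n) = 2.0 here
print(f"empirical: {empirical_se:.3f}, theoretical: {theoretical_se:.3f}")
```

The spread of the simulated means should land close to sigma / sqrt(n), while each individual sample still has an SD near sigma.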

The key relationship

SE = SD / sqrt(n)

Doubling the sample size cuts the SE by a factor of sqrt(2) ≈ 1.41, to about 71% of its previous value. To halve the SE, you need four times as many observations. This is the diminishing-returns cost of precision in statistics.
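
The square-root scaling can be made concrete with a short calculation. The helper names `standard_error` and `n_for_target_se` are illustrative, and the SD of 10 is an assumed value.

```python
import math


def standard_error(sd, n):
    """SE of the mean for a given SD and sample size."""
    return sd / math.sqrt(n)


def n_for_target_se(sd, target_se):
    """Smallest sample size n such that sd / sqrt(n) <= target_se."""
    return math.ceil((sd / target_se) ** 2)


sd = 10.0  # assumed SD
for n in (100, 200, 400):
    print(n, round(standard_error(sd, n), 3))

# Halving the target SE quadruples the required sample size.
print(n_for_target_se(sd, 1.0), n_for_target_se(sd, 0.5))
```

Going from n = 100 to n = 400 halves the SE, whereas n = 200 only reduces it by the factor sqrt(2).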

When to use each

Quantity	When to use
Standard deviation	Describe spread in a dataset or population
Standard error	Describe precision of an estimate (mean, proportion, coefficient)
SE in confidence intervals	`X_bar ± z * SE` or `X_bar ± t * SE`
SD in error bars on raw data	Show data variability, not estimation uncertainty
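
To illustrate the confidence-interval row of the table, here is a sketch of `X_bar ± z * SE` using the standard library's `NormalDist` for the z quantile. The function name `mean_confidence_interval` and the data are assumptions for illustration. For small samples a t quantile is generally more appropriate; the standard library does not provide one, so this sketch uses z only.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) for X_bar ± z * SE, with SE = s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided z quantile, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return x_bar - z * se, x_bar + z * se


heights_cm = [160.0, 170.0, 180.0, 170.0]  # hypothetical data
lower, upper = mean_confidence_interval(heights_cm)
print(f"95% CI for the mean: ({lower:.1f}, {upper:.1f}) cm")
```

Note that the interval is built from the SE, not the SD: it describes uncertainty about the mean, not the range of individual heights.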

Common mistake in reporting

Error bars in plots are ambiguous unless labeled. Always state whether they represent ±1 SD (data spread) or ±1 SE (estimation precision). SE-based error bars are narrower than SD-based ones by a factor of sqrt(n), so they give a false impression of tightly clustered data if readers assume they show SD.
