datarekha
Statistics & Probability · Easy · Asked at Google, Microsoft, Stripe

What is the difference between variance and standard deviation, and why do we need both?

The short answer

Variance is the average squared deviation from the mean; standard deviation is its square root and is expressed in the same units as the data. Variance is the mathematically convenient form (variances of independent variables add), while standard deviation is the interpretable one: a typical distance from the mean.

How to think about it

Lead with the formulas, then explain why each form exists. Interviewers want to see that you understand the unit-squaring trade-off and the additivity property.

Definitions

For a population of N values with mean μ:

Var(X) = (1/N) * sum((x_i - μ)^2)

SD(X) = sqrt(Var(X))

For a sample of n values, replace μ with the sample mean x̄ and N with n - 1 (Bessel’s correction) to get an unbiased estimate of the population variance.
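
As a rough illustration of these definitions, the sketch below computes both versions by hand on a small made-up data set (the values are hypothetical, chosen only so the arithmetic comes out cleanly) and compares them with Python's standard `statistics` module.

```python
import math
import statistics

# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n                              # 40 / 8 = 5.0
squared_devs = [(x - mean) ** 2 for x in data]    # these sum to 32

pop_var = sum(squared_devs) / n                   # divide by N
sample_var = sum(squared_devs) / (n - 1)          # divide by n - 1 (Bessel's correction)
pop_sd = math.sqrt(pop_var)
sample_sd = math.sqrt(sample_var)

print(pop_var, pop_sd)          # 4.0 2.0
print(sample_var, sample_sd)    # roughly 4.57 and 2.14

# The standard library exposes the same four quantities.
print(statistics.pvariance(data), statistics.pstdev(data))
print(statistics.variance(data), statistics.stdev(data))
```

The sample versions are always a little larger than the population versions on the same data, because the same sum of squares is divided by a smaller number.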

Why square the deviations?

Squaring does two things: it makes every deviation non-negative, so positive and negative deviations cannot cancel, and it penalises large deviations disproportionately, which aligns with many loss functions (least squares, Gaussian likelihood). Absolute deviations also remove the sign, but they penalise linearly and are not differentiable at zero. With squaring, doubling a deviation quadruples its penalty rather than doubling it.
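
A small sketch of that difference in penalty growth; the function names `squared_penalty` and `absolute_penalty` are made up for this example.

```python
def squared_penalty(deviation):
    """Penalty used by variance: the deviation squared."""
    return deviation ** 2


def absolute_penalty(deviation):
    """Penalty used by mean absolute deviation: the deviation's magnitude."""
    return abs(deviation)


# Each deviation is double the previous one; negative values are penalised
# the same as positive ones under both schemes.
for d in [-1, 2, -4, 8]:
    print(d, absolute_penalty(d), squared_penalty(d))
# absolute penalties: 1, 2, 4, 8   (double each time)
# squared penalties:  1, 4, 16, 64 (quadruple each time)
```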

Why take the square root?

If your data are in dollars, variance is in dollars-squared, a unit that is hard to interpret. Standard deviation restores the original unit and has a direct meaning: for roughly bell-shaped data, about 68% of observations fall within one standard deviation of the mean and about 95% within two.
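
The sketch below checks that rule of thumb on simulated data. It assumes normally distributed dollar amounts with a made-up mean of 100 and standard deviation of 15, and uses a fixed seed so the run is repeatable; real data will only follow the rule to the extent that it is bell-shaped.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Hypothetical dollar amounts drawn from a normal distribution.
amounts = [rng.gauss(100.0, 15.0) for _ in range(50_000)]

mean = statistics.fmean(amounts)
sd = statistics.pstdev(amounts)  # in dollars, same unit as the data


def share_within(k):
    """Fraction of observations within k standard deviations of the mean."""
    return sum(abs(x - mean) <= k * sd for x in amounts) / len(amounts)


within_1 = share_within(1)
within_2 = share_within(2)
print(f"mean = {mean:.2f} dollars, sd = {sd:.2f} dollars")
print(f"within 1 sd: {within_1:.3f}, within 2 sd: {within_2:.3f}")
```

The two shares should land close to 0.68 and 0.95, with small differences due to sampling noise.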

The additive property — why variance matters in theory

For independent random variables X and Y:

Var(X + Y) = Var(X) + Var(Y)

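This does not hold for standard deviations: for independent X and Y, SD(X + Y) = sqrt(SD(X)^2 + SD(Y)^2), which is smaller than SD(X) + SD(Y) unless one of them is zero. That additivity is what makes variance the natural quantity in probability theory, error propagation, and portfolio analysis. Standard deviation is for communication; variance is for computation.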
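
One way to see the additivity concretely is with two independent fair dice, where every outcome can be enumerated exactly. This is a minimal sketch; the `variance` helper is written for this example and assumes equally likely outcomes.

```python
import math
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # one fair six-sided die


def variance(outcomes):
    """Population variance of equally likely outcomes, as an exact fraction."""
    outcomes = list(outcomes)
    n = len(outcomes)
    mean = Fraction(sum(outcomes), n)
    return sum((x - mean) ** 2 for x in outcomes) / n


var_one = variance(faces)                                        # 35/12
var_sum = variance(a + b for a, b in product(faces, repeat=2))   # 35/6

sd_one = math.sqrt(var_one)
sd_sum = math.sqrt(var_sum)

print(var_one, var_sum)      # 35/12 35/6, so the variances add
print(sd_one + sd_one)       # about 3.42
print(sd_sum)                # about 2.42, so the standard deviations do not add
```

The standard deviation of the sum is sqrt(2) times that of a single die, not 2 times, which is the square-root rule applied to two equal variances.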

Population vs sample

The n - 1 denominator in sample variance corrects for the fact that sample deviations are measured from the sample mean rather than the true mean. The sample mean is the point that minimises the sum of squared deviations, so those deviations are slightly too small on average. This is Bessel’s correction. Many tools default to n - 1 (for example R’s var and pandas’ var), but numpy’s np.var defaults to n; specify ddof=1 for the sample version.
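
A short sketch of the numpy behaviour, on made-up data. The `ddof` argument ("delta degrees of freedom") is subtracted from n in the denominator, and `np.std` accepts the same argument.

```python
import numpy as np

# Hypothetical data, chosen only for illustration.
data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

pop_var = np.var(data)             # default ddof=0: divides by n
sample_var = np.var(data, ddof=1)  # divides by n - 1 (Bessel's correction)

pop_sd = np.std(data)              # square root of pop_var
sample_sd = np.std(data, ddof=1)   # square root of sample_var

print(pop_var, sample_var)   # 4.0 and roughly 4.57
print(pop_sd, sample_sd)     # 2.0 and roughly 2.14
```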
