Explain the normal distribution and the 68-95-99.7 empirical rule.
The normal distribution is a symmetric, bell-shaped probability distribution completely described by its mean and standard deviation. The empirical rule states that approximately 68%, 95%, and 99.7% of observations fall within one, two, and three standard deviations of the mean respectively — a direct consequence of integrating the Gaussian density over those intervals.
How to think about it
Explain the shape, give the rule, then tell the interviewer when it’s useful and when it breaks down — that last part separates a prepared candidate from one who memorised a formula.
The Gaussian density
A random variable X is normally distributed with mean μ and standard deviation σ if its probability density is:
f(x) = (1 / (σ √(2π))) · exp( -(x - μ)² / (2σ²) )
Two parameters completely determine its shape. Mean controls location; standard deviation controls spread. The distribution is symmetric about μ, so mean = median = mode.
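As a minimal sketch of the density above, the function below evaluates f(x) directly using only the standard library. The name `normal_pdf` and the sample points are illustrative choices, not part of any particular library.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian density f(x) for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in units of sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# The peak sits at the mean and equals 1 / (sigma * sqrt(2*pi)).
peak = normal_pdf(0.0)

# Symmetry about the mean: points equally far on either side have equal density.
left = normal_pdf(-1.5)
right = normal_pdf(1.5)

# Doubling sigma halves the peak height (same area, wider spread).
wide_peak = normal_pdf(0.0, mu=0.0, sigma=2.0)
```

Here `peak` is about 0.399 for the standard normal, and `left` equals `right`, which is the symmetry that makes mean, median, and mode coincide.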
The 68-95-99.7 rule
| Interval | Probability |
|---|---|
| μ ± 1σ | ≈ 68.3% |
| μ ± 2σ | ≈ 95.4% |
| μ ± 3σ | ≈ 99.7% |
These come from integrating the standard normal density over each interval, or equivalently from differencing the CDF Φ: P(-1 < Z < 1) = Φ(1) - Φ(-1) ≈ 0.683.
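The table values can be reproduced in a few lines. This sketch relies on the identity P(|Z| < k) = erf(k / √2) for a standard normal Z; the helper name `prob_within` is an illustrative choice.

```python
import math

def prob_within(k):
    """P(|Z| < k) for a standard normal Z, via the error function."""
    return math.erf(k / math.sqrt(2.0))

# Probability mass within 1, 2, and 3 standard deviations of the mean.
coverage = {k: prob_within(k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} sigma: {p:.2%}")
```

The three results round to the 68.3%, 95.4%, and 99.7% shown in the table.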
Practical uses
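A data point 2σ above the mean is in roughly the top 2.3% of the distribution, which makes z-scores a natural basis for anomaly detection thresholds. In manufacturing, ±3σ bounds correspond to about 0.27% of output falling outside the limits. The Six Sigma figure of 3.4 defects per million does not come from a pure ±6σ interval, which would leave only about 0.002 defects per million; it assumes the process mean can drift by 1.5σ, so the nearer limit sits 4.5σ away.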
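The tail figures used in practice can be checked with the complementary error function, which stays accurate far out in the tail. This is a sketch under the assumption of an exactly normal process; the helper `upper_tail` and the variable names are illustrative, and the 4.5σ line reflects the conventional Six Sigma allowance of a 1.5σ drift in the process mean.

```python
import math

def upper_tail(k):
    """P(Z > k) for a standard normal Z, using erfc for tail accuracy."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

# One-sided: share of observations more than 2 sigma above the mean.
top_share_2sigma = upper_tail(2.0)

# Two-sided: share of observations outside mu +/- 3 sigma.
outside_3sigma = 2.0 * upper_tail(3.0)

# Pure +/- 6 sigma interval, expressed per million.
outside_6sigma_ppm = 2.0 * upper_tail(6.0) * 1e6

# One-sided 4.5 sigma tail (6 sigma minus an assumed 1.5 sigma drift), per million.
shifted_ppm = upper_tail(4.5) * 1e6

print(f"top share at 2 sigma:  {top_share_2sigma:.4f}")
print(f"outside 3 sigma:       {outside_3sigma:.4f}")
print(f"outside 6 sigma (ppm): {outside_6sigma_ppm:.4f}")
print(f"4.5 sigma tail (ppm):  {shifted_ppm:.2f}")
```

The 4.5σ one-sided tail comes out at about 3.4 per million, while the unshifted ±6σ interval leaves only about 0.002 per million.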
When normality fails
The rule applies to the normal distribution. Heavy-tailed distributions (financial returns, internet traffic) place far more probability mass beyond ±3σ than the 0.3% the rule implies. Blindly applying the rule to non-normal data underestimates tail risk.
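To put a number on the underestimate, the sketch below compares the mass beyond ±3σ for a normal distribution with that of a Laplace distribution scaled to the same standard deviation. The Laplace is an illustrative stand-in chosen because its tail has a closed form; it is only moderately heavier-tailed than the normal, and real financial or traffic data may be considerably more extreme.

```python
import math

def normal_tail_beyond(k):
    """P(|X - mu| > k*sigma) for a normal distribution."""
    return math.erfc(k / math.sqrt(2.0))

def laplace_tail_beyond(k):
    """P(|X - mu| > k*sigma) for a Laplace distribution.

    A Laplace with scale b has sigma = b * sqrt(2) and
    P(|X - mu| > t) = exp(-t / b), so t = k*sigma gives exp(-k * sqrt(2)).
    """
    return math.exp(-k * math.sqrt(2.0))

k = 3
normal_tail = normal_tail_beyond(k)    # about 0.0027
laplace_tail = laplace_tail_beyond(k)  # about 0.014
ratio = laplace_tail / normal_tail

print(f"normal  beyond {k} sigma: {normal_tail:.4f}")
print(f"laplace beyond {k} sigma: {laplace_tail:.4f}")
print(f"ratio: {ratio:.1f}x")
```

Even for this mild case, events beyond ±3σ are roughly five times more frequent than the empirical rule suggests.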