Normal & Standard Normal
The bell curve and its standardized twin. Standardize with z, read a Phi table, and the 68-95-99.7 rule does the rest — the workhorse behind the CLT and z-tests.
What you'll learn
- N(mu, sigma squared): the bell curve, symmetric about its mean mu
- Standardization Z = (X - mu)/sigma turns any normal into the standard normal N(0,1)
- Reading a Phi table, and the symmetry Phi(-z) = 1 - Phi(z)
- The 68-95-99.7 rule for values within 1, 2, and 3 standard deviations
Before you start
Heights of people. Errors in a measurement. Scores on a long test. Stand back and squint at any of them and you’ll see the same shape: a hump in the middle, thinning smoothly on both sides. That hump is the normal distribution — the bell curve — and it shows up so reliably that it underpins z-tests, the Central Limit Theorem, and most of the machine learning you’ll meet later.
You will never integrate the bell curve by hand on the exam. The whole trick is to standardize any normal back to one fixed reference curve, then look up the answer in a Phi table. Master that single move and this entire topic becomes plug-and-play.
N(mu, sigma squared) and the standard normal
A normal is fixed by two numbers: its mean mu (where the peak sits) and its
variance sigma² (how wide it spreads). The special case mu = 0, sigma = 1 is
the standard normal Z ~ N(0, 1).
To convert any normal to the standard one, subtract the mean and divide by the standard deviation:
Z = (X − mu) / sigma
Now Z is a standard normal, and a Z value (a z-score) just says “how many
standard deviations from the mean.” Everything reduces to questions about Z.
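If it helps to see the move as code, here is a minimal sketch of standardization in Python. The function name `standardize` and the sample numbers are illustrative assumptions, not part of the lesson.

```python
def standardize(x: float, mu: float, sigma: float) -> float:
    """Return the z-score of x under N(mu, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Subtract the mean, then divide by the standard deviation.
    return (x - mu) / sigma


# Made-up numbers for illustration: mean 50, standard deviation 10.
z = standardize(60, 50, 10)
print(z)  # 1.0, i.e. one standard deviation above the mean
```

Note that the second argument to the formula is sigma, not the variance sigma²; mixing the two up is the usual slip.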
Reading a Phi table
Phi(z) is the cumulative standard-normal function: the area to the left of
z, i.e. Phi(z) = P(Z ≤ z). A Phi table lists these areas. To find P(X ≤ a):
standardize a to a z-score, then read Phi(z) off the table.
Two values worth memorizing: Phi(1) ≈ 0.8413 and Phi(2) ≈ 0.9772.
Because the bell curve is symmetric about 0, the left tail mirrors the right:
Phi(-z) = 1 - Phi(z)
So Phi(−1) = 1 − 0.8413 = 0.1587. Tables usually only print positive z, so this
identity is how you handle negatives.
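Away from the exam, you can reproduce table values yourself. The sketch below assumes the standard identity Phi(z) = (1 + erf(z / sqrt(2))) / 2 and uses `math.erf` from Python's standard library; the helper name `phi` is an illustrative choice.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF, P(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(round(phi(1), 4))   # 0.8413
print(round(phi(2), 4))   # 0.9772

# Symmetry: the left tail at -1 equals one minus the area up to +1.
print(round(phi(-1), 4))       # 0.1587
print(round(1 - phi(1), 4))    # 0.1587
```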
Open the Normal tab below. Drag the two handles to bracket an interval and
the shaded area is the probability — that area is exactly the Phi(z_b) − Phi(z_a)
you would otherwise look up in a table. Slide the mean and standard deviation
to see how the same bell shape sits at any location with any width: that is
why one standardized reference curve is enough for all of them.
How GATE asks this
Reliably a NAT (numerical answer type) or MCQ: a normal with given mu and sigma, a target value or
interval, and Phi values supplied (or expected from memory). You standardize, look
up Phi, and combine. Interval problems use P(a ≤ X ≤ b) = Phi(z_b) − Phi(z_a). This
appears in essentially every paper, sometimes wrapped inside a CLT question.
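The standardize, look up, combine recipe for an interval can be sketched in a few lines of Python. The function names and the numbers (mean 100, standard deviation 15) are made up for illustration, and `phi` is computed from `math.erf` rather than read from a table.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_interval_prob(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a <= X <= b) for X ~ N(mu, sigma^2), via Phi(z_b) - Phi(z_a)."""
    z_a = (a - mu) / sigma  # standardize the lower end
    z_b = (b - mu) / sigma  # standardize the upper end
    return phi(z_b) - phi(z_a)


# Hypothetical question: X ~ N(100, 15^2), find P(85 <= X <= 130).
# Here z_a = -1 and z_b = 2, so the answer is Phi(2) - Phi(-1).
print(round(normal_interval_prob(85, 130, 100, 15), 4))  # 0.8186
```

With table values you would get the same thing by hand: 0.9772 − (1 − 0.8413) = 0.8185, which differs only in the last digit because the table entries are rounded.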
Worked example
Let X ~ N(50, 10²), so mu = 50 and sigma = 10. Find P(X ≤ 60) and P(40 ≤ X ≤ 60). Use Phi(1) ≈ 0.8413.
Step 1 — standardize the upper bound.
z = (60 - 50) / 10 = 1
P(X <= 60) = Phi(1) = 0.8413
Step 2 — the interval. Standardize both ends, then subtract:
z_low = (40 - 50)/10 = -1
z_high = (60 - 50)/10 = +1
P(40 <= X <= 60) = Phi(1) - Phi(-1)
= 0.8413 - (1 - 0.8413)
= 0.8413 - 0.1587
= 0.6826
So P(X ≤ 60) ≈ **0.8413** and P(40 ≤ X ≤ 60) ≈ **0.6826**. That second answer is
exactly the 68% band — 40 and 60 sit one standard deviation either side of the
mean, so the 68-95-99.7 rule predicts it without a single lookup.
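As a sanity check on the 68-95-99.7 rule, this short sketch computes the area within k standard deviations of the mean for k = 1, 2, 3. The helper names are illustrative, and `phi` is again built on `math.erf` instead of a table.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def within_k_sigma(k: float) -> float:
    """P(-k <= Z <= k): the area within k standard deviations of the mean."""
    return phi(k) - phi(-k)


for k in (1, 2, 3):
    print(k, round(within_k_sigma(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The hand calculation above gives 0.6826 rather than 0.6827 only because Phi(1) was rounded to four decimals before subtracting.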
Quick check
Practice this in an interview
The normal distribution is a symmetric, bell-shaped probability distribution completely described by its mean and standard deviation. The empirical rule states that approximately 68%, 95%, and 99.7% of observations fall within one, two, and three standard deviations of the mean respectively, a direct consequence of integrating the Gaussian density over those intervals.
The Normal distribution is justified by the Central Limit Theorem: averages of large i.i.d. samples with finite variance converge to Normal regardless of the underlying distribution. It is fully characterized by mean and variance, enabling closed-form inference. It fails for heavy-tailed data, skewed outcomes, bounded quantities, and rare extreme events.
The CLT states that the sampling distribution of the sample mean converges to a normal distribution as sample size grows, regardless of the shape of the underlying population distribution, provided the observations are independent and identically distributed with finite variance. It is the theoretical foundation for confidence intervals, hypothesis tests, and many machine-learning approximations, but it applies to the distribution of the mean, not to the raw data.
Each distribution has a natural generative story: Bernoulli is a single coin flip; Binomial sums Bernoullis; Poisson counts rare arrivals; Normal emerges from sums of many small effects; Exponential models waiting times between Poisson events; Uniform assigns equal probability across a range. Choosing correctly comes from matching that story to the data-generating process.