What does the Central Limit Theorem actually say, and why does it matter?

The CLT states that the sampling distribution of the sample mean approaches a normal distribution as sample size grows, regardless of the shape of the underlying population distribution, provided that population has finite variance. It is the theoretical foundation for confidence intervals, hypothesis tests, and many machine-learning approximations, but it applies to the distribution of the mean, not to the raw data.

What makes the Normal distribution so central in statistics, and when does it fail?

The Normal distribution's central role is justified by the Central Limit Theorem: averages of large i.i.d. samples with finite variance are approximately Normal regardless of the underlying distribution. It is fully characterized by its mean and variance, enabling closed-form inference. It fails for heavy-tailed data, skewed outcomes, bounded quantities, and rare extreme events.

What is the Law of Large Numbers and how does it differ from the Central Limit Theorem?

The Law of Large Numbers (LLN) says the sample mean converges to the true mean as sample size grows — it is a statement about where the mean lands. The Central Limit Theorem says the sampling distribution of the mean is approximately normal — it is a statement about the shape of that distribution. LLN guarantees convergence; CLT characterises the rate and shape of that convergence.

What is the correct interpretation of a 95% confidence interval?

A 95% confidence interval means that if you repeated the sampling procedure many times and built an interval each time, 95% of those intervals would contain the true parameter. It does not mean there is a 95% probability that this specific interval contains the parameter.

Central Limit Theorem & Confidence Intervals — GATE DA

The strangest fact in probability, and the answer to where the bell keeps coming from: average enough samples from any population, however lopsided, and the averages pile into a Normal. That single fact powers GATE's NAT questions on sums, proportions, and confidence intervals.

9 min read · Intermediate · GATE DA · Lesson 16 of 122

What you'll learn

The sample mean of n iid draws has mean μ, variance σ²/n, so SD = σ/√n

The CLT: for large n the sample mean is approximately Normal, whatever the population

Standardising a sample mean or sum and reading Φ

A confidence interval for a known-σ mean: x̄ ± z·(σ/√n), z = 1.96 for 95%

Here is one of the strangest facts in all of probability, and the answer to the question the last few lessons kept circling. Take any population you like (wildly skewed, lumpy, lopsided; as long as its variance is finite, the shape does not matter), draw a sample and compute its average. Do that a few thousand times and collect the averages. With a large enough sample size, their histogram pulls itself into a clean bell. The underlying data can look like almost anything; the very act of averaging manufactures the normal curve.

The sample mean: centre μ, spread σ²/n

Take n independent draws X₁, …, Xₙ from a population with mean μ and variance σ². Their average is the sample mean x̄ = (X₁ + … + Xₙ)/n. Two facts hold for any population, even before the bell appears:

The mean of x̄ is μ — averaging does not move the centre.
The variance of x̄ is σ²/n — averaging shrinks the spread, so the standard deviation of the sample mean is σ/√n, called the standard error.

More data buys a tighter estimate, and it tightens like √n, not like n: to halve the standard error you need four times the data.
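The two facts above can be checked numerically. The sketch below is illustrative only: the helper names `standard_error` and `simulated_sd_of_means` are made up for this example, and the Uniform(0, 1) population is an arbitrary choice. It compares the formula σ/√n with the spread of simulated sample means, using sample sizes that quadruple each time.

```python
import math
import random
import statistics


def standard_error(sigma, n):
    """Standard deviation of the sample mean of n iid draws: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def simulated_sd_of_means(n, trials=10_000, seed=0):
    """Empirical SD of the sample mean, using Uniform(0, 1) draws (assumed population)."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)]
    return statistics.pstdev(means)


sigma_uniform = math.sqrt(1 / 12)  # SD of a Uniform(0, 1) population

# Each step uses four times the data, so the standard error should halve.
for n in (4, 16, 64):
    formula = standard_error(sigma_uniform, n)
    measured = simulated_sd_of_means(n)
    print(f"n={n:3d}  sigma/sqrt(n)={formula:.4f}  simulated SD of means={measured:.4f}")
```

Each row uses four times the data of the row before it, so the formula column halves from one row to the next, and the simulated column should follow it closely.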

The Central Limit Theorem

The CLT in one picture: the population can be any shape; the sample mean tends to Normal as n grows.

The Central Limit Theorem says it precisely. For independent draws with mean μ and finite variance σ², as n grows the sample mean is approximately

x̄  ≈  Normal( μ , σ²/n )        equivalently       sum  ≈  Normal( nμ , nσ² )

regardless of the population’s own shape. Once you accept that, every such question becomes the move from the normal lesson — standardise, then read Φ:

        value − mean        x̄ − μ              sum − nμ
z  =  ───────────────  =  ─────────   =     ──────────────
         std. dev            σ/√n              √(nσ²)

A rule of thumb: n ≥ 30 is “large enough” for most GATE problems, though strongly skewed populations converge more slowly. Pick a lopsided base population in the simulator below (the exponential, the bimodal) and raise n: watch the histogram of sample means straighten into a bell whose width tracks σ/√n, narrowing as n climbs.
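For readers without access to the interactive version, here is a minimal offline sketch of the same experiment. It assumes an Exponential(λ = 1) population (so μ = σ = 1); the function names `sample_means` and `fraction_within` are hypothetical. For each n it reports the mean and SD of the simulated sample means alongside σ/√n, plus the share of means falling within 1.96 standard errors of μ, which would be about 0.95 for an exact Normal.

```python
import math
import random
import statistics


def sample_means(n, trials=5_000, lam=1.0, seed=1):
    """Draw `trials` samples of size n from Exponential(lam); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(lam) for _ in range(n))
        for _ in range(trials)
    ]


def fraction_within(means, mu, sigma, n, z=1.96):
    """Share of sample means lying within z standard errors of mu."""
    se = sigma / math.sqrt(n)
    return sum(abs(m - mu) <= z * se for m in means) / len(means)


mu, sigma = 1.0, 1.0  # mean and SD of an Exponential(1) population

for n in (2, 5, 30, 100):
    means = sample_means(n)
    print(
        f"n={n:3d}  mean of means={statistics.fmean(means):.3f}  "
        f"SD of means={statistics.pstdev(means):.3f}  "
        f"sigma/sqrt(n)={sigma / math.sqrt(n):.3f}  "
        f"within 1.96 SE={fraction_within(means, mu, sigma, n):.3f}"
    )
```

The SD of means should track σ/√n at every n, because that fact holds for any population. The Normal shape, reflected in the last column, only settles in as n grows.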

Average enough samples and you always get a bell

[Interactive simulator: pick a base population, however lopsided, set a sample size n, then draw samples one at a time or 1000 at once and watch their means pile up under a Normal overlay with mean μ and standard deviation σ/√n. Default shown: exponential population with λ = 1 (heavy right tail, strongly skewed; μ = 1.00, σ = 1.00) and n = 5, giving σ/√n = 0.447. Readouts: population μ, population σ, mean of means, SD of means, σ/√n, number of samples.]

As n rises, the bell narrows — its width is σ/√n, which shrinks like 1/√n. The SD of means you measure should track it.

Confidence intervals (known σ)

The CLT also says how much to trust an estimate. Since x̄ is approximately Normal(μ, σ²/n), it lands within z·(σ/√n) of the true mean μ with a probability fixed by z, so a band of that half-width around x̄ captures μ with the same probability. With the population σ known, a confidence interval for the mean is

x̄  ±  z · (σ/√n)

The multiplier z comes from the standard normal and fixes the confidence level: z = 1.645 for 90%, 1.96 for 95%, 2.576 for 99%. Higher confidence needs a bigger z, so the interval gets wider (being more sure costs precision), and its width shrinks as n grows, again like 1/√n.
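The interval formula and its multipliers can be sketched in a few lines. Everything below is illustrative rather than part of the lesson: the function names are hypothetical, and the inputs (x̄ = 50, σ = 8, n = 64) are made up. The z values come from the standard library's `statistics.NormalDist`, and the `coverage` helper repeats the sampling procedure many times to count how often the interval contains the true mean, which is the repeated-sampling reading of "95% confidence".

```python
import math
import random
import statistics
from statistics import NormalDist


def z_multiplier(confidence):
    """Two-sided standard-normal multiplier, e.g. about 1.96 for confidence=0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def known_sigma_interval(xbar, sigma, n, confidence=0.95):
    """Interval xbar ± z * sigma / sqrt(n) for a mean with known population sigma."""
    margin = z_multiplier(confidence) * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


def coverage(mu, sigma, n, confidence=0.95, trials=5_000, seed=2):
    """Fraction of repeated intervals that contain mu (Normal population assumed)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        low, high = known_sigma_interval(xbar, sigma, n, confidence)
        hits += low <= mu <= high
    return hits / trials


# Hypothetical inputs: xbar = 50, sigma = 8, n = 64, so sigma / sqrt(n) = 1.
for level in (0.90, 0.95, 0.99):
    low, high = known_sigma_interval(50.0, 8.0, 64, level)
    print(f"{level:.0%}: z={z_multiplier(level):.3f}  interval=({low:.3f}, {high:.3f})")

print("simulated coverage at 95%:", coverage(mu=10.0, sigma=2.0, n=25))
```

The interval widens as the confidence level rises, and the simulated coverage should sit near the nominal level. Any single interval either contains μ or it does not; the percentage describes the procedure.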

A worked example — a real GATE DA 2025 question

Let Y be the sum of 300 independent Bernoulli(0.25) variables. Using the normal approximation, find P(60 ≤ Y ≤ 90). Use Φ(2) ≈ 0.9772.

First the mean and variance of the sum — n copies of a Bernoulli(p), whose mean is p and variance p(1−p):

mean      = 300 · 0.25         = 75
variance  = 300 · 0.25 · 0.75  = 56.25
std. dev  = √56.25             = 7.5

Now standardise both ends and read Φ, using the symmetry Φ(−2) = 1 − Φ(2):

z_low  = (60 − 75)/7.5 = −2          z_high = (90 − 75)/7.5 = +2

P(60 ≤ Y ≤ 90) = Φ(2) − Φ(−2)
               = 0.9772 − (1 − 0.9772)
               = 0.9772 − 0.0228 = 0.9544

So P(60 ≤ Y ≤ 90) ≈ 0.9544. A single Bernoulli is nothing like a Normal (it is only 0 or 1), yet the CLT makes the sum of 300 of them approximately Normal, which is the whole reason standardise-and-read-Φ works here.
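The arithmetic above can be reproduced with the standard library. As an extra check that is not part of the original question, this sketch also sums the exact Binomial(300, 0.25) probabilities over 60 to 90, to show how close the normal approximation comes. The variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 300, 0.25

# Mean and standard deviation of a sum of n independent Bernoulli(p) variables.
mean = n * p
sd = math.sqrt(n * p * (1 - p))

# Normal approximation, as in the worked example: Phi(z_high) - Phi(z_low).
z_low = (60 - mean) / sd
z_high = (90 - mean) / sd
approx = NormalDist().cdf(z_high) - NormalDist().cdf(z_low)

# Exact binomial probability P(60 <= Y <= 90), added here for comparison only.
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(60, 91))

print(f"mean={mean}, sd={sd}, z_low={z_low}, z_high={z_high}")
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial sum:   {exact:.4f}")
```

The two results are close but not identical: the calculation in the worked example applies no continuity correction, so a small gap from the exact sum is expected.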

A question to carry forward

The confidence interval used the sample to estimate the mean. But often you do not want an estimate — you want a decision: has the new process really changed the average, or is the wobble just sampling noise? Here is the thread onward: how do you turn a sample into a verdict, and bound the chance of getting that verdict wrong?

In one breath

Sample mean x̄ of n iid draws: mean μ, variance σ²/n, so standard error = σ/√n (variance ÷ n, SD ÷ √n).
CLT: for large n (rule of thumb n ≥ 30), x̄ ≈ Normal(μ, σ²/n) and the sum ≈ Normal(nμ, nσ²) — whatever the population shape.
Then it is the normal lesson: standardise z = (x̄ − μ)/(σ/√n) and read Φ (2025: sum of 300 Bernoulli → Φ(2) − Φ(−2) ≈ 0.9544).
Confidence interval (known σ): x̄ ± z·(σ/√n) — z = 1.645 (90%), 1.96 (95%), 2.576 (99%); higher confidence → wider.
Precision tightens like √n: 4× the data halves the standard error.

Practice

Quick check

Q1. Recall: the CLT says that, as n grows, the sample mean is approximately Normal…

Q2. Trace: a population has mean 50 and SD 12. For a sample of n = 36, what is the standard error of x̄? (1 decimal; numerical answer)

Q3. Trace: weekly demand has mean 500 and SD 60. Over n = 16 weeks the sample mean is treated as Normal. Using Φ(1) ≈ 0.8413, find P(x̄ > 515). (3 decimals; numerical answer)

Q4. Apply: a 95% confidence interval uses z = 1.96, with x̄ = 100, σ = 20, n = 100. Find the margin of error z·σ/√n. (2 decimals; numerical answer)

Q5. Apply: which statements about the CLT and confidence intervals are TRUE? (select all that apply)

Q6. Create: Y is the sum of 300 Bernoulli(0.25). Find its mean and SD, then explain why Y can be treated as Normal.
