datarekha

Central Limit Theorem & Confidence Intervals

Average enough independent samples and the result is approximately Normal, whatever the original shape (as long as its variance is finite). That single fact powers GATE's NAT questions on sums, proportions, and confidence intervals.

9 min read Intermediate GATE DA Lesson 16 of 122

What you'll learn

  • The sample mean of n iid variables has mean μ and variance σ²/n, so its SD is σ/√n
  • The Central Limit Theorem: for large n the sample mean (or sum) is approximately Normal, whatever the population shape
  • Standardising a sample mean or sum and reading the answer off the Φ table
  • A confidence interval for a known-σ mean: x̄ ± z·(σ/√n), with z = 1.96 for 95%

Here is one of the strangest facts in all of probability. Take almost any population (skewed, discrete, lopsided; all it needs is a finite variance), draw a sample, and compute the average. Repeat that a few thousand times. Once the sample is reasonably large, the histogram of those averages pulls itself into a clean bell shape. The underlying data can look like anything; the act of averaging is what manufactures the normal curve. That fact is the Central Limit Theorem, and it is the engine behind every GATE question that hands you a sum or a sample mean and expects you to standardise and read Φ, as in the previous lesson.

The sample mean: mean μ, variance σ²/n

Take n independent draws X₁, …, Xₙ from a population with mean μ and variance σ². Their average is the sample mean x̄ = (X₁ + … + Xₙ) / n. Two facts hold for any population, even before the CLT:

  • Mean of x̄ is μ — averaging does not shift the centre.
  • Variance of x̄ is σ²/n — averaging shrinks the spread. So the standard deviation of the sample mean is σ/√n (the “standard error”).

More data gives a tighter estimate, and it tightens like √n, not like n. To halve the standard error you need four times the data.
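The σ/√n rule can be checked numerically. The sketch below is only an illustration: the helper names `standard_error` and `simulated_sd_of_mean` are made up for this example, and the Uniform(0, 1) population (whose SD is √(1/12)) is an arbitrary choice, not something the lesson specifies.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Theoretical SD of the sample mean of n iid draws: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def simulated_sd_of_mean(n, trials=10000, seed=0):
    """Empirical SD of the sample mean of n draws from Uniform(0, 1)."""
    rng = random.Random(seed)
    means = [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]
    return statistics.pstdev(means)

# Four times the data halves the standard error.
print(standard_error(10, 25))   # 2.0
print(standard_error(10, 100))  # 1.0

# Compare simulation with theory for an assumed Uniform(0, 1) population.
sigma_uniform = math.sqrt(1 / 12)
for n in (4, 16, 64):
    print(n, round(simulated_sd_of_mean(n), 4), round(standard_error(sigma_uniform, n), 4))
```

The two printed columns should agree closely for each n, and each fourfold increase in n should roughly halve both.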

The Central Limit Theorem

[Figure: distribution of the sample mean as n grows. Panels: n = 1 (skewed), n = 5, n = 30 (bell). Same lopsided population; averaging more draws pulls the shape toward a Normal.]
The CLT in one picture: the population can be any shape; the sample mean tends to Normal as n grows.

Central Limit Theorem. For independent, identically distributed X₁, …, Xₙ with mean μ and finite variance σ², as n grows the sample mean is approximately

x̄  ≈  Normal( μ ,  σ²/n )        equivalently       sum  ≈  Normal( nμ , nσ² )

regardless of the population’s original distribution. Once you accept that, every question becomes the same move you learned for the Normal: standardise, then read Φ.

                 value − mean              x̄ − μ                  sum − nμ
        z  =  ────────────────      =   ───────────       =     ────────────
                  std. dev                 σ/√n                    √(nσ²)

where Φ(z) = P(Z ≤ z) is the standard normal CDF, the quantity the Φ table lists. A rule of thumb: n ≥ 30 is “large enough” for the approximation in most GATE problems.
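As a minimal sketch of the standardise-then-read-Φ move, the snippet below uses Python's `statistics.NormalDist` in place of a printed Φ table. The function names and the numbers (μ = 10, σ = 4, n = 64) are hypothetical, chosen only so the arithmetic is clean.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF, stands in for the Φ table

def z_for_mean(xbar, mu, sigma, n):
    """Standardise a sample mean: (x̄ − μ) / (σ/√n)."""
    return (xbar - mu) / (sigma / sqrt(n))

def z_for_sum(total, mu, sigma, n):
    """Standardise a sum: (sum − nμ) / √(nσ²)."""
    return (total - n * mu) / sqrt(n * sigma ** 2)

# Hypothetical numbers: a mean of 11 over n = 64 draws is the same event
# as a sum of 704, so both routes give the same z.
print(z_for_mean(11, 10, 4, 64))   # 2.0
print(z_for_sum(704, 10, 4, 64))   # 2.0
print(round(PHI(2.0), 4))          # 0.9772
```

Whether a question is phrased in terms of the mean or the sum, the z-score is identical, so the Φ reading is too.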

Try it. Pick the most lopsided base population you can find (an exponential, a bimodal) and start drawing samples. At n = 1 the histogram of “sample means” still looks like the base population. Bump n up and watch the shape straighten into a bell, hugging Normal(μ, σ²/n) whatever the population was. That shrinking standard error σ/√n is exactly why bigger samples give tighter estimates, and why a 95% confidence interval gets narrower as n grows.
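One way to run this experiment without a plotting tool is to track the skewness of the sample means, which is 0 for a perfectly symmetric bell. The sketch below assumes an Exponential(1) population and a made-up helper name, `sample_mean_skewness`; it illustrates the idea and is not part of the original lesson.

```python
import random
import statistics

def sample_mean_skewness(n, trials=20000, seed=1):
    """Skewness of the distribution of sample means of n Exponential(1) draws."""
    rng = random.Random(seed)
    means = [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(trials)]
    m = statistics.fmean(means)
    s = statistics.pstdev(means)
    # Third standardised moment: 0 for a symmetric shape such as the Normal.
    return sum(((x - m) / s) ** 3 for x in means) / trials

for n in (1, 5, 30):
    print(n, round(sample_mean_skewness(n), 2))
```

For this population the theoretical skewness of the sample mean is 2/√n, so the printed values should fall steadily toward 0 as n grows.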

Confidence intervals (known σ)

The CLT also tells you how trustworthy an estimate is. If x̄ is approximately Normal(μ, σ²/n), then μ lies within a predictable band around x̄. A confidence interval for the mean, when the population σ is known, is

x̄  ±  z · (σ/√n)

The multiplier z comes from the standard normal and sets the confidence level:

  • 90% confidence: z = 1.645
  • 95% confidence: z = 1.96
  • 99% confidence: z = 2.576

A higher confidence level needs a bigger z, which makes the interval wider — being more sure costs precision. The interval also narrows as n grows, again like √n.
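The multipliers and the interval can be reproduced in a few lines. This is a sketch under the known-σ assumption used above; `z_multiplier` and `confidence_interval` are illustrative names, and the sample values (x̄ = 72, σ = 15, n = 36) are invented for the example.

```python
from math import sqrt
from statistics import NormalDist

def z_multiplier(confidence):
    """Two-sided z for a confidence level, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def confidence_interval(xbar, sigma, n, confidence=0.95):
    """Known-sigma interval: x̄ ± z · (σ/√n)."""
    half_width = z_multiplier(confidence) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

for level in (0.90, 0.95, 0.99):
    print(level, round(z_multiplier(level), 3))  # 1.645, 1.96, 2.576

# Hypothetical sample: standard error is 15/6 = 2.5.
low, high = confidence_interval(72, 15, 36)
print(round(low, 2), round(high, 2))  # 67.1 76.9
```

Raising `confidence` widens the interval, while raising `n` narrows it, matching the trade-off described above.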

How GATE asks this

A NAT or MCQ. The classic pattern hands you a sum or proportion built from many iid pieces, tells you to treat it as Normal, and either gives you a Φ value (often Φ(2) ≈ 0.9772 or Φ(1) ≈ 0.8413) to reach a decimal, or asks you to pick the right Φ expression from four options, as in GATE DA 2025. Either way you compute the mean and variance of the sum, standardise the endpoints, and combine the Φ readings; the 2025 sum-of-300-Bernoulli question is worked below. A close cousin asks for a 95% confidence interval given x̄, σ, and n, expecting x̄ ± 1.96·(σ/√n).

Worked example — a real GATE DA 2025 question

Let Y be the sum of 300 independent Bernoulli(0.25) random variables (each is 1 with probability 0.25, else 0). Using the normal approximation, P(60 ≤ Y ≤ 90) equals which of Φ(2) − Φ(−2), Φ(1) − Φ(−1), Φ(3) − Φ(−3), or Φ(90) − Φ(60)? (We also evaluate it numerically with Φ(2) ≈ 0.9772.)

Step 1 — mean and variance of the sum. A single Bernoulli(p) has mean p and variance p(1−p). For the sum of 300 of them:

mean      = 300 · 0.25            = 75
variance  = 300 · 0.25 · 0.75     = 56.25
std. dev  = √56.25                = 7.5

Step 2 — standardise both endpoints. Subtract the mean and divide by 7.5:

lower:  z = (60 − 75) / 7.5  =  −15 / 7.5  =  −2
upper:  z = (90 − 75) / 7.5  =   15 / 7.5  =  +2

Step 3 — read the Φ table. Using the symmetry Φ(−2) = 1 − Φ(2):

P(60 ≤ Y ≤ 90)  =  Φ(2) − Φ(−2)
                =  0.9772 − (1 − 0.9772)
                =  0.9772 − 0.0228
                =  0.9544

So the answer is the expression Φ(2) − Φ(−2), which evaluates to ≈ 0.9544. This is the real GATE DA 2025 question (an MCQ over the four Φ-expressions) — the whole solution is “mean, variance, standardise, subtract two Φ values.”
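The three steps can be replayed in code as a check. The sketch below follows the same normal approximation (no continuity correction, as in the exam solution) and, purely as an extra comparison that is not part of the GATE question, also sums the exact binomial probabilities.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 300, 0.25
mean = n * p                   # 75.0
sd = sqrt(n * p * (1 - p))     # 7.5

phi = NormalDist().cdf
z_low = (60 - mean) / sd       # -2.0
z_high = (90 - mean) / sd      # 2.0
approx = phi(z_high) - phi(z_low)

# Exact binomial probability of 60 <= Y <= 90, for comparison only.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(60, 91))

print(z_low, z_high)
print(round(approx, 4))
print(round(exact, 4))
```

The approximation lands within rounding of the table-based 0.9544, and the exact binomial sum should come out close to it, which is the sense in which the Normal curve is a good stand-in here.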

Quick check

Q1. A population has mean μ = 50 and standard deviation σ = 12. For a sample of n = 36, what is the standard deviation of the sample mean x̄ (the standard error)? (1 decimal)
Q2. The weekly demand has mean 500 and SD 60. Over n = 16 independent weeks, the sample mean demand x̄ is treated as Normal. Using Φ(1) ≈ 0.8413, find P(x̄ > 515). (3 decimals)
Q3. A 95% confidence interval for a mean uses z = 1.96, with x̄ = 100, σ = 20, n = 100. What is the half-width (margin of error), z·σ/√n? (2 decimals)
Q4. Which statements about the CLT and confidence intervals are TRUE? (select all that apply)
Q5. Why can a sum of 300 Bernoulli(0.25) trials be approximated by a Normal distribution?

Practice this in an interview

What does the Central Limit Theorem actually say, and why does it matter?

The CLT states that the sampling distribution of the sample mean converges to a normal distribution as sample size grows, regardless of the shape of the underlying population distribution. It is the theoretical foundation for confidence intervals, hypothesis tests, and many machine-learning approximations — but it applies to the distribution of the mean, not to the raw data.

What makes the Normal distribution so central in statistics, and when does it fail?

The Normal distribution is justified by the Central Limit Theorem — averages of large i.i.d. samples converge to Normal regardless of the underlying distribution. It is fully characterized by mean and variance, enabling closed-form inference. It fails for heavy-tailed data, skewed outcomes, bounded quantities, and rare extreme events.

What is the Law of Large Numbers and how does it differ from the Central Limit Theorem?

The Law of Large Numbers (LLN) says the sample mean converges to the true mean as sample size grows — it is a statement about where the mean lands. The Central Limit Theorem says the sampling distribution of the mean is approximately normal — it is a statement about the shape of that distribution. LLN guarantees convergence; CLT characterises the rate and shape of that convergence.

What is the correct interpretation of a 95% confidence interval?

A 95% confidence interval means that if you repeated the sampling procedure many times and built an interval each time, 95% of those intervals would contain the true parameter. It does not mean there is a 95% probability that this specific interval contains the parameter.
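The repeated-sampling reading can be made concrete with a small simulation. This is a sketch under assumed values: a Normal population with μ = 10 and σ = 3, samples of n = 25, and a hypothetical helper named `coverage`; none of these come from the lesson itself.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(mu=10.0, sigma=3.0, n=25, trials=2000, seed=42):
    """Fraction of known-sigma 95% intervals that contain the true mean."""
    rng = random.Random(seed)
    half_width = NormalDist().inv_cdf(0.975) * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Each repetition builds a fresh interval around its own x̄.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

print(coverage())
```

The printed fraction should sit near 0.95. The true mean never moves; it is the interval that changes from sample to sample, which is why the 95% describes the procedure rather than any single interval.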
