Hypothesis Tests: z-test, t-test & chi-squared
A hypothesis test is a courtroom for data: assume the null, measure how surprising the evidence is, decide. GATE tests recognition — which test fits which situation.
What you'll learn
- Null vs alternative hypothesis, the test statistic, and the significance level α
- When to use a z-test, a t-test, or a chi-squared test
- p-value vs critical value, and Type I vs Type II error
- Forming a z-statistic and making a one-pass reject / do-not-reject decision
Before you start
Think of a hypothesis test as a small courtroom. You start by assuming the defendant is innocent — the null hypothesis, the boring default — and then ask: if that were true, how strange would the data we just saw be? If the data is strange enough, you reject the null. If not, you walk away unconvinced. You never prove the null true; you only decide whether the evidence is strong enough to overturn it.
For GATE DA the bar is mostly recognition: read a scenario, name the right test, write the right statistic. Full multi-step derivations are rare, so this lesson keeps the depth at exactly that level.
The framework
- Null hypothesis H0: the default claim, usually “no effect” or a specific value, e.g. μ = 50.
- Alternative H1: what you suspect instead, e.g. μ ≠ 50 (two-tailed) or μ > 50 (one-tailed).
- Test statistic: a single number measuring how far the data sits from H0, in standard-error units.
- Significance level α: the risk you accept of rejecting a true H0 (commonly α = 0.05). It fixes the critical value (e.g. 1.96 for a two-tailed z at 5%).
- Decision: reject H0 if the statistic is more extreme than the critical value, or equivalently if the p-value (the probability, under H0, of data at least this extreme) is below α.
p-value < α → reject H0              |statistic| > critical value → reject H0
p-value ≥ α → fail to reject H0      |statistic| ≤ critical value → fail to reject H0
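The two decision rules above give the same verdict, which a few lines of Python can illustrate. This is a minimal sketch for a two-tailed z-statistic only, using the standard library's `statistics.NormalDist`; the function name `decide_two_tailed_z` is hypothetical and not part of the lesson.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal: mean 0, standard deviation 1


def decide_two_tailed_z(z, alpha=0.05):
    """Return (p_value, critical_value, reject) for a two-tailed z-test."""
    # Two-tailed p-value: probability under H0 of a statistic at least this extreme.
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    # The critical value leaves alpha / 2 in each tail (about 1.96 for alpha = 0.05).
    critical = STD_NORMAL.inv_cdf(1 - alpha / 2)
    reject = p_value < alpha
    return p_value, critical, reject


for z in (1.0, 2.5):
    p_value, critical, reject = decide_two_tailed_z(z)
    # Both rules are printed side by side so the agreement is visible.
    print(z, round(p_value, 4), p_value < 0.05, abs(z) > critical)
```

Either rule can be used on its own; the p-value simply carries more information, because it shows how far inside or outside the rejection region the statistic fell.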
Two ways to be wrong
A test can be wrong in two ways. A Type I error rejects a true H0, and its
probability is α. A Type II error fails to reject a false H0, and its probability
is β. The two trade off through the decision threshold: push it further from H0
and α shrinks but β grows. Increase the effect size or the sample size and the
sampling distributions under H0 and H1 overlap less, so the power (1 − β) climbs
without you giving up any α. That is exactly why bigger samples make tests more
decisive.
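The effect of sample size on power can be checked numerically. The sketch below assumes a two-tailed z-test with known σ and a true mean that sits `effect` units away from the H0 value; the function name `two_tailed_z_power` and the numbers passed to it are illustrative assumptions, not part of the lesson.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_tailed_z_power(effect, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed z-test when the true mean is mu0 + effect."""
    critical = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Under H1 the z-statistic is Normal with this mean and standard deviation 1.
    shift = effect * sqrt(n) / sigma
    # H0 is rejected when z > critical or z < -critical.
    upper = 1 - STD_NORMAL.cdf(critical - shift)
    lower = STD_NORMAL.cdf(-critical - shift)
    return upper + lower


# Same alpha and same effect, growing sample size: power rises.
for n in (16, 64, 256):
    print(n, round(two_tailed_z_power(effect=2, sigma=8, n=n), 3))
```

With `effect=0` the function returns α itself, which is a useful sanity check: when H0 is true, the only way to reject is a Type I error.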
Which test? z vs t vs chi-squared
This is the decision GATE most wants you to make. It hinges on what you know and what you are testing.
- z-test: for a mean when the population standard deviation σ is known (or n is large so the sample SD is reliable). The statistic uses the Normal, and the CLT from the last lesson is what justifies it.
- t-test: for a mean when σ is unknown and you estimate it from the sample (s), typically with small n. It uses Student’s t-distribution, which has heavier tails than the Normal (extra uncertainty from estimating σ) and approaches the Normal as n grows. The statistic swaps σ for s.
- chi-squared (χ²) test: mainly for counts and categories, that is, goodness-of-fit (do observed frequencies match expected?) or independence in a contingency table (a cross-tab counting how often each combination of two categories occurs). For these it sums (observed − expected)² / expected over all cells. It is also the test used for a single variance.
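To make the t and χ² statistics concrete, here is a minimal Python sketch using only the standard library. The helper names `t_statistic` and `chi_squared_statistic` are hypothetical, and the sample values and die-roll counts are made-up illustrative numbers, not data from the lesson.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """One-sample t-statistic: the z formula with the sample SD s in place of sigma."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    return (mean(sample) - mu0) / (s / sqrt(n))


def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up fill volumes in ml; sigma is treated as unknown, so a t-test applies.
fills = [51.2, 49.8, 50.5, 52.1, 50.9, 51.4]
print("t =", round(t_statistic(fills, mu0=50), 3), "on", len(fills) - 1, "degrees of freedom")

# Made-up counts for faces 1..6 of a die rolled 60 times, against a fair die.
die_rolls = [8, 12, 9, 11, 10, 10]
fair = [10] * 6
print("chi-squared =", round(chi_squared_statistic(die_rolls, fair), 3))
```

Each statistic is then compared with a critical value from its own distribution (t with n − 1 degrees of freedom, or χ² with the appropriate degrees of freedom), which the standard library does not provide, so the sketch stops at the statistic.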
How GATE asks this
Usually an MCQ at recognition level: a scenario describing the data (“σ unknown, n = 12”) and four candidate tests — pick the right one and the right one- vs two-tailed critical value. Occasionally a short NAT asks you to plug numbers into a z- or t-statistic. A full p-value derivation is rare; what GATE reliably rewards is knowing the framework and the z/t/χ² decision cold.
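The one- versus two-tailed distinction only changes how much of α sits in each tail. A short sketch for z critical values, assuming the standard library's `statistics.NormalDist` (the helper name `z_critical` is hypothetical):

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_critical(alpha=0.05, two_tailed=True):
    """Critical z-value; alpha is split across both tails only for a two-tailed test."""
    tail_area = alpha / 2 if two_tailed else alpha
    return STD_NORMAL.inv_cdf(1 - tail_area)


print(round(z_critical(0.05, two_tailed=True), 3))   # 1.96
print(round(z_critical(0.05, two_tailed=False), 3))  # 1.645
print(round(z_critical(0.01, two_tailed=True), 3))   # 2.576
```

The one-tailed value is smaller because the whole 5% sits in a single tail, so the same statistic can be significant one-tailed yet not two-tailed.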
Worked example — a two-tailed z-test
A machine is supposed to fill bottles to μ0 = 50 ml. The fill standard deviation is known to be σ = 8 ml. A sample of n = 64 bottles has mean x̄ = 52 ml. At the 5% level (two-tailed, critical value 1.96), is the machine off-target?
Step 1 — hypotheses. H0: μ = 50 versus H1: μ ≠ 50 (two-tailed: “off-target”
in either direction).
Step 2 — choose the test. σ is known and n is large, so this is a
z-test.
Step 3 — standard error and statistic.
standard error = σ / √n = 8 / √64 = 8 / 8 = 1
z = (x̄ − μ0) / (σ/√n) = (52 − 50) / 1 = 2 / 1 = 2.0
Step 4 — decide. Compare |z| = 2.0 with the two-tailed 5% critical value 1.96.
Since 2.0 > 1.96, the result falls in the rejection region:
2.0 > 1.96 → reject H0
There is significant evidence at the 5% level that the machine is off-target. (Note
how close it is — had the critical value been the 1% value 2.576, we would not
reject. The threshold matters.)
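The arithmetic of this worked example can be reproduced in a few lines of Python. This is only a sketch, using the numbers stated in the example and the standard library's `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, x_bar = 50, 8, 64, 52    # values from the worked example

standard_error = sigma / sqrt(n)         # 8 / 8 = 1.0
z = (x_bar - mu0) / standard_error       # 2.0

std_normal = NormalDist()
p_value = 2 * (1 - std_normal.cdf(abs(z)))   # two-tailed p-value, roughly 0.0455

# Compare the same statistic against the 5% and 1% two-tailed critical values.
for alpha in (0.05, 0.01):
    critical = std_normal.inv_cdf(1 - alpha / 2)
    verdict = "reject H0" if abs(z) > critical else "fail to reject H0"
    print(f"alpha={alpha}: critical={critical:.3f}, z={z:.1f}, {verdict}")

print(f"p-value = {p_value:.4f}")
```

The p-value lands between 0.01 and 0.05, which is the same borderline result the note above describes: significant at 5%, not at 1%.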
Practice this in an interview
The null hypothesis (H0) is the default claim of no effect or no difference, while the alternative hypothesis (H1) is what you are trying to find evidence for. Hypothesis testing asks whether the observed data is surprising enough under H0 to justify rejecting it in favor of H1.
A two-tailed test rejects H0 when the statistic is extreme in either direction; a one-tailed test rejects only in one pre-specified direction. Two-tailed tests are the default because they guard against effects in both directions; one-tailed tests are valid only when a directional hypothesis is theoretically justified and pre-registered before seeing the data.
Use a z-test when the population standard deviation is known and the sample is large (n >= 30, by convention); use a t-test when the standard deviation must be estimated from the sample, which is almost always the case in practice. For large n the two tests converge, but the t-test is the safe default.
The chi-squared test assesses whether observed categorical frequencies differ from expected frequencies (goodness-of-fit) or whether two categorical variables are independent of each other (test of independence). It requires count data, a sufficiently large sample, and expected cell counts of at least 5.