datarekha

Hypothesis Tests: z-test, t-test & chi-squared

A hypothesis test is a courtroom for data: assume the null, measure how surprising the evidence is, decide. GATE tests recognition — which test fits which situation.

9 min read · Intermediate · GATE DA · Lesson 17 of 122

What you'll learn

  • Null vs alternative hypothesis, the test statistic, and the significance level α
  • When to use a z-test, a t-test, or a chi-squared test
  • p-value vs critical value, and Type I vs Type II error
  • Forming a z-statistic and making a one-pass reject / do-not-reject decision

Before you start

Think of a hypothesis test as a small courtroom. You start by assuming the defendant is innocent — the null hypothesis, the boring default — and then ask: if that were true, how strange would the data we just saw be? If the data is strange enough, you reject the null. If not, you walk away unconvinced. You never prove the null true; you only decide whether the evidence is strong enough to overturn it.

For GATE DA the bar is mostly recognition: read a scenario, name the right test, write the right statistic. Full multi-step derivations are rare, so this lesson keeps the depth at exactly that level.

The framework

  • Null hypothesis H0 — the default claim, usually “no effect” or a specific value, e.g. μ = 50.
  • Alternative H1 — what you suspect instead, e.g. μ ≠ 50 (two-tailed) or μ > 50 (one-tailed).
  • Test statistic — a single number measuring how far the data sits from H0, in standard-error units.
  • Significance level α — the risk you accept of rejecting a true H0 (commonly α = 0.05). It fixes the critical value (e.g. 1.96 for a two-tailed z at 5%).
  • Decision — reject H0 if the statistic is more extreme than the critical value, or equivalently if the p-value (the probability, assuming H0 is true, of data at least this extreme) is below α.
p-value  <  α     →  reject H0          |statistic|  >  critical value  →  reject H0
p-value  ≥  α     →  fail to reject H0   |statistic|  ≤  critical value  →  fail to reject
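
The two decision routes above can be sketched in Python for a two-tailed z-test. This is a minimal illustration using only the standard library; the function name `z_decision` is hypothetical, not part of the lesson.

```python
from statistics import NormalDist

def z_decision(z, alpha=0.05):
    """Two-tailed z-test decision via both routes (hypothetical helper)."""
    std_normal = NormalDist()  # standard Normal: mean 0, sd 1
    # Critical value: cuts off alpha/2 in each tail (about 1.96 at alpha = 0.05)
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # p-value: probability of a statistic at least this extreme in either tail
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    reject_by_critical = abs(z) > critical
    reject_by_p_value = p_value < alpha
    return critical, p_value, reject_by_critical, reject_by_p_value

critical, p_value, by_critical, by_p = z_decision(2.0)
print(f"critical={critical:.2f}, p={p_value:.4f}, reject={by_critical}")
```

Both routes give the same decision for any z, because the critical value is defined as the point where the p-value equals α.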

Two ways to be wrong

Decision versus reality
                   H0 is TRUE           H0 is FALSE
Reject H0          Type I error (α)     correct
Do not reject      correct              Type II error (β)
Type I: convict the innocent (reject a true H0). Type II: acquit the guilty (miss a false H0).

Picture the sampling distribution under H0 and the one under H1 as two overlapping bells with a decision threshold between them. Move the threshold right and the Type I area (α) shrinks, but the Type II area (β) grows. Increase the effect size or the sample size and the two bells overlap less, so the power (1 − β) climbs without you giving up any α. That is exactly why bigger samples make tests more decisive.
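
The link between sample size and power can be checked numerically. The sketch below assumes a two-tailed z-test with known σ and a hypothetical true mean `mu1`; the function name and the example numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed z-test when the true mean is mu1."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # How far the true mean sits from mu0, in standard-error units
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # beta: probability the statistic lands inside (-critical, +critical)
    beta = std_normal.cdf(critical - shift) - std_normal.cdf(-critical - shift)
    return 1 - beta

# Same alpha throughout; only the sample size changes (assumed values)
for n in (16, 64, 256):
    print(n, round(z_test_power(mu0=50, mu1=52, sigma=8, n=n), 3))
```

With α held at 0.05, the printed power rises as n grows. When `mu1` equals `mu0` the function returns α itself, which is the Type I error rate.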

Which test? z vs t vs chi-squared

This is the decision GATE most wants you to make. It hinges on what you know and what you are testing.

Picking the test
z-test         mean, with σ KNOWN (or large n)              z = (x̄ − μ0) / (σ/√n)
t-test         mean, with σ UNKNOWN (small n)               t = (x̄ − μ0) / (s/√n)
chi-squared    variance, goodness-of-fit, independence      Σ (O − E)² / E
Known σ → z. Unknown σ with small n → t. Counts, variance, or a contingency table → chi-squared.
  • z-test — for a mean when the population standard deviation σ is known (or n is large so the sample SD is reliable). The statistic uses the Normal, and the CLT from the last lesson is what justifies it.
  • t-test — for a mean when σ is unknown and you estimate it from the sample (s), typically with small n. It uses Student’s t-distribution, which has heavier tails than the Normal (extra uncertainty from estimating σ) and approaches the Normal as n grows. The statistic swaps σ for s.
  • chi-squared (χ²) test — for counts and categories: goodness-of-fit (do observed frequencies match expected?) or independence in a contingency table (a cross-tab counting how often each combination of two categories occurs). For both, the statistic sums (observed − expected)² / expected over all cells. A chi-squared test is also used for a single variance, but with a different statistic, (n − 1)s² / σ0².
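
The three statistics from the table can be written as small Python functions. This is a sketch using only the standard library; the function names and the sample numbers in the usage lines are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def z_statistic(sample_mean, mu0, sigma, n):
    """z = (x_bar - mu0) / (sigma / sqrt(n)), with sigma known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

def t_statistic(sample, mu0):
    """t = (x_bar - mu0) / (s / sqrt(n)), with s estimated from the sample."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    return (mean(sample) - mu0) / (s / sqrt(n))

def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Illustrative (made-up) inputs
print(z_statistic(sample_mean=52, mu0=50, sigma=8, n=64))
print(t_statistic([48, 50, 52, 54], mu0=50))
print(chi_squared_statistic([18, 22, 20], [20, 20, 20]))
```

The only difference between the z and t functions is where the standard deviation comes from: a known σ passed in, versus s computed from the data.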

How GATE asks this

Usually an MCQ at recognition level: a scenario describing the data (“σ unknown, n = 12”) and four candidate tests — pick the right one and the right one- vs two-tailed critical value. Occasionally a short NAT asks you to plug numbers into a z- or t-statistic. A full p-value derivation is rare; what GATE reliably rewards is knowing the framework and the z/t/χ² decision cold.

Worked example — a two-tailed z-test

A machine is supposed to fill bottles to μ0 = 50 ml. The fill standard deviation is known to be σ = 8 ml. A sample of n = 64 bottles has mean x̄ = 52 ml. At the 5% level (two-tailed, critical value 1.96), is the machine off-target?

Step 1 — hypotheses. H0: μ = 50 versus H1: μ ≠ 50 (two-tailed: “off-target” in either direction).

Step 2 — choose the test. σ is known and n is large, so this is a z-test.

Step 3 — standard error and statistic.

standard error  =  σ / √n      =  8 / √64   =  8 / 8   =  1

z  =  (x̄ − μ0) / (σ/√n)  =  (52 − 50) / 1  =  2 / 1  =  2.0

Step 4 — decide. Compare |z| = 2.0 with the two-tailed 5% critical value 1.96. Since 2.0 > 1.96, the result falls in the rejection region:

2.0  >  1.96     →     reject H0

There is significant evidence at the 5% level that the machine is off-target. (Note how close it is — had the critical value been the 1% value 2.576, we would not reject. The threshold matters.)
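
The worked example can be reproduced in a few lines of Python, including the comparison against the 1% critical value. This is a sketch with the example's numbers; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the worked example
mu0, sigma, n, sample_mean = 50, 8, 64, 52

standard_error = sigma / sqrt(n)            # 8 / 8 = 1.0
z = (sample_mean - mu0) / standard_error    # (52 - 50) / 1 = 2.0

decisions = {}
for alpha in (0.05, 0.01):
    # Two-tailed critical value for this significance level
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    decisions[alpha] = "reject H0" if abs(z) > critical else "fail to reject H0"
    print(f"alpha={alpha}: |z|={abs(z):.2f}, critical={critical:.3f} -> {decisions[alpha]}")
```

The same z = 2.0 leads to rejection at the 5% level but not at the 1% level, which is the point of the closing note above.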

Quick check

Q1. A z-test has hypothesised mean μ0 = 100, known σ = 15, sample mean x̄ = 106, and n = 25. Compute the z-statistic, z = (x̄ − μ0)/(σ/√n). (Numerical answer, 1 decimal place.)
Q2. A researcher takes a sample of n = 10 measurements and does NOT know the population standard deviation. Which test is appropriate for the mean?
Q3. Which statements are TRUE? (Select all that apply.)
Q4. In a hypothesis test, what does the significance level α represent?
Q5. A two-tailed z-test gives |z| = 1.8 at α = 0.05 (critical value 1.96). What is the decision?

Practice this in an interview

What is the difference between the null and alternative hypothesis?

The null hypothesis (H0) is the default claim of no effect or no difference, while the alternative hypothesis (H1) is what you are trying to find evidence for. Hypothesis testing asks whether the observed data is surprising enough under H0 to justify rejecting it in favor of H1.

What is the difference between one-tailed and two-tailed hypothesis tests, and when is each appropriate?

A two-tailed test rejects H0 when the statistic is extreme in either direction; a one-tailed test rejects only in one pre-specified direction. Two-tailed tests are the default because they guard against effects in both directions; one-tailed tests are valid only when a directional hypothesis is theoretically justified and pre-registered before seeing the data.
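
The difference shows up directly in the p-value. The sketch below assumes a z-statistic and uses a hypothetical helper, `z_p_value`, with tail labels chosen here for illustration.

```python
from statistics import NormalDist

def z_p_value(z, tail="two-sided"):
    """p-value of a z-statistic for the chosen alternative (hypothetical helper)."""
    std_normal = NormalDist()
    if tail == "two-sided":   # H1: mu != mu0
        return 2 * (1 - std_normal.cdf(abs(z)))
    if tail == "greater":     # H1: mu > mu0
        return 1 - std_normal.cdf(z)
    if tail == "less":        # H1: mu < mu0
        return std_normal.cdf(z)
    raise ValueError(f"unknown tail: {tail!r}")

z = 1.8
print(round(z_p_value(z, "two-sided"), 4))
print(round(z_p_value(z, "greater"), 4))
```

For z = 1.8 the two-sided p-value is above 0.05 while the upper-tail p-value is below it, so the choice of tail changes the decision. That is why the direction has to be fixed before looking at the data.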

When do you use a t-test versus a z-test?

Use a z-test when the population standard deviation is known, or when the sample is large enough (n >= 30, by convention) that the sample standard deviation is a reliable stand-in; use a t-test when the standard deviation must be estimated from a small sample. In practice σ is almost never known, and for large n the two tests converge, so the t-test is the safe default.

What is the chi-square test, and when do you use it?

The chi-square test assesses whether observed categorical frequencies differ from expected frequencies (goodness-of-fit) or whether two categorical variables are independent of each other (test of independence). It requires count data, a sufficiently large sample, and expected cell counts of at least 5.
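
A test of independence can be sketched by computing expected counts from the row and column totals. The 2×2 table below is made-up data, the function name is hypothetical, and 3.841 is the standard 5% critical value for 1 degree of freedom.

```python
def chi_squared_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, expected

# Hypothetical counts: rows = two groups, columns = two outcomes
observed = [[30, 20], [20, 30]]
statistic, dof, expected = chi_squared_independence(observed)

CRITICAL_5_PERCENT_DF1 = 3.841  # chi-squared critical value, alpha = 0.05, 1 df
print(statistic, dof, statistic > CRITICAL_5_PERCENT_DF1)
```

Every expected count here is 25, so the "at least 5" rule of thumb is met before the statistic is compared with the critical value.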
