Random Variables, PMF & CDF
A random variable turns outcomes into numbers; the PMF lists their probabilities and the CDF accumulates them. Together they form the shared language of every distribution that follows.
What you'll learn
- A random variable maps each outcome of an experiment to a real number
- Discrete vs continuous random variables — and where the PMF applies
- PMF: p(x) ≥ 0 and the probabilities sum to 1
- CDF F(x) = P(X ≤ x): non-decreasing, right-continuous, a step function for a discrete RV
- Reading interval probabilities off the CDF: P(a < X ≤ b) = F(b) − F(a)
Before you start
Toss three coins. Count the heads. You’ve just done the thing a random variable is named for — turning an outcome into a number, so we can chart it, average it, do arithmetic on it. The number is “discrete” when it sits on a countable list like 0, 1, 2, 3 heads, and “continuous” when it can land anywhere in a range (a waiting time, a measurement). This lesson is the discrete case.
Two functions tell you everything: the PMF (the list of probabilities, one per value) and the CDF (the running total). This is not just exam vocabulary: a classifier’s output layer (a softmax) is a PMF over the possible classes, and the CDF is what lets a computer draw random samples from any distribution. Most GATE questions on this topic boil down to “given a small PMF, read off some probability” — and they almost always pivot on one CDF property the exam loves to test.
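The sampling remark above refers to what is usually called inverse-CDF (inverse transform) sampling: draw a uniform number u in [0, 1) and return the first value whose running total exceeds u. The following is a minimal Python sketch of that idea for the three-coin head count, not a reference implementation; the names `value_for` and `sample`, the seed, and the sample size are assumptions made for illustration.

```python
import random
from bisect import bisect_right
from collections import Counter
from itertools import accumulate

values = [0, 1, 2, 3]
masses = [0.125, 0.375, 0.375, 0.125]   # heads in three fair coin tosses
cumulative = list(accumulate(masses))   # running totals: 0.125, 0.5, 0.875, 1.0

def value_for(u):
    """Map u in [0, 1) to the first value whose running total exceeds u."""
    return values[bisect_right(cumulative, u)]

def sample(rng):
    """Draw one value of X using a uniform random number (hypothetical helper)."""
    return value_for(rng.random())

rng = random.Random(0)                  # fixed seed, chosen arbitrarily
n = 10_000
counts = Counter(sample(rng) for _ in range(n))
for v in values:
    # Observed frequencies should land near 0.125, 0.375, 0.375, 0.125.
    print(v, counts[v] / n)
```

Each value v is returned exactly when u falls in a stretch of length p(v), which is why the observed frequencies approach the PMF.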
PMF and CDF — list versus running total
The probability mass function p(x) = P(X = x) gives the probability of each
individual value. Two rules make it valid: every mass is non-negative, and the masses
sum to 1.
p(x) ≥ 0 for every x and Σ p(x) = 1
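As a quick illustration of the two rules, here is a minimal Python sketch that checks them for the number of heads in three fair coin tosses. The helper name `is_valid_pmf` and the dictionary representation are assumptions made for this example; exact fractions are used so that the sum test is not affected by rounding.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every mass is non-negative and the masses sum to 1."""
    masses = list(pmf.values())
    return all(m >= 0 for m in masses) and sum(masses) == 1

# Heads in three fair coin tosses: 8 equally likely outcomes.
heads_pmf = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}

print(is_valid_pmf(heads_pmf))                                # True
print(is_valid_pmf({0: Fraction(1, 2), 1: Fraction(3, 4)}))   # False: sums to 5/4
```

With floating-point masses, comparing the sum to 1 with `math.isclose` is the safer choice.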
The cumulative distribution function F(x) = P(X ≤ x) is the running total of the
masses up to and including x. Because it accumulates non-negative masses, the CDF can
only climb or stay flat — and for a discrete RV it climbs in jumps, one jump at each
value, the jump height equal to that value’s mass. Between values it is flat, so the
graph is a step function.
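The running-total description can be sketched in a few lines of Python. This is one possible way to evaluate a discrete CDF at any real x, assuming the support is stored as a sorted list; the names `values`, `masses`, and `cdf` are illustrative, and the PMF is the three-coin head count.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate

values = [0, 1, 2, 3]  # sorted support
masses = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]
running_totals = list(accumulate(masses))  # 1/8, 1/2, 7/8, 1

def cdf(x):
    """F(x) = P(X <= x): total mass of all support values <= x."""
    k = bisect_right(values, x)  # number of support values <= x
    return running_totals[k - 1] if k > 0 else Fraction(0)

print(cdf(-1))    # 0    (below the support)
print(cdf(1))     # 1/2  (at a jump, the value after the jump)
print(cdf(1.5))   # 1/2  (flat between values)
print(cdf(2))     # 7/8
print(cdf(10))    # 1    (above the support)
```

Evaluating at 1 and at 1.5 gives the same result, which is the flat stretch of the step function; the jump at 2 has height 7/8 − 1/2 = 3/8, the mass at 2.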
The CDF properties GATE tests
A 2025 MSQ asked precisely which of these hold for every CDF. Memorise the list:
- Non-decreasing: F(x) never goes down as x increases.
- Right-continuous: at a jump the CDF takes the value after the jump, not before.
- Limits at the ends: F(−∞) = 0 and F(+∞) = 1.
- Step function for a discrete RV: flat stretches joined by jumps of height p(x).
From the running-total view, one identity does most of the interval work:
P(a < X ≤ b) = F(b) − F(a)
— the mass accumulated up to b, minus the mass already counted up to a.
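To see the identity at work, the sketch below compares differencing the CDF with summing the masses directly, again for the three-coin head count. The function names `F` and `prob_interval` are assumptions for this example. It also shows the adjustment needed when the left end is included, which follows from adding back the mass at a.

```python
from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def F(x):
    """CDF: add up the masses of all values <= x."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

def prob_interval(a, b):
    """P(a < X <= b), computed by differencing the CDF."""
    return F(b) - F(a)

# Direct sum of the masses strictly above 0 and up to 2, for comparison.
direct = sum(p for v, p in pmf.items() if 0 < v <= 2)

print(prob_interval(0, 2))            # 3/4
print(prob_interval(0, 2) == direct)  # True

# Closed left end: P(a <= X <= b) = F(b) - F(a) + p(a)
print(prob_interval(0, 2) + pmf[0])   # 7/8
```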
How GATE asks this
Two shapes. The first is an MSQ on the bullet list above: a grid of statements
(“the CDF is non-decreasing,” “the CDF is left-continuous,” “F(+∞) = 1,” “p(x) can
exceed 1”) where you select all the true ones — exactly the 2025 question. The second
is a NAT: a small PMF table is given and you must compute a single probability such
as P(X ≤ 2) or P(1 < X ≤ 3), either by summing masses or by differencing the CDF.
Worked example — build a CDF, read off probabilities
A discrete random variable X takes values 0, 1, 2, 3 with PMF p = 0.1, 0.3, 0.4, 0.2. Find the CDF, then P(X ≤ 2) and P(1 < X ≤ 3).
First check it is a valid PMF: all masses are non-negative and
0.1 + 0.3 + 0.4 + 0.2 = 1. Now accumulate to get the CDF:
x      0     1     2     3
p(x)   0.1   0.3   0.4   0.2
F(x)   0.1   0.4   0.8   1.0   ← running total
P(X ≤ 2) = F(2) = 0.8
P(1 < X ≤ 3) = F(3) − F(1) = 1.0 − 0.4 = 0.6
P(X ≤ 2) is just the CDF value at 2 (the running total 0.1 + 0.3 + 0.4). The
interval P(1 < X ≤ 3) excludes X = 1 but includes X = 3, so it is the masses at 2
and 3, 0.4 + 0.2 = 0.6 — which is exactly F(3) − F(1).
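The arithmetic of this worked example can be double-checked with a short Python sketch. The variable names are illustrative, and because the masses are stored as floats, the results are rounded before printing to hide floating-point noise.

```python
import math
from itertools import accumulate

values = [0, 1, 2, 3]
masses = [0.1, 0.3, 0.4, 0.2]

# Validity check; isclose absorbs floating-point rounding in the sum.
assert all(m >= 0 for m in masses)
assert math.isclose(sum(masses), 1.0)

# F maps each value to its running total.
F = dict(zip(values, accumulate(masses)))

p_le_2 = F[2]            # P(X <= 2)
p_1_to_3 = F[3] - F[1]   # P(1 < X <= 3)

print(round(p_le_2, 10))    # 0.8
print(round(p_1_to_3, 10))  # 0.6
```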
Quick check
Practice this in an interview
Each distribution has a natural generative story: Bernoulli is a single coin flip; Binomial sums Bernoullis; Poisson counts rare arrivals; Normal emerges from sums of many small effects; Exponential models waiting times between Poisson events; Uniform assigns equal probability across a range. Choosing correctly comes from matching that story to the data-generating process.
The joint distribution P(X, Y) fully specifies two random variables together. Marginals P(X) and P(Y) are obtained by summing (or integrating) the joint over the other variable. Conditionals P(X|Y=y) are the joint sliced at a fixed y value, renormalized by the marginal P(Y=y).
A Bernoulli(p) trial is the atomic unit: a single experiment with success probability p. Binomial(n, p) is the sum of n independent, identically distributed Bernoulli(p) trials, counting total successes. Because Binomial is a sum of independent random variables, its mean and variance are n times those of a single Bernoulli.
Binomial counts successes in a fixed number of independent trials with a fixed success probability. Poisson counts events in a continuous interval when events are rare and arrive independently at a constant average rate. Poisson is the limiting case of Binomial as n → ∞ and p → 0 with np = λ fixed.