When do you use the Poisson distribution versus the Binomial, and how do they relate?
Binomial counts successes in a fixed number of independent trials with a fixed success probability. Poisson counts events over a continuous interval (time, area, volume) when events arrive independently at a constant average rate and there is no fixed number of trials. Poisson is the limiting case of Binomial as n → ∞ and p → 0 with np = λ held fixed.
How to think about it
The Binomial and Poisson are the two workhorses for count data. Choosing correctly depends on whether you have a fixed number of trials or a rate over a continuous medium.
Binomial — fixed trials
X ~ Binomial(n, p) when:
- You run exactly n independent trials.
- Each trial succeeds with the same probability p.
- You count the total successes.
P(X = k) = C(n,k) · p^k · (1-p)^(n-k)
E[X] = np, Var(X) = np(1-p)
Example: 20 emails are sent, and each is opened independently with probability 0.3. The number of opens is X ~ Binomial(20, 0.3), so E[X] = 6 and Var(X) = 4.2.
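A minimal sketch of the Binomial formulas above, applied to the email example. The helper name `binomial_pmf` is an illustrative choice, not something from the original text, and only the Python standard library is assumed.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p): C(n,k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.3                # 20 sends, 30% open probability each
mean = n * p                  # E[X] = np
variance = n * p * (1 - p)    # Var(X) = np(1-p)

# Full distribution over k = 0..n; the probabilities should sum to 1.
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(f"E[X] = {mean:.1f}, Var(X) = {variance:.1f}")
print(f"P(X = 6) = {pmf[6]:.4f}")
print(f"Sum of PMF = {sum(pmf):.6f}")
```

Because n is fixed, the support is finite (0 through 20), which is why the whole distribution can be listed explicitly.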
Poisson — rate over an interval
X ~ Poisson(λ) when:
- Events arrive independently at a constant average rate λ per interval.
- In a sufficiently small sub-interval, the probability of two or more events is negligible.
- There is no fixed number of trials: the opportunities for an event are effectively unlimited or cannot be counted.
P(X = k) = e^(-λ) · λ^k / k!
E[X] = λ, Var(X) = λ
Example: a server receives an average of 12 requests per second; model arrivals in a 1-second window as Poisson(12).
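The Poisson formula can be sketched the same way for the server example. The helper name `poisson_pmf` and the threshold of 20 requests are assumptions chosen for illustration; they do not come from the original text.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam): e^(-lam) * lam^k / k!."""
    return exp(-lam) * lam**k / factorial(k)

lam = 12  # average requests per 1-second window

# Probability of seeing exactly the average number of requests.
p_exactly_12 = poisson_pmf(12, lam)

# Probability of at most 20 requests in one second (illustrative threshold).
p_at_most_20 = sum(poisson_pmf(k, lam) for k in range(21))

print(f"P(X = 12)  = {p_exactly_12:.4f}")
print(f"P(X <= 20) = {p_at_most_20:.4f}")
print(f"P(X > 20)  = {1 - p_at_most_20:.4f}")
```

Unlike the Binomial, the support has no upper bound, so tail probabilities are computed as one minus a finite sum.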
The limiting connection
When n is large and p is small, Binomial(n, p) ≈ Poisson(λ = np). A common rule of thumb: the approximation is usually adequate when n ≥ 20 and p ≤ 0.05, and it improves as n grows and p shrinks.
Numerically: Binomial(100, 0.03) gives P(X = 2) ≈ 0.225, while Poisson(3) gives P(X = 2) ≈ 0.224.
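To see how close the approximation is across several values of k, both PMFs can be evaluated side by side for n = 100 and p = 0.03. This is a rough sketch using only the standard library; the helper names are illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 100, 0.03
lam = n * p  # matching rate: lambda = np = 3

print(" k  Binomial   Poisson   |diff|")
max_diff = 0.0
for k in range(9):
    b = binomial_pmf(k, n, p)
    q = poisson_pmf(k, lam)
    max_diff = max(max_diff, abs(b - q))
    print(f"{k:2d}  {b:.5f}   {q:.5f}   {abs(b - q):.5f}")

print(f"Largest pointwise gap for k = 0..8: {max_diff:.5f}")
```

Rerunning with a larger n and a smaller p, keeping np = 3, should shrink the gap, which is the limiting relationship in action.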
When to choose each
| Signal | Reach for |
|---|---|
| Fixed n, known p | Binomial |
| Rate per time/area/volume | Poisson |
| n large, p tiny, np moderate | Either (Poisson easier) |
| Variance well above mean (overdispersion) | Neither; consider Negative Binomial |
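The last row is a question about the data itself: a Poisson model implies a variance roughly equal to the mean. One rough diagnostic is the ratio of sample variance to sample mean. The sketch below uses a hypothetical list of counts invented purely for illustration, and the function name `dispersion_index` is likewise an assumption.

```python
from statistics import mean, variance

def dispersion_index(counts: list[int]) -> float:
    """Sample variance divided by sample mean.

    Values near 1 are consistent with a Poisson model; values well
    above 1 suggest overdispersion.
    """
    return variance(counts) / mean(counts)

# Hypothetical event counts per interval, made up for illustration only.
counts = [0, 1, 0, 7, 2, 0, 12, 1, 0, 3]

ratio = dispersion_index(counts)
print(f"mean = {mean(counts):.2f}, variance = {variance(counts):.2f}")
print(f"variance / mean = {ratio:.2f}")
```

With small samples this ratio is noisy, so it is best treated as a prompt to look closer rather than as a formal test.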