State the law of total probability and give a concrete example of when you'd apply it.
The law of total probability decomposes P(A) over a partition {Bᵢ} of the sample space, that is, a set of mutually exclusive and exhaustive events: P(A) = Σᵢ P(A|Bᵢ)·P(Bᵢ). It supplies the denominator in Bayes' theorem and underlies any calculation that builds an overall rate from segment-level rates.
How to think about it
The law of total probability is the structured way to build an unconditional probability from conditional ones. It appears in Bayes’ denominator, in Simpson’s paradox analysis, and whenever a population is stratified.
The statement
Let {B₁, B₂, …, Bₙ} be a partition of the sample space — mutually exclusive and exhaustive, each with P(Bᵢ) > 0. Then for any event A:
P(A) = P(A|B₁)·P(B₁) + P(A|B₂)·P(B₂) + ··· + P(A|Bₙ)·P(Bₙ)
= Σᵢ P(A|Bᵢ)·P(Bᵢ)
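The result follows because the events A∩Bᵢ are disjoint and their union is A, so P(A) = Σᵢ P(A∩Bᵢ), and each term equals P(A|Bᵢ)·P(Bᵢ). Put differently, P(A) is a weighted average of the conditional rates, with weights given by how common each segment is.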
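The statement can be sketched as a small Python helper. The function name `total_probability` and its argument names are illustrative assumptions, not part of any library; the checks simply enforce the partition conditions listed above (positive P(Bᵢ) that sum to 1).

```python
import math
from typing import Sequence


def total_probability(cond_probs: Sequence[float], priors: Sequence[float]) -> float:
    """Return P(A) = sum_i P(A|B_i) * P(B_i) for a partition {B_i}.

    cond_probs[i] is P(A|B_i); priors[i] is P(B_i).
    (Hypothetical helper written for illustration.)
    """
    if len(cond_probs) != len(priors):
        raise ValueError("need exactly one P(A|B_i) per P(B_i)")
    if any(p <= 0.0 for p in priors):
        raise ValueError("each P(B_i) must be positive")
    if not math.isclose(sum(priors), 1.0, abs_tol=1e-9):
        raise ValueError("the P(B_i) must sum to 1 (exhaustive partition)")
    if any(not 0.0 <= c <= 1.0 for c in cond_probs):
        raise ValueError("each P(A|B_i) must lie in [0, 1]")
    # Weighted average of the conditional probabilities.
    return sum(c * p for c, p in zip(cond_probs, priors))
```

Because the weights sum to 1, the result always lies between the smallest and largest conditional probability, which makes a quick sanity check on any hand calculation.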
Worked numeric example
An e-commerce site has three traffic sources: organic (60 % of users), paid search (30 %), and social (10 %). Conversion rates by source are 4 %, 7 %, and 2 % respectively.
P(converts) = 0.04×0.60 + 0.07×0.30 + 0.02×0.10
= 0.024 + 0.021 + 0.002
= 0.047 (4.7 %)
The same total is the denominator in Bayes' theorem, which answers the reverse question: “given a conversion, what’s the probability the user came from paid search?”
P(paid | converts) = (0.07×0.30) / 0.047 = 0.021/0.047 ≈ 0.447
Despite being only 30 % of traffic, paid search accounts for about 45 % of conversions, because its conversion rate is well above the 4.7 % overall rate.
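The arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch using the traffic shares and conversion rates from the example; the dictionary and variable names are illustrative choices.

```python
# Traffic shares P(source) and conversion rates P(converts | source)
# taken from the worked example.
traffic_share = {"organic": 0.60, "paid_search": 0.30, "social": 0.10}
conversion_rate = {"organic": 0.04, "paid_search": 0.07, "social": 0.02}

# Joint probabilities P(converts and source) = P(converts | source) * P(source).
joint = {s: conversion_rate[s] * traffic_share[s] for s in traffic_share}

# Law of total probability: sum the joint terms over the partition.
p_converts = sum(joint.values())

# Bayes: P(source | converts) = joint / P(converts).
posterior = {s: joint[s] / p_converts for s in joint}

print(f"P(converts) = {p_converts:.3f}")       # P(converts) = 0.047
for source, p in posterior.items():
    print(f"P({source} | converts) = {p:.3f}")
# P(organic | converts) = 0.511
# P(paid_search | converts) = 0.447
# P(social | converts) = 0.043
```

The posteriors sum to 1 because each one is a joint term divided by the sum of all joint terms, which is the role P(converts) plays as a normalizer.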
Connection to Bayes
The Bayes denominator P(E), the probability of the evidence, is exactly this law applied to the two-event partition {H, Hᶜ}:
P(E) = P(E|H)·P(H) + P(E|Hᶜ)·P(Hᶜ)
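A normalized posterior P(H|E) requires this constant; only ratios of posteriors, such as posterior odds, can be computed without it. With more than two competing hypotheses, the sum runs over all of them.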
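As a rough illustration of the two-hypothesis case, the sketch below computes P(E) and then the posterior P(H|E). The function name `posterior_binary` and the input numbers (a 1 % prior, P(E|H) = 0.95, P(E|Hᶜ) = 0.05) are hypothetical values chosen for the example, not figures from the text.

```python
def posterior_binary(prior_h: float, p_e_given_h: float, p_e_given_not_h: float):
    """Return (P(E), P(H|E)) for the partition {H, not H}.

    Hypothetical helper written for illustration.
    """
    # Law of total probability over {H, not H}.
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    if p_e == 0.0:
        raise ValueError("P(E) is zero, so P(H|E) is undefined")
    # Bayes' theorem with P(E) as the normalizing constant.
    return p_e, p_e_given_h * prior_h / p_e


# Assumed inputs: P(H) = 0.01, P(E|H) = 0.95, P(E|not H) = 0.05.
p_e, p_h_given_e = posterior_binary(0.01, 0.95, 0.05)
print(f"P(E) = {p_e:.3f}, P(H|E) = {p_h_given_e:.3f}")  # P(E) = 0.059, P(H|E) = 0.161
```

With these assumed numbers, most of P(E) comes from the Hᶜ term (0.0495 of 0.059), which is why the posterior stays low even though P(E|H) is high.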