State the law of total probability and give a concrete example of when you'd apply it.

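The law of total probability decomposes P(A) over a mutually exclusive, exhaustive partition of the sample space: P(A) = Σ P(A|Bᵢ)·P(Bᵢ). It supplies the denominator in Bayes' theorem and drives any calculation that builds an overall rate from segment-level rates. For example, if Machine 1 makes 60% of a factory's parts at 2% defective and Machine 2 makes the other 40% at 5% defective, the overall defect rate is 0.6·0.02 + 0.4·0.05 = 0.032.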

What is conditional probability, and how does it differ from joint probability?

Conditional probability P(A|B) is the probability of A given that B has already occurred, computed as P(A and B) / P(B). It narrows the sample space to B, whereas joint probability P(A and B) lives in the full, unrestricted space.

Walk me through Bayes' theorem with a disease-screening base-rate example.

Bayes' theorem updates a prior probability with new evidence: P(H|E) = P(E|H) P(H) / P(E). In disease testing, ignoring the low base rate (prior) makes a positive test look far more alarming than it really is — most positives are false positives when the disease is rare.
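A minimal sketch of that update in Python. The screening numbers (1% prevalence, 95% sensitivity, 5% false-positive rate) and the function name are hypothetical, chosen only to make the base-rate effect visible.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' theorem."""
    # P(E): total probability of a positive test over {disease, no disease}.
    p_positive = (sensitivity * prevalence
                  + false_positive_rate * (1 - prevalence))
    # P(H|E) = P(E|H) * P(H) / P(E)
    return sensitivity * prevalence / p_positive


# Hypothetical inputs, not taken from any real test.
p = posterior_given_positive(prevalence=0.01,
                             sensitivity=0.95,
                             false_positive_rate=0.05)
print(f"P(disease | positive) = {p:.3f}")  # 0.161
```

Under these assumed numbers a positive result raises the chance of disease from 1% to about 16%, so roughly five in six positives are false positives.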

Explain joint, marginal, and conditional distributions and how to move between them.

The joint distribution P(X, Y) fully specifies two random variables together. Marginals P(X) and P(Y) are obtained by summing (or integrating) the joint over the other variable. Conditionals P(X|Y=y) are the joint sliced at a fixed y value, renormalized by the marginal P(Y=y).
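One way to see those moves is on a small discrete table. The sketch below uses a hypothetical 2×2 joint distribution (the values and helper names are illustrative assumptions) and exact fractions so the renormalization is easy to check by hand.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(X, Y); keys are (x, y) pairs.
joint = {
    (0, "a"): F(1, 10), (0, "b"): F(3, 10),
    (1, "a"): F(2, 10), (1, "b"): F(4, 10),
}


def marginal(joint, axis):
    """Sum the joint over the other variable (axis 0 gives P(X), 1 gives P(Y))."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0) + p
    return out


def conditional_x_given_y(joint, y):
    """Slice the joint at Y = y, then renormalize by the marginal P(Y = y)."""
    p_y = marginal(joint, 1)[y]
    return {x: p / p_y for (x, yy), p in joint.items() if yy == y}


p_x = marginal(joint, 0)                       # {0: 2/5, 1: 3/5}
p_y = marginal(joint, 1)                       # {'a': 3/10, 'b': 7/10}
p_x_given_a = conditional_x_given_y(joint, "a")  # {0: 1/3, 1: 2/3}
```

Each conditional sums to 1 because of the division by P(Y = y); without it the slice would sum to P(Y = y) instead.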

Conditional & Total Probability — GATE DA

You learn it rained last night, and the chance the ground is wet leaps from a guess to near-certain — nothing changed but what you know. That re-judging is conditional probability, and the law of total probability stitches the pieces back together, setting up Bayes.

8 min read Intermediate GATE DA Lesson 6 of 122

What you'll learn

Conditional probability P(A|B) = P(A∩B)/P(B) — re-judging in the light of B

The multiplication rule P(A∩B) = P(A|B)·P(B), and how it chains

The law of total probability: splitting across exhaustive cases and recombining

Why P(A|B) and P(B|A) differ — the gap Bayes exists to close

Step outside one morning not knowing last night’s weather, and the chance the ground is wet might be one in three. Now someone tells you it rained. In that instant the chance leaps to almost certain. Notice that nothing about the ground itself changed in that moment — only what you know changed. Re-judging a chance in the light of something you have just learned is the whole of this lesson.

Putting the re-judging on paper

We do this re-judging in our heads all day. To do it carefully we need a name and a rule. The chance of A once we know B has happened is written P(A | B), read “the probability of A given B”. And the rule is this: knowing B shrinks the world down to just the cases where B is true, and inside that smaller world we ask what fraction also have A.

P(A | B) = P(A ∩ B) / P(B),     for P(B) > 0

The denominator P(B) is the shrunk-down world; the numerator P(A ∩ B) is the slice of it that also has A. Multiply the rule across and it rearranges into the multiplication rule, often the handier form:

P(A ∩ B) = P(A | B) · P(B)

— the chance both happen equals the chance B happens, times the chance of A once B has. It chains to more events the same way, one condition at a time: P(A ∩ B ∩ C) = P(A)·P(B | A)·P(C | A ∩ B), which is exactly what you do when you walk down a probability tree, multiplying along the path.
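The rules above can be checked by plain counting. The sketch below assumes an illustrative sample space (two fair dice) and illustrative events that are not part of the lesson; it shrinks to the B-world, verifies the multiplication rule, and then chains three conditions for a card draw.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: the 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event given as a predicate over outcomes."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))


def sum_is_8(o):        # event A
    return o[0] + o[1] == 8


def first_is_even(o):   # event B
    return o[0] % 2 == 0


def both(o):            # event A ∩ B
    return sum_is_8(o) and first_is_even(o)


p_a = prob(sum_is_8)
p_b = prob(first_is_even)
p_a_and_b = prob(both)
p_a_given_b = p_a_and_b / p_b   # fraction of the B-world that also has A

# Multiplication rule: P(A ∩ B) = P(A | B) · P(B)
assert p_a_and_b == p_a_given_b * p_b

# Chain rule, one condition at a time: three aces in a row from a
# 52-card deck without replacement, P(A1)·P(A2 | A1)·P(A3 | A1 ∩ A2).
p_three_aces = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)

print(p_a, p_a_given_b, p_three_aces)  # 5/36 1/6 1/5525
```

Here learning B moves the chance of A from 5/36 to 1/6, which is the re-judging the formula describes.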

Drag the two circles below and turn on “Given B”: everything outside B fades, and you are left re-reading the chance of A inside the shrunken B-world. That fading is what “conditioning” is.

[Interactive figure: "Drag the events: conditioning shrinks the universe". Two draggable, resizable circles A and B sit over a fixed Monte Carlo sample of dots. Readouts show P(A), P(B), P(A ∩ B), P(A | B) = P(A ∩ B) / P(B), P(B | A), and a comparison of P(A)·P(B) with P(A ∩ B) as an independence check. A "Condition on B" toggle dims everything outside B.]
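The same idea can be sketched offline. The code below is an assumption about how such a dot picture could be reproduced, not the page's actual widget: it scatters random dots in a unit square, treats two hypothetical circles as events A and B, and estimates P(A | B) by looking only at the dots inside B.

```python
import random


def in_circle(point, center, radius):
    """True if the point lies inside (or on) the circle."""
    (x, y), (cx, cy) = point, center
    return (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2


rng = random.Random(0)  # fixed seed so the sample of dots is reproducible
points = [(rng.random(), rng.random()) for _ in range(20_000)]

# Hypothetical circle positions and sizes: (center, radius).
circle_a = ((0.40, 0.50), 0.25)
circle_b = ((0.60, 0.50), 0.20)

in_a = [in_circle(p, *circle_a) for p in points]
in_b = [in_circle(p, *circle_b) for p in points]

n_total = len(points)
n_a = sum(in_a)
n_b = sum(in_b)
n_ab = sum(a and b for a, b in zip(in_a, in_b))

p_a = n_a / n_total
p_b = n_b / n_total
p_a_and_b = n_ab / n_total
p_a_given_b = n_ab / n_b  # conditioning: ignore every dot outside B

print(f"P(A)={p_a:.3f}  P(B)={p_b:.3f}  "
      f"P(A∩B)={p_a_and_b:.3f}  P(A|B)={p_a_given_b:.3f}")
```

Counting inside B directly gives the same number as dividing P(A ∩ B) by P(B), because the total dot count cancels.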

Building a chance out of cases

Sometimes you cannot read P(A) off directly, but you can split the world into a few exhaustive, non-overlapping cases and find A’s chance inside each. Then you recombine, weighting each case by how likely it was to begin with.

Picture a factory whose parts come from two machines. Machine 1 makes 60% of the parts and 2% of its parts are defective; Machine 2 makes the other 40% and 5% of its parts are defective. What is the chance a part picked at random is defective? Walk the tree: down each branch to the machine, then on to “defective”, multiplying as you go.

Multiply along each path to the event, then add the paths that reach it.

That recombining is the law of total probability: if cases B₁, B₂, … are exhaustive and non-overlapping, P(A) = Σ P(A | Bᵢ) · P(Bᵢ). Here it gives 0.6·0.02 + 0.4·0.05 = 0.012 + 0.020 = 0.032 — a 3.2% defect rate, built from two cases.
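The recombination can be written as a small helper. This is a sketch with a hypothetical function name; the factory numbers are the ones from the lesson.

```python
def total_probability(cases):
    """Law of total probability.

    cases: (P(B_i), P(A | B_i)) pairs, one per case of the partition.
    """
    cases = list(cases)
    # Exhaustive, non-overlapping cases must carry all the probability.
    if abs(sum(p_b for p_b, _ in cases) - 1.0) > 1e-9:
        raise ValueError("case probabilities must sum to 1")
    # Multiply along each path, then add the paths.
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in cases)


# (share of parts, defect rate) for Machine 1 and Machine 2.
factory = [(0.60, 0.02), (0.40, 0.05)]
p_defective = total_probability(factory)
print(round(p_defective, 3))  # 0.032
```

The sum-to-one check is a design choice in this sketch: it catches a partition that misses a case before the weighted sum silently understates P(A).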

The two conditionals are not the same

Now a warning that the whole next lesson rests on. We were given P(defective | Machine 2) = 0.05. But flip the question: given that a part is defective, what is the chance it came from Machine 2? That is P(Machine 2 | defective), a different number entirely. Divide that machine’s path by the total: 0.020 / 0.032 = 0.625. So P(defective | M2) = 0.05 while P(M2 | defective) = 0.625. The two conditionals answer different questions and must never be swapped.
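A short sketch of that flip for both machines, using the lesson's factory numbers (the variable names are illustrative):

```python
# machine -> (share of parts, P(defective | machine))
machines = {"M1": (0.60, 0.02), "M2": (0.40, 0.05)}

# Each path: P(machine) * P(defective | machine) = P(machine and defective)
paths = {name: share * rate for name, (share, rate) in machines.items()}

# Total: P(defective), the sum of the paths that reach "defective"
p_defective = sum(paths.values())

# Flip the conditional: divide each path by the total
p_machine_given_defective = {
    name: path / p_defective for name, path in paths.items()
}

for name, p in p_machine_given_defective.items():
    print(f"P({name} | defective) = {p:.3f}")
# P(M1 | defective) = 0.375
# P(M2 | defective) = 0.625
```

Machine 2 makes fewer parts but accounts for most of the defects, which is why its flipped conditional is so much larger than its 5% defect rate.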

A question to carry forward

We just turned P(defective | machine) into P(machine | defective) by dividing one path by the total. Here is the thread for the next lesson: that move — flipping a conditional around using the total-probability denominator — has a name and a single clean formula. What is the rule that takes any P(B | A) and hands you back P(A | B)?

In one breath

Conditioning re-judges a chance once you know B: shrink to the B-world, then P(A|B) = P(A∩B)/P(B).
Multiplication rule: P(A∩B) = P(A|B)·P(B); it chains — P(A∩B∩C) = P(A)·P(B|A)·P(C|A∩B) (walk the tree).
Law of total probability: split into exhaustive disjoint cases, recombine — P(A) = Σ P(A|Bᵢ)·P(Bᵢ) (factory defect = 0.6·0.02 + 0.4·0.05 = 0.032).
The trap: P(A|B) ≠ P(B|A) — P(defective|M2)=0.05 but P(M2|defective)=0.625. Flipping that conditional is Bayes, next.
Draws made “without replacement” are dependent, so the chance on the second draw is a conditional probability given what the first draw removed, not a repeat of the first draw's chance.
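The last point can be checked by enumeration. The sketch below assumes a hypothetical bag of 3 red and 5 blue balls and lists every ordered pair of distinct balls, each equally likely.

```python
from fractions import Fraction
from itertools import permutations

bag = ["R"] * 3 + ["B"] * 5  # hypothetical bag: 3 red, 5 blue

# Every ordered draw of two distinct balls (by index), equally likely.
draws = list(permutations(range(len(bag)), 2))


def first_red(d):
    return bag[d[0]] == "R"


def second_red(d):
    return bag[d[1]] == "R"


def prob(event, given=lambda d: True):
    """P(event | given), by counting inside the draws where `given` holds."""
    pool = [d for d in draws if given(d)]
    return Fraction(sum(1 for d in pool if event(d)), len(pool))


p_first_red = prob(first_red)                            # 3/8
p_second_given_first = prob(second_red, given=first_red)  # 2/7
p_both_red = prob(lambda d: first_red(d) and second_red(d))

# Multiplication rule: P(both red) = P(first red) · P(second red | first red)
assert p_both_red == p_first_red * p_second_given_first

print(p_first_red, p_second_given_first, p_both_red)  # 3/8 2/7 3/28
```

Once a red ball is gone the second draw sees 2 red among 7, so the factor to multiply by is 2/7 and not another 3/8.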

Practice

Quick check

Q1. Recall: what does the denominator P(B) represent in P(A|B) = P(A∩B)/P(B)?

Q2. Trace: using the factory (Machine 1 makes 60% at 2% defective, Machine 2 makes 40% at 5% defective), compute the overall chance a part is defective. (Numerical answer.)

Q3. Trace: a box has 4 red and 6 blue balls. Two are drawn WITHOUT replacement. Compute P(both red). (Numerical answer.)

Q4. Apply: which statements are ALWAYS true (for P(B) > 0)? (Select all that apply.)

Q5. Apply: in the factory, you are told P(defective | Machine 2) = 0.05. A quality inspector instead wants P(Machine 2 | defective). What is it, and why does it differ?

Q6. Create: a test is 90% accurate at flagging a disease that 1 in 100 people have. A friend says 'I tested positive, so I'm 90% likely to have it.' Identify the error in one line.
