Walk me through exactly how a decision tree chooses a split at each node.

At each node the algorithm iterates over every feature and every candidate threshold, scores each candidate split by the weighted impurity of the two child nodes, and selects the pair that gives the largest impurity reduction. It then recurses on each child until a stopping criterion is met.
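The search described above can be sketched in a few lines. This is a minimal illustration, not any library's implementation: it assumes Gini impurity as the score, numeric features, and midpoints between consecutive distinct values as candidate thresholds. The names gini and best_split and the toy data are made up for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, impurity_reduction) for the best split.

    rows   -- list of numeric feature vectors
    labels -- class label for each row
    Returns (None, None, 0.0) if no split reduces impurity.
    """
    n = len(rows)
    parent = gini(labels)
    best = (None, None, 0.0)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            # Weighted impurity of the two children.
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - weighted
            if gain > best[2]:
                best = (f, t, gain)
    return best

# Toy data (hypothetical): feature 0 separates the classes cleanly.
rows = [[1.0, 7.0], [2.0, 3.0], [3.0, 8.0], [4.0, 2.0]]
labels = ["no", "no", "yes", "yes"]
feature, threshold, gain = best_split(rows, labels)
print(feature, threshold, gain)
```

A full tree builder would call best_split on each child and stop when the gain is zero or a limit such as maximum depth is reached.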

How do you choose the optimal decision threshold for a binary classifier?

The optimal threshold depends on the business cost of false positives versus false negatives, not on defaulting to 0.5. You choose it by plotting the PR or ROC curve on a held-out set, computing the metric that captures your cost function (e.g., F-beta, revenue, expected cost) at each threshold, and selecting the point that maximises it. Threshold tuning needs no retraining, so it is usually worth trying before resampling or model changes.
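As a rough sketch of that procedure, the snippet below scans candidate thresholds on a held-out set and keeps the one with the lowest total cost. The function name best_threshold, the scores, the labels, and the two cost values are all hypothetical; a real project would substitute its own cost function.

```python
def best_threshold(scores, labels, cost_fp, cost_fn):
    """Pick the score threshold with the lowest total cost on held-out data.

    scores  -- predicted probabilities of the positive class
    labels  -- true labels (1 = positive, 0 = negative)
    cost_fp -- assumed cost of one false positive
    cost_fn -- assumed cost of one false negative
    """
    best_t, best_cost = None, float("inf")
    # Candidates: each distinct score, plus one value above the maximum
    # so that "predict nothing as positive" is also considered.
    candidates = sorted(set(scores)) + [max(scores) + 1.0]
    for t in candidates:
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        cost = fp * cost_fp + fn * cost_fn
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

# Hypothetical held-out scores; a missed positive costs 5x a false alarm.
scores = [0.10, 0.30, 0.35, 0.60, 0.80, 0.90]
labels = [0, 0, 1, 0, 1, 1]
threshold, cost = best_threshold(scores, labels, cost_fp=1.0, cost_fn=5.0)
print(threshold, cost)
```

With these made-up numbers the cheapest threshold sits below 0.5, because false negatives were assumed to be the more expensive error.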

What is pruning in decision trees and when would you use pre-pruning versus post-pruning?

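Pruning removes splits that do not improve generalisation. Pre-pruning stops growth early via hyperparameters like max_depth or min_samples_leaf; it is cheap, but it can halt before a weak split that would have enabled a strong one further down. Post-pruning (for example, cost-complexity pruning) grows the full tree, then collapses subtrees whose contribution does not justify their added complexity, with the complexity penalty typically chosen on held-out data or by cross-validation. Reach for pre-pruning when training cost matters, and for post-pruning when you can afford to grow the full tree and want the data to decide how much of it to keep.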

When would you use a multi-armed bandit or shadow deployment instead of a fixed A/B test?

A fixed A/B test holds traffic splits constant to get a clean, statistically powered comparison, which is ideal when you need a trustworthy ship decision. A multi-armed bandit dynamically shifts traffic toward the better-performing model, reducing regret when you can't run long enough for significance or when the best arm may change. Shadow deployment sends real traffic to the new model without serving its outputs, so you validate behavior and latency risk-free before any user is exposed.

Decision Trees — Making Choices Under Uncertainty — Business Analytics

Expected value scored a single bet — take the campaign or don’t. But the last lesson ended on a richer kind of choice: launch now, or pilot first and decide later based on what you learn. That’s a decision feeding a chance event feeding another decision. This lesson is the tool for exactly that shape.

Your market research says there’s a 45% chance the market is good. A full national launch could return $900k — or cost you $300k if it flops. A cautious pilot caps your downside at $50k but also caps your upside at $250k. Doing nothing costs nothing and earns nothing.

These three options feel incomparable — different upside, different downside, different probabilities. A decision tree is the tool that makes them comparable.

What a decision tree is

A decision tree lays your problem out left to right like a branching path. Two kinds of nodes (junction points) do all the work:

A square node is a decision node — a fork where you pick which branch to take (launch, pilot, skip).
A circle node is a chance node — a fork where the market picks, and each branch carries a probability (good market, bad market).

At the far right of every branch sits a payoff — the dollar outcome if the path up to that point actually happened.

The art is drawing the tree accurately. The math is called fold-back.

Fold-back: evaluate right to left

Fold-back (also called roll-back) is the procedure for collapsing the tree into a single recommended action. You work from the tips back toward the root:

At every chance node, compute the expected value — probability × payoff summed across all branches. (Expected value, introduced in the previous lesson, is the probability-weighted average outcome.)
At every decision node, keep only the branch with the highest EV and discard the others.
The EV that survives at the root is the value of the whole decision — and the surviving path is your recommendation.

Let’s do it with real numbers.

The launch decision — by hand

Here are the payoffs (in $k for readability):

Path	Probability	Payoff
Go national, market good	0.45	+$900
Go national, market bad	0.55	−$300
Run a pilot, market good	0.45	+$250
Run a pilot, market bad	0.55	−$50
Don’t launch	1.00	$0

Fold back the two chance nodes first:

EV(Go national) = 0.45 × $900 + 0.55 × (−$300)
               = $405 − $165
               = $240

EV(Run a pilot) = 0.45 × $250 + 0.55 × (−$50)
               = $112.50 − $27.50
               = $85

EV(Don't launch) = $0

At the decision node you compare $240, $85, and $0. Go national wins — at 45% odds of a good market, the expected payoff of $240 beats the pilot ($85) and doing nothing ($0).
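The same fold-back can be written as a short recursive function. This is a sketch under stated assumptions: the tuple encoding of nodes ("payoff", "chance", "decision") and the helper names fold_back and market are invented for illustration, and the returned path records only the choices made before the first chance node.

```python
def fold_back(node):
    """Return (expected_value, chosen_path) for a decision-tree node.

    Node encoding (hypothetical, chosen for this sketch):
      ("payoff", value)
      ("chance", [(probability, child), ...])
      ("decision", {label: child, ...})
    """
    kind = node[0]
    if kind == "payoff":
        return node[1], []
    if kind == "chance":
        # Probability-weighted average of the children's values.
        return sum(p * fold_back(child)[0] for p, child in node[1]), []
    if kind == "decision":
        # Keep only the branch with the highest expected value.
        results = {label: fold_back(child) for label, child in node[1].items()}
        best = max(results, key=lambda label: results[label][0])
        value, path = results[best]
        return value, [best] + path
    raise ValueError(f"unknown node kind: {kind}")

def market(good_payoff, bad_payoff, p_good=0.45):
    """Chance node: good market with p_good, bad market otherwise ($k)."""
    return ("chance", [(p_good, ("payoff", good_payoff)),
                       (1 - p_good, ("payoff", bad_payoff))])

launch_tree = ("decision", {
    "Go national": market(900, -300),
    "Run a pilot": market(250, -50),
    "Don't launch": ("payoff", 0),
})

value, path = fold_back(launch_tree)
print(path[0], round(value, 2))  # recommendation and its EV in $k
```

Run on the launch tree, it should reproduce the hand calculation: Go national, with an EV of about $240k.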

Try it: drag the probability and watch the recommendation flip

[Interactive widget: decision-tree EV. Three choices, one uncertain market. Each option's value is its expected value: every outcome's payoff weighted by its probability. Pick the highest. Inputs: P(market is good), shown at 45%; national upside if good, shown at $900. Results shown: Go national +$240, Run a pilot +$85, Don't launch +$0. Highest EV: Go national.]

The widget shows the exact tree above. Set the probability to 45% and Go national highlights green at $240. Now drag P(market is good) down toward 20% and watch the recommendation shift. With the payoffs above, the pilot overtakes the national launch once P(good) falls below about 28%, the national EV itself turns negative below 25%, and below about 17% even the pilot's EV is negative, so doing nothing becomes the smart call. The tree doesn’t change; only one input changes, and the whole recommendation flips.
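If the widget is not to hand, the same experiment can be run as a small sweep. This sketch hard-codes the lesson's payoffs (in $k); the function names expected_values and recommend are made up for the example.

```python
def expected_values(p_good):
    """EV in $k of each option for a given P(market is good)."""
    return {
        "Go national": p_good * 900 + (1 - p_good) * -300,
        "Run a pilot": p_good * 250 + (1 - p_good) * -50,
        "Don't launch": 0.0,
    }

def recommend(p_good):
    """Option with the highest EV at this probability."""
    evs = expected_values(p_good)
    return max(evs, key=evs.get)

# Sweep P(good) from 10% to 45% in 5-point steps.
for pct in range(10, 50, 5):
    p = pct / 100
    evs = expected_values(p)
    summary = "  ".join(f"{name}: {ev:+.1f}" for name, ev in evs.items())
    print(f"P(good)={pct}%  {summary}  -> {recommend(p)}")
```

Reading down the printed rows shows where each flip in the recommendation happens as the probability rises.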

That is both the power and the risk of a decision tree.

Why a pilot can beat national even with a lower EV

Notice that the pilot EV ($85) is well below the national EV ($240) at 45% odds. Yet there are two reasons a rational manager might still choose the pilot:

Downside protection. The worst case nationally is −$300. The worst case for a pilot is −$50. If your company cannot absorb a $300k loss — say, it would force layoffs or kill another project — the pilot’s capped downside has real value that the EV number doesn’t capture.

Value of information. A pilot lets you learn before committing fully. Running a small test and observing real customer behaviour updates your probability estimate. If the pilot goes well you can then launch nationally with much higher confidence. Sometimes the smart move is the option that lets you learn cheaply before betting big, even if its standalone EV is lower.
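One way to put a rough ceiling on what learning could be worth is the expected value of perfect information (EVPI), a standard decision-analysis quantity that the lesson itself does not compute. The sketch below assumes you could somehow know the market state before choosing, which no real pilot delivers, so treat the result as an upper bound rather than the pilot's actual value. The names P_GOOD, PAYOFFS and ev are illustrative.

```python
P_GOOD = 0.45
PAYOFFS = {  # $k: (payoff if market good, payoff if market bad)
    "Go national": (900, -300),
    "Run a pilot": (250, -50),
    "Don't launch": (0, 0),
}

def ev(option):
    """Expected value in $k of committing to one option now."""
    good, bad = PAYOFFS[option]
    return P_GOOD * good + (1 - P_GOOD) * bad

# Best you can do deciding now, before the market reveals itself.
ev_without_info = max(ev(option) for option in PAYOFFS)

# With (hypothetical) perfect foresight you pick the best option per state.
best_if_good = max(good for good, _ in PAYOFFS.values())
best_if_bad = max(bad for _, bad in PAYOFFS.values())
ev_with_perfect_info = P_GOOD * best_if_good + (1 - P_GOOD) * best_if_bad

# Upper bound on what any information about the market could be worth.
evpi = ev_with_perfect_info - ev_without_info
print(round(ev_without_info, 2), round(ev_with_perfect_info, 2), round(evpi, 2))
```

Under these assumptions, perfect knowledge of the market would be worth about $165k: the gap between launching nationally only when the market is good ($405k) and committing blind ($240k).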

Where trees go wrong

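A tree is only as trustworthy as the probabilities you feed it. As the slider showed, moving a single estimate can reverse the whole recommendation. The practical implication: spend your energy getting the probabilities right, not just the payoffs. A $900k upside is irrelevant if the probability attached to it is fantasy.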

Decision trees in practice

Decision trees show up wherever structured choices meet uncertain outcomes: capital allocation, product roadmaps, hiring a senior role vs. promoting internally, entering a new market. The format scales — you can nest chance nodes inside chance nodes, add more decision points later in the tree (should we double down if the pilot succeeds?), and attach costs to the act of gathering information itself.

The core discipline is always the same: draw the tree honestly, assign probabilities carefully, fold back from right to left, and let the EV guide — not override — your judgment.

In one breath

A decision tree lays a choice out left to right with two kinds of node: squares (decision nodes, where you choose) and circles (chance nodes, where the world chooses, each branch carrying a probability). Solve it by fold-back, right to left: average the branches at each chance node into an expected value, then at each decision node keep only the highest-EV branch — the number left at the root is the decision’s value and the surviving path is your recommendation. Here Go National ($240) beats Pilot ($85) and Skip ($0) at 45% odds. Two caveats keep it honest: a lower-EV option can still win on downside protection or value of information (a pilot lets you learn cheaply before betting big), and the whole tree is only as trustworthy as its probabilities — never just chase the biggest payoff.

Practice

Quick check

Q1. At P(good) = 45%, what is the EV of Go National?

Q2. You are evaluating two options. Option A: 60% chance of +$200, 40% chance of −$100. Option B: guaranteed $60. Fold back and choose.

Q3. A pilot has a lower EV than a national launch. Under which circumstance is choosing the pilot still rational?

A question to carry forward

Go back to the moment in the explorer where dragging one slider — P(market is good) — flipped the entire recommendation from “go national” to “pilot.” That should make you slightly uneasy. The tree didn’t change; a single estimate did, and the answer reversed. And that estimate, 45%, was itself a guess.

So the question to carry forward is: when the recommendation hinges on a number you’re not sure of, how unsure can you afford to be? Which assumption, if it’s wrong, actually changes your decision — and how far does it have to move before it does? The next lesson is sensitivity analysis: the disciplined way to stress-test every shaky input, find the one or two that the decision truly pivots on, and stop worrying about the rest.
