What does the No Free Lunch theorem state, and what are its practical implications for choosing algorithms?

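The No Free Lunch (NFL) theorem states that, averaged uniformly over all possible problems (all target functions that could have generated the data), every learning algorithm has the same expected performance on unseen examples, so none outperforms any other, including random guessing. In practice this means there is no universally best algorithm; the right choice depends on whether an algorithm's inductive biases match the structure of the actual problem.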

When should you optimize precision and when should you optimize recall?

Optimize precision when a false positive is costly — spam filters, ad targeting, legal evidence — because you'd rather miss some positives than act on wrong ones. Optimize recall when a false negative is costly — cancer screening, fraud detection, safety systems — because missing a true positive can be catastrophic. The business cost of each error type should drive the choice, not the metric itself.
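To make the trade-off concrete, here is a minimal sketch that computes both metrics from confusion-matrix counts, using precision = TP / (TP + FP) and recall = TP / (TP + FN). The function name and the example counts are made up for illustration and do not come from any real system.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical counts: 80 correct alerts, 20 false alarms, 40 missed positives.
precision, recall = precision_recall(tp=80, fp=20, fn=40)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.80 recall=0.67
```

Raising the decision threshold typically lowers fp (helping precision) and raises fn (hurting recall), which is why the cost of each error type has to decide where the threshold sits.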

You are a robber. Given an array of house values, find the maximum money you can rob without robbing two adjacent houses.

At each house you make one choice: rob it (and skip the previous one) or skip it (and carry forward whatever you had). Let dp[i] be the most money obtainable from houses 0..i. Then dp[i] = max(dp[i-1], dp[i-2] + nums[i]), with dp[0] = nums[0] and dp[1] = max(nums[0], nums[1]). Since you only look back two steps, two variables replace the full array, giving O(n) time and O(1) space.
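The two-variable version of that recurrence might look like the sketch below. The names rob, prev1 and prev2 are illustrative choices, and starting both at 0 is one convenient way to handle an empty or single-house array.

```python
def rob(nums):
    """Maximum loot from non-adjacent houses, O(n) time and O(1) space."""
    prev2, prev1 = 0, 0  # best totals up to house i-2 and up to house i-1
    for value in nums:
        # Either skip this house (keep prev1) or rob it on top of prev2.
        prev2, prev1 = prev1, max(prev1, prev2 + value)
    return prev1

print(rob([2, 7, 9, 3, 1]))  # 12 (rob the houses worth 2, 9 and 1)
```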

When should you use grid search vs random search vs Bayesian optimisation for hyperparameter tuning?

Grid search exhaustively tries every combination in a predefined grid. The number of combinations grows exponentially with the number of hyperparameters, so it is only practical for 1–2 of them. Random search samples combinations at random from the specified ranges and finds good values faster per compute budget, especially when only a few hyperparameters actually matter. Bayesian optimisation fits a surrogate model of the objective and uses it to propose the next trial, giving the best sample efficiency when each evaluation is expensive.
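The first two strategies can be compared on equal budgets with a sketch like the one below, which uses only the standard library. The objective toy_loss, its optimum, and the search ranges are all hypothetical stand-ins for a real validation loss. Bayesian optimisation is left out because it needs a surrogate-model library.

```python
import itertools
import random

def toy_loss(lr, reg):
    """Hypothetical validation loss, minimised at lr=0.1, reg=0.01."""
    return (lr - 0.1) ** 2 + (reg - 0.01) ** 2

def grid_search(lrs, regs):
    """Evaluate every combination in the grid; return (loss, lr, reg)."""
    return min((toy_loss(lr, reg), lr, reg)
               for lr, reg in itertools.product(lrs, regs))

def random_search(n_trials, seed=0):
    """Evaluate n_trials points drawn uniformly from [0, 1] x [0, 0.1]."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    trials = ((rng.uniform(0.0, 1.0), rng.uniform(0.0, 0.1))
              for _ in range(n_trials))
    return min((toy_loss(lr, reg), lr, reg) for lr, reg in trials)

# Both calls spend the same budget of 12 evaluations.
print(grid_search([0.001, 0.01, 0.1, 1.0], [0.0, 0.01, 0.1]))
print(random_search(n_trials=12))
```

The grid wins here only because the optimum happens to sit on a grid point. Shifting the optimum off the grid is a quick way to see why random search is usually the safer default.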

Greedy Algorithms — DSA

What you'll learn

What makes a strategy greedy: one irrevocable, locally optimal choice per step

When greedy is provably correct — interval scheduling and canonical coin systems

How to catch a wrong greedy with a small counterexample (coins 1, 3, 4; target 6)

Why dynamic programming is the fallback, and how to tell the difference

A greedy algorithm makes the locally best choice at each step and never looks back. No reconsidering, no backtracking, no storing of what-ifs — you take the best-looking option in front of you, move on, and repeat.

That simplicity is what makes greedy algorithms fast, and exactly what makes them risky to trust without proof. The whole skeleton is: define “best available option right now”, take it, shrink the problem, repeat. The hard part is the first step: pick the wrong local rule and you get an answer that looks fine but is not optimal.

When greedy is provably right: interval scheduling

You have activities, each with a start and finish time, and one room. You want to fit in as many non-overlapping activities as possible. The winning rule: always take the activity that finishes earliest among those that still fit.

For the six activities (0,2), (3,5), (6,8), (9,11), (1,7) and (4,10), taking the earliest finish each time fits four activities. The two long decoys, (1,7) and (4,10), finish late and would block more than they add.

Why is earliest-finish the right rule, and not, say, shortest-activity? Because finishing as early as possible leaves the most room for whatever comes next. A formal exchange argument makes it airtight: any optimal schedule can be reshaped, step by step, into the earliest-finish schedule without ever losing an activity.
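One way to see that shortest-activity is not a safe rule is a small counterexample. The sketch below uses a hypothetical helper, greedy_schedule, that applies whatever ordering rule it is given, and three made-up activities in which the shortest one straddles the other two.

```python
def greedy_schedule(activities, key):
    """Scan activities in the order given by `key`, keeping each one
    that does not overlap anything already chosen."""
    chosen = []
    for start, finish in sorted(activities, key=key):
        if all(finish <= s or start >= f for s, f in chosen):
            chosen.append((start, finish))
    return sorted(chosen)

activities = [(0, 5), (4, 7), (6, 11)]  # made-up counterexample

by_finish = greedy_schedule(activities, key=lambda a: a[1])         # earliest finish
by_length = greedy_schedule(activities, key=lambda a: a[1] - a[0])  # shortest first
print(by_finish)  # [(0, 5), (6, 11)]
print(by_length)  # [(4, 7)]
```

Shortest-first grabs (4,7) and blocks both neighbours, while earliest-finish fits two activities.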

def activity_selection(activities):
    """Return a maximum-size list of non-overlapping (start, finish) activities."""
    selected, last_finish = [], float("-inf")   # -inf so negative start times also work
    for start, finish in sorted(activities, key=lambda a: a[1]):  # by finish time
        if start >= last_finish:                # compatible with the last pick
            selected.append((start, finish))
            last_finish = finish
    return selected

print(activity_selection([(0,2), (3,5), (6,8), (9,11), (1,7), (4,10)]))

[(0, 2), (3, 5), (6, 8), (9, 11)]

The two long activities (1,7) and (4,10) are skipped — each would occupy time that two shorter ones could share. The same provably-correct greedy shape powers Huffman coding (merge the two rarest symbols each step) and making change in a canonical coin system like US coins (always take the largest coin that fits).
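The Huffman merge step can be sketched with a min-heap, as below. This version only tracks how many bits each symbol ends up with, not the actual bit patterns, and the function name and input string are illustrative.

```python
import heapq
from collections import Counter

def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman code of `text`."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:                       # a lone symbol still needs 1 bit
        return {sym: 1 for sym in freq}
    # Heap entries: (total frequency, unique tie-breaker, {symbol: depth}).
    heap = [(f, i, {sym: 0}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)      # the two rarest subtrees
        f2, _, d2 = heapq.heappop(heap)
        # Merging pushes every symbol in both subtrees one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]

# 'a' (4 occurrences) gets 1 bit; 'b' (2) and 'c' (1) get 2 bits each.
print(huffman_code_lengths("aaaabbc"))
```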

When greedy quietly fails

Take coins {1, 3, 4} and a target of 6. Largest-coin-first greedy grabs 4, leaving 2, which forces two 1s — three coins. But 3 + 3 is two coins. Greedy committed to the 4 and could never reconsider, so it walked past the better answer. The US coin set escapes this only because it is canonical — mathematically verified so the greedy rule holds at every target. An arbitrary coin set has no such guarantee, and the only safe checks are a proof or exhaustive testing.
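The failure is easy to reproduce. The sketch below pairs a largest-coin-first greedy with a dynamic-programming reference and scans targets for the first disagreement. All three function names are illustrative, and the scan is bounded by a caller-chosen limit, so a None result only says that greedy held up to that limit.

```python
def greedy_change(coins, target):
    """Largest-coin-first; returns the coins used, or None if it gets stuck."""
    used = []
    for coin in sorted(coins, reverse=True):
        while target >= coin:
            target -= coin
            used.append(coin)
    return used if target == 0 else None

def min_coins(coins, target):
    """DP reference: fewest coins summing to target, or None if impossible."""
    INF = float("inf")
    best = [0] + [INF] * target          # best[a] = fewest coins for amount a
    for amount in range(1, target + 1):
        for coin in coins:
            if coin <= amount and best[amount - coin] + 1 < best[amount]:
                best[amount] = best[amount - coin] + 1
    return best[target] if best[target] != INF else None

def first_greedy_failure(coins, limit):
    """Smallest target <= limit where greedy uses more coins than the optimum."""
    for target in range(1, limit + 1):
        greedy, optimal = greedy_change(coins, target), min_coins(coins, target)
        if optimal is not None and (greedy is None or len(greedy) > optimal):
            return target
    return None

print(greedy_change([1, 3, 4], 6), min_coins([1, 3, 4], 6))  # [4, 1, 1] 2
print(first_greedy_failure([1, 3, 4], 20))                   # 6
print(first_greedy_failure([1, 5, 10, 25], 200))             # None
```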

	Greedy	Dynamic programming
Choices	One irrevocable pick per step	Explores all sub-choices
Reconsiders?	Never	Always
Speed	Usually O(n log n)	Usually O(n²) or O(n·k)
Correct when	The greedy-choice property and optimal substructure hold	Optimal substructure holds (overlapping subproblems make it efficient)

Greedy approximations are everywhere in ML precisely because the exact answer is too expensive: decision trees pick the single best split at each node, forward feature selection adds the most-helpful feature each step, and beam search keeps only the top-k candidates while decoding. Each trades provable optimality for speed — knowingly.

Practice

Quick check

Q1. Coins {1, 5, 10, 25}, make 41 cents. Greedy gives 25+10+5+1 = 4 coins. Is that optimal?

Q2. Coins {1, 3, 4}, target 6. Greedy returns 3 coins. Why is that not optimal?

Q3. In activity selection, why is 'earliest finish' the right rule rather than 'shortest activity'?
