datarekha

Dynamic Programming

How to turn exponential recursion into polynomial-time solutions by solving each subproblem exactly once — the technique behind spell-check, sequence alignment, and optimal planning.

10 min read Advanced Data Structures & Algorithms Lesson 27 of 32

What you'll learn

  • What overlapping subproblems and optimal substructure mean — and why both are required
  • Memoization (top-down) vs tabulation (bottom-up): when to reach for each
  • How edit distance works: filling a 2D table to find the cheapest sequence of edits
  • Where DP shows up in production: spell-check, record dedup, and sequence alignment

Before you start

Dynamic programming is a name that intimidates, but the idea underneath is clean: if you are going to solve the same subproblem more than once, solve it once and remember the answer.

Two structural properties must both be true before DP applies.

Overlapping subproblems — the recursive breakdown revisits identical smaller inputs more than once. Fibonacci is the textbook case — computing fib(10) calls fib(9) and fib(8), which both call fib(7), and so on. The same work is done an exponential number of times without memoization.

Optimal substructure — the best solution to the whole problem can be composed from best solutions to its parts (knowing the optimal answer to each subproblem is enough to assemble the global optimum). Shortest-path algorithms rely on this: the shortest path from A to C through B is the shortest A-to-B path plus the shortest B-to-C path.

When both hold, you get a choice of two equivalent styles.

Memoization — top-down

Memoization wraps the recursive solution with a cache. You write the algorithm the natural way (recurse), but before doing real work you check whether you have already computed this input.

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(50))   # instant; without caching this takes ~2^50 calls

lru_cache stores the result the first time fib(k) is called and returns it instantly on every subsequent call. The call tree goes from O(2ⁿ) to O(n) time and O(n) space.

Tabulation — bottom-up

Tabulation inverts the direction: instead of recursing down and caching on the way back up, you build a table from the smallest subproblems upward.

def fib_tab(n):
    if n <= 1:
        return n
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]

Same O(n) time, same O(n) space — but no recursion stack, no function-call overhead, and with a bit more thought you can often reduce space to O(1) by keeping only the last two values.

A pattern: the 0/1 knapsack

Before moving to the main worked example, notice how the pattern generalises. The 0/1 knapsack problem asks: given items with weights and values, what is the maximum value you can fit in a bag of capacity W? Each item is either taken (1) or not (0).

The recurrence is:

dp[i][w] = max(dp[i-1][w],             # skip item i
               dp[i-1][w - weight[i]] + value[i])  # take item i

A 2D table, filled bottom-up. Every DP problem has a recurrence like this — find it, and the rest is bookkeeping.

Edit distance — the full worked example

Edit distance (Levenshtein distance) is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another.

The recurrence for dp[i][j] — cost to convert a[0..j-1] into b[0..i-1]:

if a[j-1] == b[i-1]:
    dp[i][j] = dp[i-1][j-1]           # characters match — free
else:
    dp[i][j] = 1 + min(
        dp[i-1][j-1],   # substitute
        dp[i-1][j],     # delete  (from a)
        dp[i][j-1],     # insert  (into a)
    )

The base cases fill the first row and column: converting a prefix of a to an empty string costs j deletions; converting empty to a prefix of b costs i insertions.

Step through the table above. Each active cell consults exactly three neighbors (diagonal, above, left), picks the minimum, and records where it came from. Once the table is full, backtracing those pointers from the bottom-right corner gives you the actual sequence of edits.

Run it yourself

The playground below runs memoized Fibonacci and a compact edit-distance implementation inside your browser (Pyodide). Change the words or the Fibonacci target and re-run.

Quick check

Quick check

0/3
Q1A problem has overlapping subproblems but NOT optimal substructure. Can dynamic programming give you the globally optimal answer?
Q2Naive recursive Fibonacci is O(2^n). After adding memoization, what is the time complexity?
Q3In the edit-distance recurrence, dp[i][j] = 1 + min(dp[i-1][j-1], dp[i-1][j], dp[i][j-1]) when characters differ. Which operation does dp[i-1][j] represent?

Practice this in an interview

All questions
Implement Fibonacci with memoization in Python. What problem does memoization solve, and what is the time complexity before and after?

Naive recursive Fibonacci is O(2^n) because it recomputes the same subproblems exponentially. Memoization caches results of subproblems, reducing time to O(n) with O(n) space. Python's functools.lru_cache makes this a one-line decorator.

Find all unique combinations of candidates that sum to a target, where each candidate may be used an unlimited number of times.

Use backtracking with a running total. At each step, try adding a candidate to the current path. If the total equals the target, record the path. If it exceeds the target, prune. Passing the same start index (not i+1) back into the recursion allows unlimited reuse of the same element.

Find the length of the longest common subsequence (LCS) of two strings.

Build a 2-D DP table where dp[i][j] is the LCS length of the first i characters of text1 and first j characters of text2. If the characters match, dp[i][j] = dp[i-1][j-1] + 1; otherwise, dp[i][j] = max(dp[i-1][j], dp[i][j-1]). The answer is dp[m][n].

Generate all subsets of a set — the power set — using backtracking.

At each step of a recursive walk through the input, you make a binary choice: include the current element or skip it. Recording the current path at every node of the recursion tree (not just the leaves) collects all 2^n subsets. A `start` index prevents duplicates by ensuring elements are only considered left-to-right.

Sign in to track your progress

Completed lessons, your XP, level, and streak save to your account — it's free and takes a few seconds.

Explore further

Related lessons

Skip to content