datarekha

Backtracking

How to search a combinatorial space by building solutions step-by-step, pruning dead branches early — the engine behind N-Queens, Sudoku, and constraint satisfaction.

7 min read Advanced Data Structures & Algorithms Lesson 28 of 32

What you'll learn

  • What backtracking is: DFS over a decision tree with undo on failure
  • How the choose-explore-unchoose template applies to every backtracking problem
  • Why pruning is the difference between tractable and exponential blowup
  • How constraint satisfaction and combinatorial search appear in data and ML workflows

Before you start

Backtracking is the idea that you can build a solution one piece at a time, and whenever a partial solution cannot possibly lead to a valid complete solution, you stop and try something else.

It is depth-first search over a decision tree — a tree where each level represents one choice to make and each branch represents one option — where each node represents a partial state, each edge represents one choice, and the leaves are either valid complete solutions or dead ends. The “backtrack” step is simply undoing the last choice and moving to the next sibling edge.

The decision tree picture

Suppose you want all subsets of {A, B, C}. At each position you make a binary choice — include the element or skip it.

                       []
              /                  \
          [A]                    []
        /     \                /     \
    [A,B]    [A]           [B]       []
    /  \     /  \          / \       / \
[A,B,C][A,B][A,C][A]   [B,C][B]   [C] []

Traversing this tree depth-first and collecting the leaves gives all 8 subsets. For permutations the branching factor is larger, but the structure is identical.

The key observation: if you can detect that a partial path is already invalid — a queen is already under attack, a Sudoku row already has a repeat — you prune the entire subtree rooted at that node and never recurse into it.

The universal template

Every backtracking algorithm follows the same three-line loop:

def backtrack(state, choices):
    if is_complete(state):
        record(state)
        return
    for choice in choices:
        if is_valid(state, choice):      # prune before recursing
            make_choice(state, choice)   # choose
            backtrack(state, next_choices(state, choice))  # explore
            undo_choice(state, choice)   # un-choose

The undo_choice step is what makes it backtracking rather than plain recursion — you restore state to exactly what it was before, so the next iteration of the loop starts from a clean slate.

Classic examples

Subsets and permutations

Generating all subsets of [1, 2, 3] is the simplest case: the validity check is always true (any partial subset is valid), so there is no pruning — the work is genuinely O(2ⁿ). Permutations are similar but require tracking which elements have already been used; the validity check is “not already in the current path”, which prunes the obviously duplicate branches.

N-Queens

Place n queens on an n×n board so that no two queens share a row, column, or diagonal. The decision tree has n levels (one per row) and n branches per level (one per column). Without pruning you would examine nⁿ configurations. With pruning — reject a column choice the moment it conflicts with any already-placed queen — the search space shrinks dramatically. For n = 8 there are only 92 solutions out of 8⁸ = 16,777,216 possible placements; pruning makes this explorable in milliseconds.

Sudoku

Fill a 9×9 grid so every row, column, and 3×3 box contains each digit exactly once. The validity check is “this digit does not already appear in this row, column, or box”. Because each cell constrains its neighbors, good pruning propagation (trying only legal digits for each empty cell) reduces the effective branching factor from 9 to 1–2 on typical puzzles.

Given a grid of letters and a target word, find whether the word exists as a connected path. At each cell you try all four neighbors; the validity check is “in-bounds, not yet visited, and the next letter matches”. You mark cells visited before recursing (choose) and unmark them after (un-choose).

Run it: all subsets via backtracking

Notice that the number of subsets is exactly 2ⁿ — there is no pruning here because every partial subset is valid. The loop variable start is the structural pruning that avoids generating duplicate subsets in different orders.

N-Queens counter

The N-Queens problem is where pruning matters. Run this to see how many solutions exist for each board size:

For n = 6 there are 4 solutions; for n = 8 there are 92. The three sets (cols, diag1, diag2) give O(1) conflict checks, keeping each placement decision fast.

Complexity and pruning’s impact

Worst-case backtracking is still exponential — you cannot escape that when the solution space itself is exponential. What pruning changes is the effective branching factor: instead of exploring every branch at every level, you skip entire subtrees the moment a constraint is violated.

For N-Queens with n = 12, the naive O(nⁿ) search would examine ~8.9 × 10¹² configurations. With pruning, the actual nodes visited number in the millions — a factor of millions improvement without changing the worst-case class.

This is the practical insight: write the validity check to be as tight as possible, and check it as early as possible — before recursing, not after.

Connection to data science and ML

Backtracking appears in data work in a few concrete ways:

Constraint satisfaction. Assigning ML pipeline stages, scheduling jobs under resource limits, and allocating feature groups to model slots are all constraint satisfaction problems. Backtracking (or its industrial cousin, branch-and-bound) is the standard exact solver.

Combinatorial feature search. Exhaustive feature subset selection is the subset enumeration problem. Practitioners almost never run it to completion, but the pruning intuition — “if this subset already scores worse than the current best, skip all supersets” — is exactly backtracking logic. Branch-and-bound hyperparameter search uses the same idea.

Search in planning and config spaces. When an agent must plan a sequence of actions or explore a configuration space, the decision tree is the state space. Backtracking (with memoization added) becomes dynamic programming; with a cost function it becomes A*.

Quick check

Quick check

0/3
Q1What is the purpose of the 'un-choose' step in the backtracking template?
Q2An N-Queens backtracking solver checks for conflicts before recursing and skips any column that is already attacked. This is an example of:
Q3Generating all subsets of a 20-element set with backtracking produces how many subsets?

Practice this in an interview

All questions
Generate all subsets of a set — the power set — using backtracking.

At each step of a recursive walk through the input, you make a binary choice: include the current element or skip it. Recording the current path at every node of the recursion tree (not just the leaves) collects all 2^n subsets. A `start` index prevents duplicates by ensuring elements are only considered left-to-right.

Find all unique combinations of candidates that sum to a target, where each candidate may be used an unlimited number of times.

Use backtracking with a running total. At each step, try adding a candidate to the current path. If the total equals the target, record the path. If it exceeds the target, prune. Passing the same start index (not i+1) back into the recursion allows unlimited reuse of the same element.

Generate all permutations of a list of distinct integers using backtracking.

At each recursion level, swap one of the remaining (unused) elements into the current position, recurse to fill the rest, then swap back to restore the state. Alternatively, track a 'used' set and build the permutation in a separate path list. The result is all n! orderings.

Implement binary search correctly — and explain the off-by-one traps.

Binary search halves the search space each iteration to find a target in O(log n). The tricky part is not the idea but the boundary conditions: closed vs. half-open intervals, how to update lo/hi, and when to use lo < hi vs. lo <= hi. One clean template eliminates all the classic bugs.

Sign in to track your progress

Completed lessons, your XP, level, and streak save to your account — it's free and takes a few seconds.

Explore further

Related lessons

Skip to content