datarekha

A* Search

Add what a step actually cost to what the heuristic guesses is left, then always expand the smallest sum. That's A*.

9 min read Intermediate GATE DA Lesson 100 of 122

What you'll learn

  • A* expands the node with the lowest f(n) = g(n) + h(n) from a priority queue
  • g(n) is cost-so-far from the start; h(n) is the heuristic guess of cost-to-go
  • Admissible h means A* is optimal; consistent h means no node is expanded twice
  • Hand-tracing the expansion order on a small weighted graph, with alphabetical tie-breaks

Before you start

Greedy search rushes toward the goal using only the heuristic’s guess. Uniform-cost search ignores the goal entirely and just minimises the cost so far. Both miss the trick. A* is the small idea that fuses them: at every step, expand the node whose total estimated path cost — what you’ve already paid plus what the heuristic thinks is left — is smallest.

That balance is why A* is the textbook search algorithm. With a sensible heuristic it still finds the cheapest path, and it usually expands far fewer nodes than uniform-cost search to get there. It is also the workhorse behind real routing and pathfinding — the same f = g + h rule plans the shortest drive in a maps app and the path a game character walks across a map.

The evaluation function

f(n)=g(n)+h(n)total estimatecost so far (known)cost to go (heuristic)
f(n) blends what you’ve measured (g) with what you’ve estimated (h).
  • g(n) — actual cost of the cheapest path found so far from start to n.
  • h(n) — heuristic estimate of cost from n to the nearest goal.
  • f(n) — the priority key. Expand the smallest f from the frontier.

Two heuristic properties decide the algorithm’s guarantees:

  • Admissibleh(n) never overestimates the true cost to a goal (h(n) <= h*(n)). With an admissible h, A* is optimal: the first goal it pops is on the cheapest path.
  • Consistent (monotonic) — for every edge n -> n' with cost c, h(n) <= c + h(n'). Consistency implies admissibility and guarantees every node is expanded at most once.

A small graph to expand

Five nodes, a few weighted edges, one start S and one goal G. Each node carries its heuristic value; we will walk the frontier step by step.

1253Sh=4Ah=3Bh=1Gh=0
Edge weights are step costs; h at each node is the heuristic-to-goal.

How GATE asks this

A real NAT/MCQ pattern: a tiny weighted graph with heuristic values printed on each node, and “List the order in which A* expands nodes” or “What is the f-value when node X is expanded?”. Standing convention: alphabetical tie-break when two frontier nodes share the lowest f. This pattern appeared in both GATE DA 2024 and 2025.

Worked example — tracing the expansion order

Using the graph above. Start S, goal G. Initialise the frontier with S.

Step 1. Frontier: {S(g=0, f=0+4=4)}. Expand S. Push neighbours:

  • A via S: g = 0 + 1 = 1, f = 1 + 3 = 4
  • B via S: g = 0 + 2 = 2, f = 2 + 1 = 3

Frontier: {A(f=4), B(f=3)}.

Step 2. Lowest f is B (f = 3). Expand B. Push G via B:

  • G via B: g = 2 + 3 = 5, f = 5 + 0 = 5

Frontier: {A(f=4), G(f=5)}.

Step 3. Lowest f is A (f = 4). Expand A. Push G via A:

  • G via A: g = 1 + 5 = 6, f = 6 + 0 = 6

But G was already on the frontier with g = 5, which is cheaper. The new path through A is worse and gets discarded. Frontier: {G(f=5)}.

Step 4. Pop G — goal reached on the path S -> B -> G with total cost 5.

So the expansion order is S, B, A, G (or S, B, A if you stop at “non-goal expansions”; many GATE keys count G’s pop as the fourth expansion). The optimal path cost is 5 and matches the f-value of G when it was popped.

Quick check

Quick check

0/6
Q1Using the graph in the lesson (edges S-A=1, S-B=2, A-G=5, B-G=3; h(S)=4, h(A)=3, h(B)=1, h(G)=0), what is the f-value of node B when A* expands it?numerical answer — type a number
Q2Same graph as above. How many node expansions does A* perform before (and including) popping the goal G? Count G's pop as one expansion.numerical answer — type a number
Q3Consider edges S-X=2, S-Y=2, X-G=4, Y-G=4 with h(S)=5, h(X)=4, h(Y)=4, h(G)=0. After expanding S, which node does A* expand next under alphabetical tie-breaking?
Q4Which statements about A* are TRUE? (select all that apply)select all that apply
Q5A heuristic gives h(start) = 7 while the true cheapest path costs 5. Is this heuristic admissible?
Q6On a graph with start S and goal G, A* uses h(n) = (straight-line distance to G). g(S) = 0 and h(S) = 12. What is f(S)?numerical answer — type a number

Practice this in an interview

All questions
Find all unique combinations of candidates that sum to a target, where each candidate may be used an unlimited number of times.

Use backtracking with a running total. At each step, try adding a candidate to the current path. If the total equals the target, record the path. If it exceeds the target, prune. Passing the same start index (not i+1) back into the recursion allows unlimited reuse of the same element.

When should you use grid search vs random search vs Bayesian optimisation for hyperparameter tuning?

Grid search exhaustively tries every combination in a predefined grid, which is only practical for 1–2 hyperparameters. Random search samples combinations uniformly at random and finds good values faster per compute budget, especially when only a few hyperparameters actually matter. Bayesian optimisation fits a surrogate model of the objective and proposes the next trial intelligently, giving the best sample efficiency for expensive evaluations.

Find the contiguous subarray with the largest sum (Kadane's algorithm).

Scan left to right, carrying a running sum. At each element, decide: extend the existing subarray (add to the running sum) or start fresh (take just the current element). Whichever is larger becomes the new running sum. Track the global maximum throughout. One pass, O(n) time, O(1) space.

How does the Spark Catalyst optimizer work, and what does Adaptive Query Execution add?

Catalyst is a rule-based and cost-based query optimizer that transforms a logical plan through four phases — analysis, logical optimization, physical planning, and code generation — before any data is touched. Adaptive Query Execution (AQE), introduced in Spark 3, extends this by re-optimizing the physical plan at runtime using actual shuffle statistics rather than stale estimates.

Sign in to track your progress

Completed lessons, your XP, level, and streak save to your account — it's free and takes a few seconds.

Explore further

Related lessons

Skip to content