BFS, DFS, Topological Sort & Shortest Path
BFS spreads level by level, DFS plunges deep, topological sort linearises a DAG, and Dijkstra handles weights. The four traversals GATE keeps asking about.
What you'll learn
- BFS uses a queue and gives shortest paths in an UNWEIGHTED graph; DFS uses a stack and goes deep first
- Topological sort linearises a DAG; it exists only for a DAG and is usually not unique
- Counting the valid topological orderings of a small DAG (a real 2024 question type)
- Shortest path: BFS for unweighted graphs, Dijkstra (greedy, non-negative weights) for weighted ones
Before you start
With the graph stored, the next question is how to visit its vertices. Four ideas cover almost everything GATE asks: BFS and DFS (the two traversals), topological sort (ordering a DAG), and shortest path (BFS when unweighted, Dijkstra when weighted). These are not just exam fodder — topological sort is exactly how Spark and Airflow decide the run order of a task DAG, and shortest-path search is the core of routing, recommendation, and graph-analytics work.
BFS and DFS — the two traversals
- BFS (Breadth-First Search) uses a queue (FIFO). Enqueue the source, then
repeatedly dequeue a vertex and enqueue its unvisited neighbours. It fans out in
rings: every vertex at distance
dis reached before any at distanced + 1. That is exactly why BFS gives the shortest path in an unweighted graph — the first time you reach a vertex, you reached it in the fewest edges. - DFS (Depth-First Search) uses a stack (or recursion). It commits to one branch, plunging as deep as possible before backtracking to the last fork. DFS is the engine behind cycle detection and topological sort.
Both visit every reachable vertex exactly once (a visited set prevents loops); they differ only in the order. Step through both side by side here:
Topological sort — only for a DAG
A topological sort of a directed acyclic graph is a linear ordering of its
vertices such that every edge u → v points forward (u appears before v). It is
the natural answer to “in what order can I run these tasks so each prerequisite comes
first?”
Two facts GATE leans on:
- A topological order exists if and only if the graph is a DAG. A directed cycle makes it impossible — the vertices on the cycle would each have to come before the other.
- It is usually not unique. Whenever two vertices have no path forcing an order between them, either can come first, and each independent choice multiplies the count of valid orderings.
Shortest path — BFS vs Dijkstra
When edges are unweighted, BFS already solves shortest paths — its layer numbers are the minimum edge counts. Add weights (a toll road costs more than a side street) and BFS breaks, because four cheap edges can beat one expensive edge.
Dijkstra’s algorithm fixes this with one greedy rule: always expand the cheapest-so-far unvisited vertex next, and relax its neighbours (if going through this vertex beats a neighbour’s recorded distance, lower it). It is correct only when all edge weights are non-negative. Step through a relaxation cascade here:
How GATE asks this
Expect a NAT for “the BFS distance from S to vertex X” or “the number of valid
topological orderings of this DAG”, and an MCQ/MSQ on properties — which of several
sequences are valid topological orderings (the 2024 paper’s Q.51 was exactly this
MSQ; its DAG admits 12 orderings in all), when a topological sort exists, BFS vs DFS
data structures, or when you must switch from BFS to Dijkstra. Read carefully for the
word weighted: it decides BFS vs Dijkstra.
Worked example
Topological order of a DAG. Take edges
A → B, A → C, B → D, C → D. Vertex A has no incoming edge, so it goes first;
D has every other vertex before it, so it goes last. B and C are independent of
each other, so either can come second. Two valid orders:
A, B, C, D and A, C, B, D
So this DAG has 2 valid topological orderings — a direct illustration that the order exists (it is a DAG) but is not unique.
BFS distances on the 5-vertex graph above. Run BFS from S on the undirected,
unweighted graph drawn earlier (edges S-A, S-B, A-B, A-C, B-D):
queue: [S] visit S, dist[S] = 0
dequeue S → enqueue A, B dist[A] = 1, dist[B] = 1
dequeue A → enqueue C dist[C] = 2 (B already seen)
dequeue B → enqueue D dist[D] = 2
So dist = {S:0, A:1, B:1, C:2, D:2}. The farthest vertices are C and D at
distance 2 — the minimum number of edges to reach each from S.
Quick check
Quick check
Practice this in an interview
All questionsUse a queue (deque) to process nodes layer by layer. At each step, snapshot the current queue length to know exactly how many nodes belong to the current level, drain those, then enqueue their children. The result is a list of lists without any depth-tracking variable.
Scan every cell. When you find a '1' that hasn't been visited, increment the island count and immediately flood-fill all connected land cells (DFS or BFS) so they won't be counted again. The total number of floods equals the number of islands.
Airflow models pipelines as Directed Acyclic Graphs (DAGs) of tasks, each with defined dependencies. The scheduler triggers DAG runs based on a cron schedule, passing each run a logical execution date rather than the wall-clock time. A backfill re-runs a DAG over a historical date range, allowing you to populate data for past periods after adding a new pipeline or fixing a bug — as long as tasks are idempotent.
Hierarchical clustering builds a tree of nested merges or splits and does not require specifying k upfront, but it is O(n² log n) and cannot revise early decisions. DBSCAN finds arbitrarily shaped clusters by density reachability, naturally marks outliers as noise, and also needs no k — but its results are sensitive to the eps and min_samples hyperparameters.