Shortest Paths (Dijkstra)
BFS finds the path with fewest hops; Dijkstra finds the path with least total cost. Learn relaxation, the min-heap frontier, and why non-negative weights are required.
What you'll learn
- Why BFS is not enough when edges have costs, and how a priority queue fixes it
- How edge relaxation works and why it is the engine behind Dijkstra
- The O((V+E) log V) complexity and when a binary heap is the right structure
- Where Dijkstra fits in the shortest-path family: BFS, Bellman-Ford, and A*
Before you start
BFS is perfect for unweighted graphs: every edge costs the same, so the first time you reach a node you have found the shortest path to it. Add edge weights — a toll road costs more than a side street, a slow link adds more latency than a fast one — and BFS breaks. It cannot distinguish “four cheap edges” from “four expensive edges.”
Dijkstra’s algorithm solves this with a single rule: always expand the cheapest-so-far frontier node next. That greedy choice, backed by a priority queue, produces the correct shortest path to every reachable node.
The intuition
Imagine you are navigating a city. Some roads have tolls; others are free. You keep a notepad: for every intersection you have reached, you write down the cheapest known route to get there. Each time you have to move, you leave from the intersection with the smallest total cost on your notepad — not the geographically closest one, the cheapest-to-reach one.
When you leave a node you do two things:
- Lock in its distance — because every edge weight is non-negative, no future detour through other nodes can produce a cheaper route to this intersection; any path that continues from here can only stay the same or get more expensive.
- Relax its neighbors — for each neighbor, check whether going through this node beats the neighbor’s current notepad entry. If so, update it.
That process — “check a neighbor; update if cheaper” — is called relaxation (the tentative distance relaxes downward toward the true shortest distance). It is the heart of the algorithm.
Step through it live
Pick any source node, then step through the algorithm one move at a time. Watch how the min-heap frontier (the sorted list on the right) always hands you the cheapest unvisited node, and how each relaxation may cascade updates through the distance table.
Relaxation in detail
For an edge (u, v) with weight w, the relaxation check is:
if dist[u] + w < dist[v]:
dist[v] = dist[u] + w
prev[v] = u # remember how we got here
Starting from a source with dist[source] = 0 and everything else at infinity, every relaxation shrinks a tentative distance. A node is “finalized” the moment it is popped from the priority queue — at that point no future path can improve its distance.
Why a min-heap?
Dijkstra’s efficiency depends on how fast you can answer “which unvisited node has the smallest distance?” A naive scan over all nodes costs O(V) per step, giving O(V²) total — fine for dense graphs but slow for sparse ones. A binary heap supports both operations in O(log V):
| Operation | Binary heap | Naive array |
|---|---|---|
| Extract minimum | O(log V) | O(V) |
| Decrease key | O(log V) | O(1) |
| Total | O((V+E) log V) | O(V²) |
In practice, Python’s heapq is a min-heap. The standard trick is to push (tentative_dist, node) tuples — the heap orders by the first element automatically.
Runnable implementation
Run it: the distances should match what you observed in the widget above.
The negative-weight problem
Complexity summary
| What | Cost |
|---|---|
| With a binary heap | O((V+E) log V) |
| With a Fibonacci heap | O(V log V + E) — better for dense graphs, rarely worth the code |
| With a plain array | O(V²) — fine when E is close to V² |
For most practical uses — road networks, network routing, sparse graphs — the binary-heap version is the right default.
The shortest-path family
Dijkstra is not the only option. Knowing where it sits helps you pick the right tool:
| Algorithm | Handles negative weights | Handles negative cycles | Complexity |
|---|---|---|---|
| BFS | No (unweighted only) | — | O(V+E) |
| Dijkstra | No | — | O((V+E) log V) |
| Bellman-Ford | Yes | Detects them | O(V*E) |
| A* | No | — | O((V+E) log V), often faster in practice with a good heuristic |
A* is Dijkstra with a heuristic function h(n) that estimates the remaining distance to the target. If the heuristic is admissible (never overestimates), A* is guaranteed to find the shortest path while typically exploring far fewer nodes than plain Dijkstra.
Where this shows up in the real world
Shortest paths are not just about maps:
- IP routing — OSPF and IS-IS protocols use Dijkstra to compute routing tables. Every router knows the cheapest path to every other router in the area.
- Network latency optimization — content delivery networks route requests through the lowest-latency hop sequence.
- Map directions — your navigation app runs a variant of Dijkstra (or A*) on a graph of road segments with travel-time weights.
- Recommendation graphs — “distance” in a user-item graph (inverse of similarity) lets you find the most relevant items to recommend.
- Compiler dependency resolution — build systems find the shortest chain of dependencies to rebuild when a file changes.
Anywhere you have a weighted graph and a question of the form “what is the cheapest path from X to Y,” Dijkstra is the starting point.
Quick check
Quick check
Practice this in an interview
All questionsScan every cell. When you find a '1' that hasn't been visited, increment the island count and immediately flood-fill all connected land cells (DFS or BFS) so they won't be counted again. The total number of floods equals the number of islands.
Use a queue (deque) to process nodes layer by layer. At each step, snapshot the current queue length to know exactly how many nodes belong to the current level, drain those, then enqueue their children. The result is a list of lists without any depth-tracking variable.
Maintain a second 'min stack' in parallel: every push also records the current minimum at that moment. When you pop the main stack, pop the min stack too. The top of the min stack is always the current minimum — no scanning needed.