Tries (Prefix Trees)
A tree where the path from root to a node spells a string. Tries give O(L) insert and lookup — independent of vocabulary size — and make prefix queries trivially fast.
What you'll learn
- Why a trie stores characters on edges, not nodes — and how shared prefixes share structure
- O(L) insert and lookup where L is the word length, not the dictionary size
- How prefix search is O(L) + subtree scan, unlocking fast autocomplete
- Where tries appear in production — autocomplete, IP routing, spell-check, and tokenizer vocabularies
Before you start
A trie (pronounced “try”, short for retrieval) is a tree where the path from the root spells a string. Each edge is labelled with a character. To store the word "cat", you walk root → c → a → t, marking that final node as an end-of-word. To store "car", you walk root → c → a → r. The c → a prefix is shared — those two edges are the same nodes.
That shared-prefix property is the entire point.
The structure
Every node represents a position in the string built so far:
- The root represents the empty prefix.
- Each child edge extends the prefix by one character.
- A node marked as end-of-word means the prefix up to that node is a complete word in the dictionary.
A node can be both end-of-word and have children. The word "do" ends at the o node, but "dog" and "done" extend further through that same node.
Try it
Insert a few words (try care, done, cat) and watch the traversal. Words with a shared prefix — like car and card — reuse the path up to where they diverge. Only the new suffix gets new nodes.
Switch to Prefix search and type ca: every word beginning with those two characters lights up. That is the autocomplete operation, and it costs O(L) to reach the branching point, then O(k) to collect the k completions in the subtree.
Time complexity
Let L be the length of the word.
| Operation | Time | Why |
|---|---|---|
| Insert | O(L) | Walk or create one node per character |
| Lookup (exact) | O(L) | Walk one node per character, check end-of-word |
| Prefix search | O(L + k) | O(L) to reach end of prefix, O(k) to collect k completions |
The critical insight: none of these costs depend on how many words are in the trie. That is because lookup never compares against other words — it simply follows one edge per character, guided entirely by the query string itself. A hash set also gives O(1) average lookup, but it cannot answer prefix queries — it has no structure to exploit. A trie trades some memory for the ability to do prefix queries cheaply.
Compare to a sorted array of strings: exact lookup is O(L log n), and prefix search requires a binary search plus a scan. Prefix search on a trie is simply a subtree traversal.
Building one in Python
The simplest implementation uses a plain dict of dicts. Each node is a dictionary mapping characters to child nodes. An end-of-word sentinel key marks complete words.
The "$" sentinel works because $ is not a valid letter, so it never collides with a child edge. Some implementations use a dedicated is_end boolean on a node class instead — same idea, slightly cleaner.
Memory trade-off
Each node stores up to 26 child pointers (for lowercase English). A sparse trie — many short, similar strings — is efficient. A trie with completely unrelated words pays the overhead of allocating a node per character. In practice:
- Compressed tries (also called Patricia trees or radix trees — trees where each edge can represent multiple characters) merge chains of single-child nodes into one edge labelled with the whole substring.
- Array-backed tries pre-allocate a fixed 26-element array per node; faster but wasteful if the alphabet is large.
- Hash-map tries (as above) are the default starting point — flexible, readable.
For the purposes of DSA interviews and most production autocomplete services, the dict-of-dicts approach is sufficient and is what you should reach for first.
When a trie beats a hash map
Use a trie when:
- Prefix queries matter — autocomplete, type-ahead, URL routing.
- Shared prefix compression matters — if your dictionary has 50,000 words starting with
un, a trie stores that prefix once; a hash map stores it 50,000 times inside each key. - Lexicographic ordering — iterating a trie in DFS order visits words alphabetically for free.
Use a hash map (or set) when you only need exact-match lookup and memory is tight.
Quick check
Quick check
Practice this in an interview
All questionsSort each word's characters to get a canonical key, then bucket words by that key using a hash map. This turns an O(n²) brute-force comparison into a clean O(n · k log k) single pass, where k is the max word length.
At each step of a recursive walk through the input, you make a binary choice: include the current element or skip it. Recording the current path at every node of the recursion tree (not just the leaves) collects all 2^n subsets. A `start` index prevents duplicates by ensuring elements are only considered left-to-right.
Use backtracking with a running total. At each step, try adding a candidate to the current path. If the total equals the target, record the path. If it exceeds the target, prune. Passing the same start index (not i+1) back into the recursion allows unlimited reuse of the same element.
A B-tree index stores key values in a balanced tree of sorted nodes, allowing the engine to reach any value in O(log n) page reads instead of scanning every row. The optimizer skips the index when the estimated cost of random I/O exceeds a full-table scan, when a function wraps the indexed column, or when the query returns such a large fraction of rows that a sequential scan is cheaper.