datarekha

Tries (Prefix Trees)

A tree where the path from root to a node spells a string. Tries give O(L) insert and lookup — independent of vocabulary size — and make prefix queries trivially fast.

7 min read Intermediate Data Structures & Algorithms Lesson 18 of 32

What you'll learn

  • Why a trie stores characters on edges, not nodes — and how shared prefixes share structure
  • O(L) insert and lookup where L is the word length, not the dictionary size
  • How prefix search is O(L) + subtree scan, unlocking fast autocomplete
  • Where tries appear in production — autocomplete, IP routing, spell-check, and tokenizer vocabularies

Before you start

A trie (pronounced “try”, short for retrieval) is a tree where the path from the root spells a string. Each edge is labelled with a character. To store the word "cat", you walk root → cat, marking that final node as an end-of-word. To store "car", you walk root → car. The c → a prefix is shared — those two edges are the same nodes.

That shared-prefix property is the entire point.

The structure

Every node represents a position in the string built so far:

  • The root represents the empty prefix.
  • Each child edge extends the prefix by one character.
  • A node marked as end-of-word means the prefix up to that node is a complete word in the dictionary.

A node can be both end-of-word and have children. The word "do" ends at the o node, but "dog" and "done" extend further through that same node.

Try it

Insert a few words (try care, done, cat) and watch the traversal. Words with a shared prefix — like car and card — reuse the path up to where they diverge. Only the new suffix gets new nodes.

Switch to Prefix search and type ca: every word beginning with those two characters lights up. That is the autocomplete operation, and it costs O(L) to reach the branching point, then O(k) to collect the k completions in the subtree.

Time complexity

Let L be the length of the word.

OperationTimeWhy
InsertO(L)Walk or create one node per character
Lookup (exact)O(L)Walk one node per character, check end-of-word
Prefix searchO(L + k)O(L) to reach end of prefix, O(k) to collect k completions

The critical insight: none of these costs depend on how many words are in the trie. That is because lookup never compares against other words — it simply follows one edge per character, guided entirely by the query string itself. A hash set also gives O(1) average lookup, but it cannot answer prefix queries — it has no structure to exploit. A trie trades some memory for the ability to do prefix queries cheaply.

Compare to a sorted array of strings: exact lookup is O(L log n), and prefix search requires a binary search plus a scan. Prefix search on a trie is simply a subtree traversal.

Building one in Python

The simplest implementation uses a plain dict of dicts. Each node is a dictionary mapping characters to child nodes. An end-of-word sentinel key marks complete words.

The "$" sentinel works because $ is not a valid letter, so it never collides with a child edge. Some implementations use a dedicated is_end boolean on a node class instead — same idea, slightly cleaner.

Memory trade-off

Each node stores up to 26 child pointers (for lowercase English). A sparse trie — many short, similar strings — is efficient. A trie with completely unrelated words pays the overhead of allocating a node per character. In practice:

  • Compressed tries (also called Patricia trees or radix trees — trees where each edge can represent multiple characters) merge chains of single-child nodes into one edge labelled with the whole substring.
  • Array-backed tries pre-allocate a fixed 26-element array per node; faster but wasteful if the alphabet is large.
  • Hash-map tries (as above) are the default starting point — flexible, readable.

For the purposes of DSA interviews and most production autocomplete services, the dict-of-dicts approach is sufficient and is what you should reach for first.

When a trie beats a hash map

Use a trie when:

  1. Prefix queries matter — autocomplete, type-ahead, URL routing.
  2. Shared prefix compression matters — if your dictionary has 50,000 words starting with un, a trie stores that prefix once; a hash map stores it 50,000 times inside each key.
  3. Lexicographic ordering — iterating a trie in DFS order visits words alphabetically for free.

Use a hash map (or set) when you only need exact-match lookup and memory is tight.

Quick check

Quick check

0/3
Q1A trie holds 100,000 words. What is the time complexity to look up a single 8-character word?
Q2Which operation does a trie support more efficiently than a hash set?
Q3You insert 'cat' and then 'car' into an empty trie. How many nodes are created in total (including the root)?

Practice this in an interview

All questions
Group a list of words into anagrams.

Sort each word's characters to get a canonical key, then bucket words by that key using a hash map. This turns an O(n²) brute-force comparison into a clean O(n · k log k) single pass, where k is the max word length.

Generate all subsets of a set — the power set — using backtracking.

At each step of a recursive walk through the input, you make a binary choice: include the current element or skip it. Recording the current path at every node of the recursion tree (not just the leaves) collects all 2^n subsets. A `start` index prevents duplicates by ensuring elements are only considered left-to-right.

Find all unique combinations of candidates that sum to a target, where each candidate may be used an unlimited number of times.

Use backtracking with a running total. At each step, try adding a candidate to the current path. If the total equals the target, record the path. If it exceeds the target, prune. Passing the same start index (not i+1) back into the recursion allows unlimited reuse of the same element.

How does a B-tree index work, and when does the database choose not to use it?

A B-tree index stores key values in a balanced tree of sorted nodes, allowing the engine to reach any value in O(log n) page reads instead of scanning every row. The optimizer skips the index when the estimated cost of random I/O exceeds a full-table scan, when a function wraps the indexed column, or when the query returns such a large fraction of rows that a sequential scan is cheaper.

Sign in to track your progress

Completed lessons, your XP, level, and streak save to your account — it's free and takes a few seconds.

Explore further

Related lessons

Skip to content