How does a B-tree index work, and when does the database choose not to use it?
A B-tree index stores key values in a balanced tree of sorted nodes, allowing the engine to reach any value in O(log n) page reads instead of scanning every row. The optimizer skips the index when the estimated cost of random I/O exceeds a full-table scan, when a function wraps the indexed column, or when the query returns such a large fraction of rows that a sequential scan is cheaper.
How to think about it
The interviewer wants to know whether you understand why indexes are fast for some queries and useless for others. The real answer isn’t “it’s a tree” — it’s about sorted storage, binary search, and when the optimizer decides that random I/O isn’t worth it.
What a B-tree actually stores
A B-tree (Balanced Tree) index is the default index type in PostgreSQL, MySQL, SQL Server, and Oracle. Think of it like a sorted phone book with tabs: each internal node is a tab telling you which range of the alphabet to flip to, and the leaf pages are the actual entries.
Concretely:
- Every non-leaf node stores separator keys and child pointers (the “tabs”).
- Leaf nodes store the actual indexed values plus row pointers (heap TIDs or clustered-key references).
- All leaf nodes are linked in sorted order, so range scans walk them left-to-right without revisiting upper levels.
Lookup path
- Start at the root page (cached in shared buffer pool after first access).
- Binary-search each internal node to follow the correct child pointer.
- Reach the leaf page; follow the row pointer to the heap page.
- For a range
WHERE id BETWEEN 100 AND 200, scan contiguous leaf pages left to right.
A typical B-tree on a million-row table is only 3–4 levels deep — that’s 3–4 page reads versus a million for a full scan.
When the optimizer ignores the index
This is the follow-up interviewers always ask. Three main reasons:
1. Non-sargable predicate — function on the indexed column
-- Non-sargable: the index stores raw `created_at` values, not YEAR() values
SELECT * FROM orders WHERE YEAR(created_at) = 2024;
-- Sargable rewrite: range on the raw column — index works perfectly
SELECT * FROM orders WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
2. Low selectivity — too many rows would match
If a status column has only two values (active / inactive) and 80 % of rows are active, scanning the index and then jumping to 800 000 random heap pages costs more than one sequential scan pass. The optimizer does the math and skips the index.
3. Missing leading column in a composite index
An index on (region, year) can’t answer a lookup on year alone — the left-most column must be present.
Key insight: write-path cost
Every index you add is paid for on every INSERT, UPDATE, and DELETE — the engine must update each index tree. On a high-write table, four indexes can cut write throughput in half. Fewer indexes is not laziness; it is a deliberate trade-off.