What is a covering index and how does it eliminate heap fetches?
A covering index includes every column a query needs — both filter and select columns — so the engine can answer the query entirely from the index pages without touching the main table heap. This removes the costliest part of an index scan: the random I/O for each individual row fetch.
How to think about it
This question tests whether you understand the two-phase nature of an index scan. Most candidates know indexes speed things up. The ones who pass this question can explain where the time actually goes — and how a covering index short-circuits the expensive part.
The two phases of a normal index scan
When a query uses a regular index, the engine does two things:
- Traverse the B-tree to find matching index entries (fast — the index is sorted and narrow).
- Follow each entry’s row pointer back to the heap to fetch the full row (slow — these are random I/O reads, one per matching row, scattered across many disk pages).
For a query returning 10,000 rows from a large table, those 10,000 heap fetches often dominate total query time. A covering index eliminates step 2 entirely.
Without vs. with a covering index
-- Table: events(id, user_id, event_type, ts, payload)
-- Query: count events by type for a user in a date range
SELECT event_type, COUNT(*)
FROM events
WHERE user_id = 101 AND ts BETWEEN '2024-01-01' AND '2024-03-31'
GROUP BY event_type;
-- Non-covering: index on (user_id, ts)
-- Engine finds matching index entries, then fetches each heap row for event_type
CREATE INDEX idx_events_user_ts ON events (user_id, ts);
-- Covering: add event_type as a non-key INCLUDE column
-- PostgreSQL / SQL Server INCLUDE syntax (no sort overhead on interior nodes)
CREATE INDEX idx_events_covering ON events (user_id, ts) INCLUDE (event_type);
-- MySQL equivalent: extend the key
CREATE INDEX idx_events_covering ON events (user_id, ts, event_type);
After adding the covering index, EXPLAIN shows Index Only Scan (PostgreSQL) or Using index (MySQL) — the heap is never touched.
When to use INCLUDE vs. extending the key
| Approach | Use when |
|---|---|
| Extend key with extra column | You also filter or sort on that column |
| INCLUDE (non-key) | Column is only needed in SELECT; adding it to the key wastes sort space and inflates interior node size |
Costs to weigh
- Storage: covering indexes are wider and consume more disk.
- Write amplification: every INSERT, UPDATE, or DELETE must maintain the wider index.
- Stale visibility map (PostgreSQL): Index Only Scans still check the visibility map per page. On a heavily updated table, heap fetches may still occur until
VACUUMruns.
Interview signal
Mention that you verify coverage with EXPLAIN (ANALYZE, BUFFERS) in PostgreSQL: look for Heap Fetches: 0 under Index Only Scan. If Heap Fetches is high despite the covering index, the visibility map needs a VACUUM.