datarekha
SQL Medium Asked at AmazonAsked at UberAsked at Snowflake

What is a covering index and how does it eliminate heap fetches?

The short answer

A covering index includes every column a query needs — both filter and select columns — so the engine can answer the query entirely from the index pages without touching the main table heap. This removes the costliest part of an index scan: the random I/O for each individual row fetch.

How to think about it

This question tests whether you understand the two-phase nature of an index scan. Most candidates know indexes speed things up. The ones who pass this question can explain where the time actually goes — and how a covering index short-circuits the expensive part.

The two phases of a normal index scan

When a query uses a regular index, the engine does two things:

  1. Traverse the B-tree to find matching index entries (fast — the index is sorted and narrow).
  2. Follow each entry’s row pointer back to the heap to fetch the full row (slow — these are random I/O reads, one per matching row, scattered across many disk pages).

For a query returning 10,000 rows from a large table, those 10,000 heap fetches often dominate total query time. A covering index eliminates step 2 entirely.

Without vs. with a covering index

-- Table: events(id, user_id, event_type, ts, payload)
-- Query: count events by type for a user in a date range
SELECT event_type, COUNT(*)
FROM events
WHERE user_id = 101 AND ts BETWEEN '2024-01-01' AND '2024-03-31'
GROUP BY event_type;
-- Non-covering: index on (user_id, ts)
-- Engine finds matching index entries, then fetches each heap row for event_type
CREATE INDEX idx_events_user_ts ON events (user_id, ts);

-- Covering: add event_type as a non-key INCLUDE column
-- PostgreSQL / SQL Server INCLUDE syntax (no sort overhead on interior nodes)
CREATE INDEX idx_events_covering ON events (user_id, ts) INCLUDE (event_type);

-- MySQL equivalent: extend the key
CREATE INDEX idx_events_covering ON events (user_id, ts, event_type);

After adding the covering index, EXPLAIN shows Index Only Scan (PostgreSQL) or Using index (MySQL) — the heap is never touched.

When to use INCLUDE vs. extending the key

ApproachUse when
Extend key with extra columnYou also filter or sort on that column
INCLUDE (non-key)Column is only needed in SELECT; adding it to the key wastes sort space and inflates interior node size

Costs to weigh

  • Storage: covering indexes are wider and consume more disk.
  • Write amplification: every INSERT, UPDATE, or DELETE must maintain the wider index.
  • Stale visibility map (PostgreSQL): Index Only Scans still check the visibility map per page. On a heavily updated table, heap fetches may still occur until VACUUM runs.

Interview signal

Mention that you verify coverage with EXPLAIN (ANALYZE, BUFFERS) in PostgreSQL: look for Heap Fetches: 0 under Index Only Scan. If Heap Fetches is high despite the covering index, the visibility map needs a VACUUM.

Keep practising

All SQL questions

Explore further

Skip to content