Why is SELECT * considered bad practice in production SQL?
SELECT * fetches every column at query time, so adding or reordering columns in the table silently breaks downstream code, wastes I/O on columns you never use, and prevents the optimizer from doing index-only scans. Always name the columns you need.
How to think about it
The interviewer is checking that you can explain the concrete consequences, not just recite “it’s bad practice.” There are three distinct reasons, and each has a different mechanism.
Reason 1 — schema drift silently breaks downstream code
SELECT * is resolved at query time against whatever columns currently exist. Add a column, drop a column, or reorder columns in the table, and every consumer of that query sees a different shape — without any error or warning.
Code that unpacks results positionally (row[3], df.iloc[:,3]) reads the wrong field. ETL pipelines that cast columns by position load garbage data. This class of bug is especially painful because it often surfaces hours or days after a schema migration, not immediately.
-- orders has 8 columns today
SELECT * FROM orders WHERE status = 'pending';
-- Next month a DBA adds a column between column 3 and 4.
-- Any downstream code using positional access now reads the wrong fields.
-- Explicit columns are immune to this:
SELECT order_id, customer_id, amount, created_at
FROM orders WHERE status = 'pending';
Reason 2 — unnecessary I/O from reading unused columns
The database must read every column from storage — including wide TEXT, JSON, or BLOB columns you never use. On row-oriented storage (PostgreSQL, MySQL), the engine reads the full row page. On columnar storage (BigQuery, Redshift, Snowflake), SELECT * disables columnar pruning entirely and reads every column’s file. Either way, more data moves from disk to memory.
SELECT * FROM events; ← reads all 40 columns, including a 4KB JSON payload
SELECT event_type, user_id FROM events; ← reads only 2 small columns
Reason 3 — blocks index-only (covering-index) scans
This is the performance reason interviewers most want to hear.
If your query touches only a small set of columns and an index covers those columns, the engine can answer the query directly from the index without touching the table at all — an index-only scan.
SELECT * forces the engine to fetch the full row from the heap for every result, even when the index already has all the data you need.
The exception — inside CTE chains
The one defensible use of SELECT * is inside a CTE chain where the outer CTE has already projected exactly what you need:
WITH filtered AS (
SELECT order_id, customer_id, amount -- explicit projection here
FROM orders
WHERE status = 'pending'
)
SELECT * FROM filtered; -- * is fine here — shape is already locked
Even then, naming the columns in the final SELECT makes intent clearer for anyone reading the query later.