What does CTE materialization mean, when does it help versus hurt, and how do you control it in PostgreSQL?
Materialization means the database executes the CTE once, stores its result in a temporary structure, and re-reads that structure for each reference. It helps when the CTE is expensive and referenced many times, but hurts when the optimizer needs to push predicates through the CTE — which it cannot do across a materialization boundary.
How to think about it
This is a Hard question because most candidates know what a CTE is but don’t know that the optimizer can treat it two completely different ways — and that the wrong choice silently prevents index use. The interviewer wants to see that you understand what happens under the hood, not just the syntax.
Inlining vs. materialization
There are two ways the optimizer can handle a CTE:
- Inlined CTE — the optimizer treats the CTE as a macro: it substitutes the definition wherever the CTE is referenced. Predicate pushdown, join reordering, and index selection work as if the CTE weren’t there.
- Materialized CTE — the engine evaluates the CTE once, spools the rows into a working table (often in memory), and serves subsequent references from that buffer. Predicates from the outer query cannot be pushed inside the CTE boundary.
PostgreSQL history matters
Before PostgreSQL 12, all CTEs were always materialized — they acted as optimization fences. This was a well-known footgun: wrapping a query in a CTE could silently disable index use. From PostgreSQL 12 onward, non-recursive CTEs are inlined by default, and you have explicit keyword control.
Explicit control
-- Force materialization: useful when the CTE is expensive
-- and referenced several times; avoids re-computation
WITH expensive_agg AS MATERIALIZED (
SELECT region, SUM(amount) AS total
FROM orders
GROUP BY region
)
SELECT a.region, a.total, b.total AS last_year_total
FROM expensive_agg a
JOIN expensive_agg b ON a.region = b.region;
-- Reads from the buffer for the second reference, not re-computed
-- Force inline: useful when the outer query is highly selective
-- and you want the optimizer to push filters inside
WITH recent AS NOT MATERIALIZED (
SELECT * FROM events WHERE event_type = 'click'
)
SELECT * FROM recent WHERE user_id = 42;
-- With NOT MATERIALIZED, the planner can push user_id = 42 inside
-- and use an index on (user_id, event_type)
Decision guide
| Scenario | Use |
|---|---|
| CTE referenced once, outer query is selective | NOT MATERIALIZED (or rely on default inline) |
| CTE is expensive, referenced 3+ times | MATERIALIZED |
| CTE contains a side effect (data-modifying CTE in PostgreSQL) | Always materialized regardless of keyword |
| Need a stable snapshot within a transaction | MATERIALIZED |
In BigQuery, Snowflake, and SQL Server, CTEs are inlined by default and there is no keyword override. Use a temp table or CREATE TEMP TABLE to force materialization in those engines.