SQL | Medium | Asked at Amazon, Google, and Microsoft

What is the difference between a correlated and an uncorrelated subquery, and when does the distinction matter for performance?

For: Data Analyst, Data Scientist, Data Engineer

The short answer

An uncorrelated subquery is self-contained: the engine can evaluate it once and feed the result into the outer query. A correlated subquery references columns from the outer query, so it is logically re-evaluated for every row the outer query processes. The distinction matters because, when the optimizer cannot rewrite it, a correlated subquery can silently turn an O(n) query into O(n²).

How to think about it

What the interviewer is really checking is whether you understand the execution semantics: not just “correlated references the outer query,” but what that means at runtime. A correlated subquery that runs once per outer row is a loop hidden inside your SQL. If each inner run has to scan the table, n outer rows cost n scans, which is where the O(n²) comes from.

Uncorrelated: runs exactly once

An uncorrelated subquery is self-contained. The engine evaluates it once, caches the single scalar, and reuses it for every outer row:

-- the inner SELECT runs once and returns one number
SELECT employee_id, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

The global average is computed once, then each salary is compared against it.
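To experiment with the queries in this article, a small table is enough. The setup below is a sketch under assumptions, not the article's actual data: the column types are guesses, and the names and salaries that do not appear in the result table later in the article are invented for illustration. The rows were chosen only so that they are consistent with that result.

```sql
-- Hypothetical sample table: 9 employees across 3 departments.
-- Column types are an assumption; the syntax should work in SQLite and PostgreSQL.
CREATE TABLE employees (
    employee_id   INTEGER PRIMARY KEY,
    name          TEXT    NOT NULL,
    department_id INTEGER NOT NULL,
    salary        INTEGER NOT NULL
);

-- Rows for Bea, Chen, Dev, Elif and Ines are invented for illustration.
INSERT INTO employees (employee_id, name, department_id, salary) VALUES
    (1, 'Aarav', 1, 120000),
    (2, 'Bea',   1, 100000),
    (3, 'Chen',  1,  95000),
    (4, 'Dev',   2,  80000),
    (5, 'Elif',  2,  72000),
    (6, 'Farah', 2,  95000),
    (7, 'Guo',   3, 150000),
    (8, 'Hana',  3, 140000),
    (9, 'Ines',  3, 110000);
```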

Correlated: runs once per outer row

A correlated subquery references a column from the outer query, so the engine can’t compute it once up front: the answer depends on the current outer row, and the subquery is logically re-evaluated for each one:

-- the inner SELECT re-runs once per employee row
SELECT e.employee_id, e.name, e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department_id = e.department_id   -- ties back to the outer row
);

This finds employees earning above their own department’s average — correct, but with 500 employees the inner query fires 500 times.
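One way to picture the per-row evaluation is to substitute the outer row's value by hand. For an outer row whose department_id happens to be 3 (a hypothetical value chosen for illustration), the inner query behaves as if it were this standalone statement, and the same substitution is repeated for every other outer row:

```sql
-- What the correlated subquery logically evaluates for ONE outer row,
-- assuming that row has department_id = 3 (hypothetical value).
SELECT AVG(e2.salary)
FROM employees e2
WHERE e2.department_id = 3;   -- outer column replaced by a constant
```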

The rewrite that fixes it

A correlated aggregate subquery like this one can usually be rewritten as a CTE that pre-aggregates once and then joins:

-- dept_avg is evaluated once and yields one average per department (3 rows),
-- instead of one aggregation per employee (9)
WITH dept_avg AS (
    SELECT department_id, AVG(salary) AS avg_sal
    FROM employees
    GROUP BY department_id
)
SELECT e.employee_id, e.name, e.salary, ROUND(d.avg_sal) AS dept_avg
FROM employees e
JOIN dept_avg d ON e.department_id = d.department_id
WHERE e.salary > d.avg_sal
ORDER BY e.department_id, e.salary DESC;

employee_id	name	salary	dept_avg
1	Aarav	120000	105000.0
6	Farah	95000	82333.0
7	Guo	150000	133333.0
8	Hana	140000	133333.0

Same rows as the correlated version (each person who out-earns their department’s average), with the department average added as a column, but the average is computed three times (once per department) instead of nine (once per employee). On a real table that’s the difference between one grouped aggregation pass and hundreds of thousands of per-row aggregations.
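A window function is another common way to express the same pre-aggregation without a join. The sketch below assumes the same employees table and an engine that supports window functions (for example PostgreSQL, MySQL 8+, or SQLite 3.25+). The alias names with_avg and dept_avg_sal are illustrative. The derived table is needed because a window function cannot be referenced directly in WHERE.

```sql
-- Compute each department's average alongside every row, then filter.
SELECT employee_id, name, salary, ROUND(dept_avg_sal) AS dept_avg
FROM (
    SELECT employee_id, name, department_id, salary,
           AVG(salary) OVER (PARTITION BY department_id) AS dept_avg_sal
    FROM employees
) with_avg
WHERE salary > dept_avg_sal
ORDER BY department_id, salary DESC;
```

Whether this is faster than the CTE-and-join form depends on the engine and the data, so it is worth comparing plans rather than assuming.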

When it actually matters

The correlated form isn’t always wrong: sometimes it’s the clearest way to phrase complex logic, and a good optimizer may rewrite it to a join anyway. The tell is in the plan. In PostgreSQL, for example, when EXPLAIN ANALYZE shows a SubPlan node with a large loops=N count on a big table, that’s your cue to rewrite.
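As a rough sketch of how to check, the correlated query from earlier can be prefixed with the plan command. The syntax below is PostgreSQL's. Other engines use different commands and different plan vocabulary, so treat the exact keywords as engine-specific.

```sql
-- PostgreSQL syntax. Note that EXPLAIN ANALYZE actually executes the query
-- in order to report real row counts and loop counts.
EXPLAIN ANALYZE
SELECT e.employee_id, e.name, e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department_id = e.department_id
);
```

If the subquery was not rewritten into a join, the plan typically contains a SubPlan entry whose loop count tracks the number of outer rows.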
