What is a scalar subquery, where can it appear in a SQL statement, and what happens if it returns more than one row?
A scalar subquery is a subquery that returns exactly one column and one row, and can appear anywhere a single value is valid — SELECT list, WHERE clause, HAVING clause, or even a JOIN ON condition. If it returns more than one row at runtime the database raises a runtime error, not a compile-time error.
How to think about it
The word “scalar” just means “a single value.” A scalar subquery is a subquery that reduces to exactly one column and one row — so it can slot into any position where you’d write a literal number or string. The interesting parts are where it can appear and what breaks it.
Where scalar subqueries can appear
In the SELECT list — attach a derived value from another table to each row:
SELECT
e.name,
e.salary,
(SELECT AVG(salary) FROM employees) AS company_avg,
e.salary - (SELECT AVG(salary) FROM employees) AS diff_from_avg
FROM employees e;
In a WHERE clause — filter against a derived threshold:
SELECT product_id, revenue
FROM sales
WHERE revenue = (SELECT MAX(revenue) FROM sales);
As a correlated lookup — one value per outer row, like a single-value lateral join:
SELECT
d.department_name,
(SELECT e.name
FROM employees e
WHERE e.department_id = d.department_id
ORDER BY e.salary DESC
LIMIT 1) AS top_earner
FROM departments d;
Try it — scalar subquery in SELECT vs WHERE
The “more than one row” runtime error
The database cannot know at parse time whether a correlated subquery will return one row or many — it depends on the data. If it returns more than one row at runtime, you get an error like:
ERROR: more than one row returned by a subquery used as an expression
This fires only when a match exists, so queries that work fine in staging can blow up in production as data grows. The LIMIT 1 pattern is the usual fix — but it silently discards ties, which may not be what you want.
Better alternatives for per-row lookups
For the “top earner per department” pattern, a window function is cleaner and lets the optimizer run the scan once:
-- Cleaner: one pass over employees, not one subquery per department
SELECT department_id, name, salary
FROM (
SELECT department_id, name, salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rk
FROM employees
) ranked
WHERE rk = 1;