Given a query that filters on both a raw column and an aggregate result, how do you structure it for correctness and performance?
Raw-column filters belong in WHERE so the engine scans fewer rows before grouping. Aggregate filters must go in HAVING. Applying a filter in HAVING that could have been in WHERE forces the engine to aggregate more rows than necessary.
How to think about it
The question is testing whether you know SQL’s logical execution order and can use it to write both correct and efficient queries. The core idea: filter as early as possible, only using HAVING for what WHERE literally cannot do.
The execution order mental model
SQL evaluates clauses in this order:
WHERE fires on individual rows before any grouping happens. HAVING fires on groups after aggregation is complete. This means:
- Raw column conditions (
country = 'US',status = 'active') belong inWHERE— they can cut the row count before the expensiveGROUP BYstep. - Aggregate conditions (
COUNT(*) > 10) must go inHAVING— the aggregate doesn’t exist yet atWHEREtime.
Build the query step by step
Suppose you want: for US orders in 2024, find customers with more than 10 orders and total spend above $500.
Start with the raw filters in WHERE:
FROM orders
WHERE country = 'US'
AND order_date >= '2024-01-01'
AND status = 'active'
Then group and apply the aggregate filters in HAVING:
GROUP BY customer_id
HAVING COUNT(*) > 10
AND SUM(amount) > 500
Why this matters for performance
Moving raw-column filters from WHERE into HAVING doesn’t change the result — but it forces the database to aggregate all rows across all countries, all years, all statuses, and then throw most of them away after the fact. On a table with millions of rows that difference can be enormous.