Can you GROUP BY a derived expression or a SELECT alias, and how does this differ across databases?
You can always GROUP BY a derived expression written directly. Whether you can reference a SELECT alias in GROUP BY depends on the database: MySQL and BigQuery allow it, while PostgreSQL and SQL Server do not because aliases are not resolved until after GROUP BY in the logical order.
How to think about it
This question tests whether you understand SQL’s logical execution order. The key fact: GROUP BY is evaluated before SELECT, which means SELECT aliases don’t exist yet when GROUP BY runs — in strict engines.
Grouping by an expression — always works
You can group by any expression as long as you write it out fully in GROUP BY. This works everywhere:
Grouping by a SELECT alias — engine-dependent
Some databases are lenient and let you reference the alias you defined in SELECT. Others follow the strict SQL standard and reject it because the alias is not yet defined at GROUP BY evaluation time.
-- Works in MySQL and BigQuery:
SELECT substr(order_date, 1, 7) AS month, SUM(revenue)
FROM orders
GROUP BY month; -- alias resolved early in these engines
-- Errors in PostgreSQL and SQL Server:
-- ERROR: column "month" does not exist
The safe, portable fix is to either repeat the expression or wrap it in a subquery:
-- Portable approach using a subquery
SELECT month, SUM(revenue)
FROM (
SELECT substr(order_date, 1, 7) AS month, revenue
FROM orders
) sub
GROUP BY month;
Grouping by column position — avoid it
GROUP BY 1 means “group by the first column in SELECT.” Most databases support it but it is fragile — if you reorder your SELECT columns the grouping changes silently with no error.
-- Fragile: GROUP BY 1 means GROUP BY country today,
-- but if you add a column before it tomorrow it breaks silently
SELECT country, SUM(revenue) FROM orders GROUP BY 1;