How do you apply multiple aggregation functions to different columns in a single GroupBy call?
Pass a dictionary to agg() mapping each column to one or more functions, or use named aggregation, the keyword-argument form added in pandas 0.25, to control output column names directly. Both approaches compute every statistic from one GroupBy object, so you avoid running several separate groupby calls and merging the results afterward.
How to think about it
What the interviewer is checking
This is a practical question about pandas ergonomics. The interviewer wants to see that you can get clean, named output columns from a single GroupBy call — not multiple .groupby(...).agg() chains, and not a mess of MultiIndex columns you have to flatten afterward.
Two syntaxes — know both, prefer named aggregations
Named aggregations (pandas 0.25+) — the preferred form
Pass keyword arguments of the form output_col=(source_col, function). You get explicit column names in the output, no MultiIndex, and it reads like documentation:
result = df.groupby("region").agg(
    total_sales=("sales", "sum"),
    avg_sales=("sales", "mean"),
    max_returns=("returns", "max"),
    order_count=("sales", "count"),
)
The output columns are exactly total_sales, avg_sales, max_returns, order_count — no flattening needed.
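To see the named form end to end, here is a minimal runnable sketch. The sample rows are invented for illustration; only the column names (region, sales, returns) come from the example above.

```python
import pandas as pd

# Hypothetical sample data; values are made up for illustration.
df = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "sales": [100, 200, 50, 150, 100],
    "returns": [1, 3, 0, 2, 5],
})

# Each keyword is an output column name; each value is (source column, function).
result = df.groupby("region").agg(
    total_sales=("sales", "sum"),
    avg_sales=("sales", "mean"),
    max_returns=("returns", "max"),
    order_count=("sales", "count"),
)

# The columns are flat strings, in the order the keywords were written.
print(list(result.columns))
# ['total_sales', 'avg_sales', 'max_returns', 'order_count']
print(result)
```

The group key (region) becomes the index of the result; call reset_index() on it if you would rather have region as an ordinary column.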
Dict-of-lists form — still useful, produces MultiIndex
result = df.groupby("region").agg({"sales": ["sum", "mean"], "returns": "max"})
# result.columns: MultiIndex([('sales','sum'), ('sales','mean'), ('returns','max')])
Flatten the MultiIndex when you need flat column names:
result.columns = ["_".join(c) for c in result.columns]
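The flattening step can be sketched on a small frame as follows. The data is hypothetical, and the "_" separator is just one naming choice; every function is passed in a list here so that each column label is clearly a (column, function) pair.

```python
import pandas as pd

# Hypothetical sample data; values are made up for illustration.
df = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "sales": [100, 200, 50, 150, 100],
    "returns": [1, 3, 0, 2, 5],
})

# Dict-of-lists form: the columns come back as a two-level MultiIndex.
result = df.groupby("region").agg({"sales": ["sum", "mean"], "returns": ["max"]})
multi_labels = list(result.columns)
# [('sales', 'sum'), ('sales', 'mean'), ('returns', 'max')]

# Join each (column, function) tuple into one flat string label.
result.columns = ["_".join(c) for c in result.columns]
print(list(result.columns))
# ['sales_sum', 'sales_mean', 'returns_max']
```

Note that "_".join only works when every level of the label is a string, which holds when functions are given by name as they are here.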
Custom functions in the same call
Named aggregations accept any callable — including lambdas and full functions:
df.groupby("region").agg(
    total=("sales", "sum"),
    cv=("sales", lambda s: s.std() / s.mean()),  # coefficient of variation
)
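A named function works the same way as the lambda and is easier to test and reuse. The sketch below uses a hypothetical helper, coefficient_of_variation, and invented sample data; neither comes from the original example.

```python
import pandas as pd

# Hypothetical sample data; values are made up for illustration.
df = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "sales": [100, 200, 50, 150, 100],
})


def coefficient_of_variation(s: pd.Series) -> float:
    """Sample standard deviation divided by the mean of one group's values.

    Illustrative helper: a single-row group yields NaN, because the
    sample standard deviation is undefined for one observation.
    """
    return s.std() / s.mean()


# Built-in function names and custom callables can be mixed in one call.
result = df.groupby("region").agg(
    total=("sales", "sum"),
    cv=("sales", coefficient_of_variation),
)
print(result)
```

Each callable receives the group's values for the source column as a Series and should return a single scalar.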
The key insight
Named aggregations are both more readable and more maintainable. When you read total_sales=("sales", "sum") three months later, it is immediately clear what the column represents and where it came from. With the dict-of-lists form you have to flatten and rename the MultiIndex columns afterward whenever you want flat names, which is extra code that adds noise and can break if the aggregation list changes.
Use the dict form only when you are building the aggregation spec programmatically (e.g., looping over a list of function names).
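A programmatic spec might look like the sketch below. The columns and functions lists are hypothetical stand-ins for values that would come from configuration, and the sample data is invented. For comparison, the last step builds the same aggregations as keyword arguments and unpacks them with **, which gives flat names without a separate flattening step; which version reads better is a judgment call.

```python
import pandas as pd

# Hypothetical sample data; values are made up for illustration.
df = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "sales": [100, 200, 50, 150, 100],
    "returns": [1, 3, 0, 2, 5],
})

# Hypothetical inputs, e.g. loaded from a config file.
columns = ["sales", "returns"]
functions = ["sum", "mean"]

# Dict form: map every column to the same list of function names.
dict_spec = {col: list(functions) for col in columns}
wide = df.groupby("region").agg(dict_spec)  # MultiIndex columns

# Named form built in a loop, then unpacked as keyword arguments.
named_spec = {f"{col}_{fn}": (col, fn) for col in columns for fn in functions}
flat = df.groupby("region").agg(**named_spec)  # flat string columns

print(list(wide.columns))
print(list(flat.columns))
```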