Pandas & Data Wrangling | Easy | Asked at Amazon, Walmart, Netflix

How do you apply multiple aggregation functions to different columns in a single GroupBy call?

For: Data Analyst, Data Scientist, Data Engineer

The short answer

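Pass a dictionary to agg() mapping each column to one or more functions, or use named aggregations with the keyword-argument form (pandas 0.25+) to control output column names directly. Both approaches compute every statistic in one agg() call instead of chaining GroupBy calls. The dict form returns MultiIndex columns when any column maps to a list of functions, while named aggregations return flat, explicitly named columns.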

How to think about it

This is a practical pandas-ergonomics question: can you get clean, named output columns from one GroupBy call — not chained .agg()s, not a MultiIndex you have to flatten by hand? The preferred tool is named aggregations (pandas 0.25+): keyword arguments of the form output_col=(source_col, function), which read like documentation and name the columns directly.
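The tuple shorthand is equivalent to spelling each aggregation out with pd.NamedAgg, which some teams prefer because the column and aggfunc fields are labeled. A minimal sketch, assuming a made-up orders frame that is not part of the original example:

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
orders = pd.DataFrame({
    "region": ["East", "East", "West"],
    "sales": [100, 200, 50],
})

# Tuple shorthand: output_col=(source_col, function)
short = orders.groupby("region").agg(
    total_sales=("sales", "sum"),
    avg_sales=("sales", "mean"),
)

# Explicit form: pd.NamedAgg(column=..., aggfunc=...)
explicit = orders.groupby("region").agg(
    total_sales=pd.NamedAgg(column="sales", aggfunc="sum"),
    avg_sales=pd.NamedAgg(column="sales", aggfunc="mean"),
)

# Both spellings produce the same frame.
print(short.equals(explicit))  # True
```

pd.NamedAgg is a namedtuple, so the two spellings are interchangeable; which one reads better is a matter of taste.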

A worked example — named aggregations

import pandas as pd

df = pd.DataFrame({"region": ["East", "East", "East", "West", "West", "North"],
                   "sales": [100, 200, 150, 50, 300, 80],
                   "returns": [5, 10, 3, 7, 15, 2],
                   "qty": [3, 5, 4, 1, 8, 2]})

result = df.groupby("region").agg(
    total_sales = ("sales",   "sum"),
    avg_sales   = ("sales",   "mean"),
    max_returns = ("returns", "max"),
    order_count = ("sales",   "count"),
    cv_sales    = ("sales",   lambda s: round(s.std() / s.mean(), 3)),
)
print(result)

        total_sales  avg_sales  max_returns  order_count  cv_sales
region
East            450      150.0           10            3     0.333
North            80       80.0            2            1       NaN
West            350      175.0           15            2     1.010

The output columns are exactly the names you typed (total_sales, cv_sales, …), with no MultiIndex and no flattening. Each pulls from a stated source column with its own function, and the form even takes a lambda (cv_sales is the coefficient of variation). Note that North’s cv_sales is NaN: std() defaults to the sample standard deviation (ddof=1), which is undefined for a single row, so the data surfaces a real edge case honestly. Contrast the older dict-of-lists form, which works but hands you a MultiIndex to clean up:

multi = df.groupby("region").agg({"sales": ["sum", "mean"], "returns": "max"})
multi.columns = ["_".join(c) for c in multi.columns]   # flatten
print(multi)

        sales_sum  sales_mean  returns_max
region
East          450       150.0           10
North          80        80.0            2
West          350       175.0           15

Same numbers, but you paid an extra "_".join step to get flat names. Reach for the dict form only when you’re building the aggregation spec programmatically (looping over a list of functions); otherwise prefer named aggregations for readability you’ll thank yourself for in three months.
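If flat names matter even for a generated spec, named aggregations can also be built in a loop by unpacking a dict of (source_col, function) tuples into agg(). A minimal sketch, reusing the example data; the columns and functions lists and the "col_fn" naming scheme are assumptions made for illustration:

```python
import pandas as pd

# Same toy data as the worked example (qty omitted for brevity).
df = pd.DataFrame({"region": ["East", "East", "East", "West", "West", "North"],
                   "sales": [100, 200, 150, 50, 300, 80],
                   "returns": [5, 10, 3, 7, 15, 2]})

# Hypothetical inputs: which columns and which functions to combine.
columns = ["sales", "returns"]
functions = ["sum", "max"]

# Build {output_name: (source_col, function)}, then unpack it as keyword arguments.
spec = {f"{col}_{fn}": (col, fn) for col in columns for fn in functions}
result = df.groupby("region").agg(**spec)

print(result.columns.tolist())
# ['sales_sum', 'sales_max', 'returns_sum', 'returns_max']
```

The output names must be valid keyword strings and unique, which the f-string naming here guarantees for distinct column and function pairs.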
