datarekha
Pandas & Data Wrangling · Medium · Asked at Amazon, Uber, Airbnb

What is the difference between GroupBy transform and agg in pandas?

The short answer

agg reduces each group to a single value per aggregation, so the result has one row per group. transform returns a Series or DataFrame with the same index as the original: a group-level result is broadcast back to every row of its group, and a row-wise result keeps one value per row. That makes it ideal for adding derived columns without a merge.

How to think about it

The output-shape question

The fastest way to answer this in an interview: “agg reduces — you get fewer rows. transform preserves — you get the same number of rows, with the group-level value broadcast to every member of that group.”

In practice, transform saves you from a merge. Without it, the usual way to add a group statistic back to the original DataFrame is to agg, then merge (or join) the result on the group key. transform does both in one step.
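
To make that equivalence concrete, here is a minimal sketch on a made-up DataFrame. The dept and salary values and the dept_mean column name are illustrative assumptions, not data from the article. It computes the department mean both ways and checks that the two columns match.

```python
import pandas as pd

# Hypothetical example data (assumed for illustration)
df = pd.DataFrame({
    "dept": ["eng", "eng", "ops", "ops", "ops"],
    "salary": [100, 120, 60, 70, 80],
})

# Route 1: aggregate to one row per group, then merge back on the group key
dept_mean = (
    df.groupby("dept", as_index=False)["salary"]
      .mean()
      .rename(columns={"salary": "dept_mean"})
)
merged = df.merge(dept_mean, on="dept", how="left")

# Route 2: transform broadcasts the group mean onto the original index
via_transform = df.assign(
    dept_mean=df.groupby("dept")["salary"].transform("mean")
)

# Both routes should produce the same per-row values
same = merged["dept_mean"].tolist() == via_transform["dept_mean"].tolist()
print(same)
```

The transform route also keeps the original index untouched, whereas merge returns a new DataFrame with a fresh default index.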

See the shape difference clearly
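
A small sketch of the shape difference, using an assumed five-row DataFrame with two departments (the data is hypothetical): agg returns one entry per group, indexed by the group key, while transform returns one entry per original row.

```python
import pandas as pd

# Hypothetical example data (assumed for illustration)
df = pd.DataFrame({
    "dept": ["eng", "eng", "ops", "ops", "ops"],
    "salary": [100, 120, 60, 70, 80],
})

grouped = df.groupby("dept")["salary"]

reduced = grouped.agg("mean")          # one value per group, indexed by dept
broadcast = grouped.transform("mean")  # one value per row, indexed like df

print(reduced.shape)                     # (2,)
print(broadcast.shape)                   # (5,)
print(list(reduced.index))               # ['eng', 'ops']
print(broadcast.index.equals(df.index))  # True
```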

Common transform patterns

Because transform is index-aligned, you can create derived columns directly:

# df is assumed to be a DataFrame with "dept" and "salary" columns

# z-score within each department
# (std() is the sample standard deviation, so a one-row department yields NaN)
df["z_score"] = df.groupby("dept")["salary"].transform(
    lambda s: (s - s.mean()) / s.std()
)

# rank within each department (1 = highest earner; ties share an average rank)
df["dept_rank"] = df.groupby("dept")["salary"].transform("rank", ascending=False)

# percentage of department total
df["pct_dept"] = df["salary"] / df.groupby("dept")["salary"].transform("sum")

When to use which

| Goal | Use |
| --- | --- |
| Summary table (one row per group) | agg |
| New column on original DataFrame (no merge) | transform |
| Drop entire groups based on a condition | filter |
| Custom multi-column logic per group | apply (last resort) |
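
For the two less common choices, filter and apply, here is a hedged sketch. The data, the bonus column, and the two-row threshold are assumptions made up for illustration. filter keeps or drops whole groups, while apply runs arbitrary logic that needs several columns at once.

```python
import pandas as pd

# Hypothetical example data (assumed for illustration)
df = pd.DataFrame({
    "dept": ["eng", "eng", "ops", "ops", "ops", "hr"],
    "salary": [100, 120, 60, 70, 80, 50],
    "bonus": [10, 12, 12, 14, 16, 5],
})

# filter: keep only the rows of departments with at least two people
big_depts = df.groupby("dept").filter(lambda g: len(g) >= 2)

# apply: per-group logic across two columns (total bonus / total salary)
bonus_ratio = df.groupby("dept")[["salary", "bonus"]].apply(
    lambda g: g["bonus"].sum() / g["salary"].sum()
)

print(sorted(big_depts["dept"].unique()))  # ['eng', 'ops']
print(bonus_ratio)
```

Selecting the needed columns before apply keeps the grouping column out of the function's input. apply is flexible but typically slower than the built-in agg and transform paths, which is why it sits last in the list.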