Pandas & Data Wrangling | Medium | Asked at Bloomberg, Two Sigma, Databricks

How do GroupBy and multi-index interact in pandas, and how do you flatten a multi-index result?

For: Data Analyst, Data Scientist, Data Engineer

The short answer

Grouping on multiple keys produces a MultiIndex on the result by default. You can avoid building it by passing as_index=False to groupby(), or flatten it afterward with reset_index(). Once a MultiIndex exists, stack() and unstack() pivot between long and wide forms.

How to think about it

This checks whether you understand what pandas hands back after a multi-key groupby — not just the call, but the structure of the result. Plenty of candidates write groupby(["a", "b"]) and then stall when they need to join the aggregate back to another table. Knowing when to flatten, and which tool to reach for, is the difference between fluency and cargo-culting.

When you group on more than one column, pandas uses the grouping columns as a hierarchical row index. There are three ways out of it: reset_index() (turn index levels into columns, after the fact), as_index=False (never build the MultiIndex in the first place), and unstack() (pivot an inner level into column headers — the wide-format escape hatch).
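The wide-format escape hatch and its inverse can be sketched as a round trip. This is a minimal illustration on a small made-up sales table; the variable names long, wide and back are chosen here for clarity and are not part of any API.

```python
import pandas as pd

# Small illustrative table (assumed data, for demonstration only)
df = pd.DataFrame({"year": [2023, 2023, 2024, 2024, 2024],
                   "region": ["East", "West", "East", "West", "East"],
                   "sales": [100, 200, 150, 250, 180]})

# Long form: a Series whose row index is a two-level MultiIndex (year, region)
long = df.groupby(["year", "region"])["sales"].sum()

# Wide form: the inner level "region" becomes the column headers
wide = long.unstack("region")

# Back to long form: stack() moves the column headers into the row index again
back = wide.stack()

print(long.index.nlevels)   # number of index levels in the long form
print(list(wide.columns))   # region labels, now used as columns
print(back.index.names)     # level names after the round trip
```

On data with missing year/region combinations, unstack() would fill the gaps with NaN, so the round trip is only this clean when every combination is present.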

A worked example — the same aggregate, three shapes

import pandas as pd

df = pd.DataFrame({"year": [2023, 2023, 2024, 2024, 2024],
                   "region": ["East", "West", "East", "West", "East"],
                   "sales": [100, 200, 150, 250, 180]})

result = df.groupby(["year", "region"])["sales"].sum()
print(result)

year  region
2023  East      100
      West      200
2024  East      330
      West      250
Name: sales, dtype: int64

That’s the default MultiIndex — note (2024, East) is 330, the 150 and 180 summed. The hierarchy is genuinely useful (result.loc[2024] opens the whole 2024 subtree, result.loc[(2024, "East")] pulls one cell, result.xs("East", level="region") slices an inner level), but to join it onward you usually flatten:
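The three lookups mentioned above can be tried side by side. This sketch rebuilds the same result Series so it runs on its own; subtree, cell and east are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"year": [2023, 2023, 2024, 2024, 2024],
                   "region": ["East", "West", "East", "West", "East"],
                   "sales": [100, 200, 150, 250, 180]})
result = df.groupby(["year", "region"])["sales"].sum()

# Outer-level label: returns everything under 2024, indexed by region only
subtree = result.loc[2024]

# Full key as a tuple: returns a single scalar
cell = result.loc[(2024, "East")]

# Cross-section on the inner level: returns East for every year, indexed by year
east = result.xs("East", level="region")

print(subtree)
print(cell)
print(east)
```

Note that xs() drops the level it selects on by default, so east is indexed by year alone.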

print(result.reset_index())        # levels -> columns
print(result.unstack("region"))    # inner level -> column headers

   year region  sales
0  2023   East    100
1  2023   West    200
2  2024   East    330
3  2024   West    250

region  East  West
year
2023     100   200
2024     330   250

reset_index() gives a flat, tidy DataFrame ready to merge; unstack("region") pivots region into columns for a compact year × region matrix. Passing as_index=False to groupby() produces the same flat result as reset_index() in a single step. Pick it when you know upfront that you want columns, not an index.
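To make the as_index=False route and the onward join concrete, here is a minimal sketch. The targets table and its target column are hypothetical, invented only to give the aggregate something to merge with.

```python
import pandas as pd

df = pd.DataFrame({"year": [2023, 2023, 2024, 2024, 2024],
                   "region": ["East", "West", "East", "West", "East"],
                   "sales": [100, 200, 150, 250, 180]})

# Flat from the start: grouping keys stay as ordinary columns
flat = df.groupby(["year", "region"], as_index=False)["sales"].sum()

# Same aggregate, flattened after the fact
via_reset = df.groupby(["year", "region"])["sales"].sum().reset_index()
print(flat.equals(via_reset))

# Hypothetical lookup table to join against (assumed for illustration)
targets = pd.DataFrame({"year": [2023, 2024],
                        "region": ["East", "East"],
                        "target": [120, 300]})

# Because the keys are plain columns, a standard merge works directly;
# rows with no matching target get NaN in the target column
joined = flat.merge(targets, on=["year", "region"], how="left")
print(joined)
```

A left merge keeps every aggregated row, which makes missing targets visible rather than silently dropping those groups.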
