How do GroupBy and multi-index interact in pandas, and how do you flatten a multi-index result?
Grouping on multiple keys produces a MultiIndex on the result by default. You can prevent it by passing as_index=False to groupby, or flatten it afterward with reset_index(). Once a MultiIndex exists, stack() and unstack() let you pivot between long and wide forms.
How to think about it
What the interviewer is really testing
This question checks whether you understand what pandas is giving you back after a multi-key groupby — not just the mechanics, but the structure of the result. A lot of candidates can write groupby(["a", "b"]) but then get stuck when they need to join the aggregated result back to another table. Knowing when to flatten, and which tool to reach for, is what separates fluency from cargo-culting.
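To make the "join it back to another table" case concrete, here is a minimal sketch. The `sales` and `targets` tables, their column names, and all values are invented for illustration. It flattens the grouped result up front so the grouping keys are ordinary columns that `merge` can match on.

```python
import pandas as pd

# Hypothetical transaction-level data (illustrative values only).
sales = pd.DataFrame({
    "year": [2023, 2023, 2023, 2024, 2024],
    "region": ["East", "East", "West", "East", "West"],
    "sales": [60, 40, 150, 120, 180],
})

# Hypothetical lookup table to join against.
targets = pd.DataFrame({
    "year": [2023, 2023, 2024, 2024],
    "region": ["East", "West", "East", "West"],
    "target": [90, 160, 110, 200],
})

# as_index=False keeps "year" and "region" as regular columns.
totals = sales.groupby(["year", "region"], as_index=False)["sales"].sum()

# With flat key columns, the join is an ordinary column-on-column merge.
report = totals.merge(targets, on=["year", "region"], how="left")
report["hit_target"] = report["sales"] >= report["target"]
print(report)
```

The same join would also work after calling reset_index() on a MultiIndexed result; the point is only that the keys end up as columns before merging.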
Why a MultiIndex appears
When you group on more than one column, pandas moves the grouping keys into a hierarchical row index (a MultiIndex), with one level per key. This is genuinely useful, because you can slice entire sub-trees with .loc, but it can be confusing until you see it in action.
Think of it like a filing cabinet: the top drawer is labeled “2023”, and inside it there are folders for “East” and “West”. result.loc[2023] opens the top drawer; result.loc[(2023, "East")] pulls one specific folder.
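The filing-cabinet picture can be sketched in a few lines. The DataFrame here is made up for illustration (the column names `year`, `region`, and `sales` and all values are assumptions), chosen so that the 2023/East total comes out to 100.

```python
import pandas as pd

# Hypothetical sales data (illustrative values only).
df = pd.DataFrame({
    "year": [2023, 2023, 2023, 2024, 2024],
    "region": ["East", "East", "West", "East", "West"],
    "sales": [60, 40, 150, 120, 180],
})

# Two grouping keys -> a Series indexed by a two-level MultiIndex.
result = df.groupby(["year", "region"])["sales"].sum()
print(type(result.index).__name__)  # MultiIndex
print(list(result.index.names))     # ['year', 'region']

# Partial key: open the "2023" drawer (a Series indexed by region).
drawer = result.loc[2023]

# Full key as a tuple: pull one folder (a scalar).
folder = result.loc[(2023, "East")]
print(drawer)
print(folder)
```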
The three flattening strategies
There are three paths out of a MultiIndex, and picking the right one saves you an extra step later:
reset_index(): after the fact, turn the index levels into regular columns. Most common; works anywhere.
as_index=False: tell groupby never to create a MultiIndex in the first place. Cleanest when you know upfront you want a flat DataFrame.
unstack(): pivot one level of the MultiIndex into column headers. This is the “wide format” escape hatch.
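The three options can be compared side by side on one small dataset. This is a sketch on invented data (the column names and values are assumptions for illustration), not a recommendation of one option over the others.

```python
import pandas as pd

# Hypothetical sales data (illustrative values only).
df = pd.DataFrame({
    "year": [2023, 2023, 2023, 2024, 2024],
    "region": ["East", "East", "West", "East", "West"],
    "sales": [60, 40, 150, 120, 180],
})

# Default behaviour: the result carries a (year, region) MultiIndex.
grouped = df.groupby(["year", "region"])["sales"].sum()

# 1. reset_index(): flatten after the fact; levels become columns.
flat_after = grouped.reset_index()

# 2. as_index=False: never build the MultiIndex in the first place.
flat_upfront = df.groupby(["year", "region"], as_index=False)["sales"].sum()

# 3. unstack(): move the "region" level into the column headers (wide form).
wide = grouped.unstack("region")

print(flat_after)
print(flat_upfront)
print(wide)
```

Options 1 and 2 should give the same long-form table here, with `year`, `region`, and `sales` columns. Option 3 keeps `year` as the row index and produces one column per region, so it reshapes the result rather than fully flattening it.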
The key insight: it’s still a regular index
A MultiIndex is just a pandas Index with multiple levels. Everything you know about .loc still applies — the only difference is that you pass a tuple to identify a specific cell, and you can pass a partial key (just the outer level) to get a whole sub-group.
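# result is a Series produced by grouping on year and region
result.loc[(2023, "East")] # full key as a tuple: a single scalar (100 here)
result.loc[2023] # partial key: entire 2023 subtree (Series)
result.xs("East", level="region") # all years for East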
xs is particularly handy when you want to slice on an inner level without knowing or repeating the outer level values.
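As a runnable sketch of the inner-level slice, again on invented data (the column names and values are assumptions), the example below also shows the `drop_level` parameter of `xs`, which controls whether the selected level stays in the index.

```python
import pandas as pd

# Hypothetical sales data (illustrative values only).
df = pd.DataFrame({
    "year": [2023, 2023, 2023, 2024, 2024],
    "region": ["East", "East", "West", "East", "West"],
    "sales": [60, 40, 150, 120, 180],
})
result = df.groupby(["year", "region"])["sales"].sum()

# Slice on the inner level without naming any outer-level value.
# By default the selected level is dropped, leaving an index of years.
east = result.xs("East", level="region")

# drop_level=False keeps the full (year, region) MultiIndex.
east_keep = result.xs("East", level="region", drop_level=False)

print(east)
print(east_keep)
```

Keeping the level can be convenient when the slice will later be concatenated with slices for other regions, since each row still records which region it came from.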