What is the difference between pivot, pivot_table, and melt in pandas, and when do you use each?
pivot reshapes long-format data to wide by spreading a column's values into new column headers — it requires unique index/column combinations and has no aggregation. pivot_table is the aggregating version that handles duplicates via a specified aggfunc. melt is the inverse: it takes wide-format data and collapses multiple columns into key-value rows (long format).
How to think about it
How to frame your answer
The interviewer is testing whether you understand data shape as a deliberate choice, not just a side-effect of how data arrives. A strong answer explains the conceptual difference between long (tidy) and wide format, then maps each function to the right scenario. Bonus points if you proactively mention that pivot will raise on duplicates — that signals you’ve hit the error in practice.
The mental model: two directions of reshaping
Think of a spreadsheet with months as column headers. That is wide format — one row per city, one column per month. Long format is the opposite — one row per (city, month) pair. pivot and pivot_table go from long → wide; melt goes wide → long.
Long (tidy) Wide
city month sales city jan feb
NYC jan 3 → NYC 3 5
NYC feb 5 LA 9 8
LA jan 9
LA feb 8
Playground: melt and pivot in action
When each tool applies
pivot — use when you are certain every (index, column) combination is unique. It is strict by design; if it raises ValueError, that is a signal your data has unexpected duplicates.
pivot_table — use when duplicates are possible or expected, or when you want an aggregate (sum, mean, count) across duplicates. It is the safer default for real-world data.
# Handles duplicates, fills missing combinations with 0
wide_agg = long.pivot_table(
index="city",
columns="month",
values="sales",
aggfunc="sum",
fill_value=0,
)
melt — use to normalize wide spreadsheet or survey data into tidy format before analysis. Most groupby, visualization, and ML workflows expect long format.