What is the difference between loc and iloc in pandas, and when should you use each?
loc selects rows and columns by label (index value or column name), while iloc selects by integer position. Use loc when your index carries meaningful labels like dates or IDs; use iloc for positional slicing regardless of what the index contains.
How to think about it
What the interviewer is listening for
This is a foundational pandas question, but the real trap is the default RangeIndex case. When the index happens to be the integers 0, 1, 2, ..., loc and iloc look identical until you sort or filter the frame. After that, labels no longer match positions, and loc[0] can return a different row than iloc[0] (or raise a KeyError if label 0 was filtered out). A strong answer explains the slicing endpoint difference and the RangeIndex gotcha.
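A minimal sketch of that divergence, assuming a small hypothetical frame with a single `score` column (the data is invented for illustration):

```python
import pandas as pd

# Hypothetical data; the default index is RangeIndex 0..3.
df = pd.DataFrame({"score": [70, 95, 82, 60]})

# On a fresh frame, label 0 and position 0 are the same row.
fresh_label = df.loc[0, "score"]
fresh_position = df["score"].iloc[0]

# Sorting reorders the rows but each row keeps its original label.
ranked = df.sort_values("score", ascending=False)
sorted_label = ranked.loc[0, "score"]        # the row labelled 0
sorted_position = ranked["score"].iloc[0]    # the first row after sorting

# Filtering can drop label 0 entirely, so loc[0] raises KeyError.
passed = df[df["score"] > 75]
try:
    passed.loc[0]
    label_zero_missing = False
except KeyError:
    label_zero_missing = True
first_passed = passed["score"].iloc[0]       # position 0 still exists

# reset_index(drop=True) rebuilds a 0..n-1 index, realigning label and position.
realigned = passed.reset_index(drop=True)
```

Here `sorted_label` is 70 while `sorted_position` is 95, which is exactly the mismatch described above.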
The core distinction
- loc: label-based. Both endpoints of a slice are inclusive. The “label” is whatever your index contains: strings, dates, integers, or anything else.
- iloc: position-based. Follows Python’s standard half-open slice convention, [start, end), so the end is excluded.
Index: ["alice", "bob", "carol", "dave"]
Each name is a label for loc; "dave" is at position 3 for iloc (0-indexed)
df.loc["alice":"carol"] → alice, bob, carol (3 rows, both ends included)
df.iloc[0:3] → alice, bob, carol (3 rows, end excluded)
df.loc["alice":"dave"] → all 4 rows
df.iloc[0:4] → all 4 rows (end position 4 is excluded; it is one past the last row)
Two playgrounds — labeled index, then integer index
The first playground uses a named index so the difference is obvious. The second shows the dangerous RangeIndex ambiguity.
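A runnable sketch of the labeled-index case, reusing the four names from the diagram above. The `age` column and its values are hypothetical, added only so the frame has something to select:

```python
import pandas as pd

# Hypothetical data on a named (string) index.
people = pd.DataFrame(
    {"age": [31, 45, 27, 52]},
    index=["alice", "bob", "carol", "dave"],
)

# loc slices by label and includes both endpoints.
by_label = people.loc["alice":"carol"]

# iloc slices by position and excludes the end (position 3, "dave").
by_position = people.iloc[0:3]

# The same numeric-looking width gives different row counts:
label_tail = list(people.loc["bob":"dave"].index)   # bob through dave, inclusive
position_tail = list(people.iloc[1:3].index)        # positions 1 and 2 only

print(by_label.equals(by_position))
print(label_tail)
print(position_tail)
```

With string labels there is no way to confuse the two accessors: passing "alice" to iloc, or 0 to loc, simply fails.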
The RangeIndex off-by-one trap
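A small sketch of the trap, assuming a hypothetical five-row frame on the default RangeIndex. The same slice text, 0:2, returns a different number of rows depending on the accessor:

```python
import pandas as pd

# Hypothetical data; the default index is RangeIndex 0..4.
df = pd.DataFrame({"value": [10, 20, 30, 40, 50]})

label_slice = df.loc[0:2]       # labels 0, 1, 2 (end included)
position_slice = df.iloc[0:2]   # positions 0, 1 (end excluded)

# Once labels stop starting at 0, the two drift further apart.
tail = df.iloc[2:]              # rows labelled 2, 3, 4
tail_label_slice = tail.loc[0:2]      # only label 2 falls in [0, 2]
tail_position_slice = tail.iloc[0:2]  # first two rows: labels 2 and 3

print(len(label_slice), len(position_slice))
print(list(tail_label_slice["value"]), list(tail_position_slice["value"]))
```

Because both calls succeed without error, this kind of off-by-one tends to surface only as a wrong row count downstream.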
Quick reference
| | loc | iloc |
|---|---|---|
| Selector type | label | integer position |
| Slice end | inclusive | exclusive |
| Works with boolean arrays | yes | yes |
| RangeIndex behavior | label = integer, but slice is still inclusive | purely positional |
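
The boolean row in the table has one wrinkle worth sketching. The example below assumes a hypothetical `score` column: loc takes the boolean Series as-is, while for iloc the mask is converted to a plain NumPy array first.

```python
import pandas as pd

# Hypothetical data on a named index.
df = pd.DataFrame(
    {"score": [70, 95, 82, 60]},
    index=["alice", "bob", "carol", "dave"],
)

mask = df["score"] > 75                # boolean Series carrying the index

via_loc = df.loc[mask]                 # loc accepts the Series directly
via_iloc = df.iloc[mask.to_numpy()]    # iloc gets a positional boolean array

print(list(via_loc.index))
print(via_loc.equals(via_iloc))
```

Passing the boolean Series itself to iloc is typically rejected by pandas, since iloc ignores labels and a Series mask carries them; converting with `to_numpy()` makes the positional intent explicit.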