datarekha
Python | Easy | Asked at Airbnb, Lyft, Stripe

How do you group a list of items by a property using defaultdict, and when would you use it over a plain dict?

The short answer

collections.defaultdict eliminates the boilerplate of checking whether a key exists before appending, making grouping logic cleaner and less error-prone. The grouping loop is a single O(n) pass and needs no sorted input, which makes it the standard pattern to try before itertools.groupby (which only groups consecutive items with equal keys) or pandas groupby.

How to think about it

The approach

Grouping is a three-line loop: for each item, look up the group key, append the item. The only friction in a plain dict is the “first time we see this key” check. defaultdict removes that check by auto-initialising missing keys with a factory function — list, int, set, or any zero-argument callable.
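To make the factory idea concrete, here is a small sketch using the three factories named above. The sample data and variable names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (category, word) pairs.
pairs = [("fruit", "apple"), ("veg", "kale"), ("fruit", "apple"), ("fruit", "fig")]

groups = defaultdict(list)  # missing key starts as []
counts = defaultdict(int)   # missing key starts as 0
unique = defaultdict(set)   # missing key starts as set()

for category, word in pairs:
    groups[category].append(word)  # collect every item per group
    counts[category] += 1          # count items per group
    unique[category].add(word)     # collect distinct items per group

print(dict(groups))
print(dict(counts))
print(dict(unique))
```

The loop body is the same shape in all three cases; only the factory, and therefore the starting value for a new key, changes.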

This is the standard O(n) grouping pattern you reach for before pulling in pandas.

Solution
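
A minimal sketch of the grouping loop, assuming the items are order records stored as dicts. The `orders` list and the `customer` and `amount` fields are hypothetical; the name `by_customer` follows the one used further down.

```python
from collections import defaultdict

# Hypothetical order records; the field names are assumptions.
orders = [
    {"customer": "alice", "amount": 30},
    {"customer": "bob", "amount": 12},
    {"customer": "alice", "amount": 45},
]

by_customer = defaultdict(list)
for order in orders:
    # No "is this key new?" guard: a missing key starts as an empty list.
    by_customer[order["customer"]].append(order)

for customer, items in by_customer.items():
    print(customer, [o["amount"] for o in items])
```

Groups come out in first-seen key order, and items within a group keep their input order, because dicts preserve insertion order.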

Plain dict vs defaultdict at a glance

defaultdict(list):   d[key].append(val)          -- 1 line, no guard
plain dict:          if key not in d:
                         d[key] = []
                     d[key].append(val)          -- 3 lines

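Once grouping is done, convert to a plain dict with dict(by_customer), where by_customer is the grouped defaultdict, if you want standard KeyError behaviour downstream. This prevents phantom keys: on a defaultdict, merely reading d[missing_key] inserts that key with an empty default value.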
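
The phantom-key problem is easy to reproduce. The sketch below uses made-up customer names and contrasts a lookup on the converted plain dict with the same lookup on the defaultdict.

```python
from collections import defaultdict

by_customer = defaultdict(list)
by_customer["alice"].append(30)

# Convert before handing the result to other code.
grouped = dict(by_customer)

# Plain dict: a missing key raises instead of being created.
try:
    grouped["carol"]
    raised = False
except KeyError:
    raised = True

# defaultdict: the same lookup quietly inserts an empty list.
by_customer["carol"]
phantom_created = "carol" in by_customer

print(raised, phantom_created)
```

Note that dict(by_customer) is a shallow copy: the new dict has its own set of keys but shares the list objects with the original.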

The key insight

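defaultdict is a dict subclass that implements __missing__: when d[key] finds no entry, __missing__ calls the factory, stores the result under that key, and returns it. Only d[key] lookups trigger this; d.get(key) and key in d do not. It doesn’t change the asymptotic cost, since both are O(1) on average per insertion; it just eliminates the boilerplate guard that clutters grouping loops.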
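
To see the mechanism, the sketch below imitates defaultdict(list) with a hand-written dict subclass. The class name `ListDefaultDict` is hypothetical and this is an illustration of the `__missing__` hook, not the real implementation.

```python
class ListDefaultDict(dict):
    """Illustrative stand-in for defaultdict(list)."""

    def __missing__(self, key):
        # dict.__getitem__ calls this hook only when d[key] finds no entry.
        value = []
        self[key] = value  # store the default so later lookups find it
        return value


d = ListDefaultDict()
d["a"].append(1)  # "a" is missing, so __missing__ creates the list
d["a"].append(2)  # "a" exists now, so this is an ordinary lookup

got = d.get("b")  # .get() bypasses __missing__ and creates nothing
print(d, got)
```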
