How do you group a list of items by a property using defaultdict, and when would you use it over a plain dict?
collections.defaultdict eliminates the boilerplate of checking whether a key exists before appending, making grouping logic cleaner and less error-prone. The grouping loop runs in O(n) time over n items and is the standard pattern to try before reaching for itertools.groupby (which only groups consecutive items, so the input must already be sorted by key) or pandas groupby.
How to think about it
The approach
Grouping is a three-line loop: for each item, look up the group key, append the item. The only friction in a plain dict is the “first time we see this key” check. defaultdict removes that check by auto-initialising missing keys with a factory function — list, int, set, or any zero-argument callable.
This is the standard O(n) grouping pattern you reach for before pulling in pandas.
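To make the factory idea concrete, here is a minimal sketch that uses list, int, and set as factories in one pass. The word list and variable names are made up for illustration.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "apple", "blueberry"]  # made-up sample data

# list factory: missing keys start as [] so we can append straight away
by_letter = defaultdict(list)
# int factory: missing keys start as 0 so we can increment straight away
counts = defaultdict(int)
# set factory: missing keys start as set() so duplicates collapse
unique_by_letter = defaultdict(set)

for word in words:
    by_letter[word[0]].append(word)
    counts[word] += 1
    unique_by_letter[word[0]].add(word)

print(by_letter["a"])                 # ['apple', 'avocado', 'apple']
print(counts["apple"])                # 2
print(sorted(unique_by_letter["a"]))  # ['apple', 'avocado']
```

The factory is called with no arguments, so anything that can be called that way works, including a lambda or a class of your own.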
Solution
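A minimal sketch of the grouping loop, assuming a list of order records keyed by customer. The records, their field names, and the by_customer variable are hypothetical, chosen only to illustrate the pattern.

```python
from collections import defaultdict

# Hypothetical order records; the field names are assumptions for illustration.
orders = [
    {"customer": "alice", "total": 30},
    {"customer": "bob", "total": 12},
    {"customer": "alice", "total": 5},
]

by_customer = defaultdict(list)
for order in orders:
    # A customer seen for the first time gets a fresh empty list automatically.
    by_customer[order["customer"]].append(order)

for customer, customer_orders in by_customer.items():
    print(customer, [o["total"] for o in customer_orders])
# alice [30, 5]
# bob [12]
```

Groups come out in first-seen order because dicts preserve insertion order, and items within a group keep their original order.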
Plain dict vs defaultdict at a glance
defaultdict(list):  d[key].append(val)      -- 1 line, no guard
plain dict:         if key not in d:
                        d[key] = []
                    d[key].append(val)      -- 3 lines
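The comparison above can be checked with a short runnable sketch. The sample pairs are made up, and the dict.setdefault variant is an addition not covered in the text, included as a common middle ground.

```python
from collections import defaultdict

pairs = [("fruit", "apple"), ("veg", "kale"), ("fruit", "pear")]  # made-up sample data

# Plain dict with an explicit guard.
guarded = {}
for key, val in pairs:
    if key not in guarded:
        guarded[key] = []
    guarded[key].append(val)

# Plain dict with setdefault: no guard, but a throwaway [] is built on every call.
with_setdefault = {}
for key, val in pairs:
    with_setdefault.setdefault(key, []).append(val)

# defaultdict: the factory runs only when the key is actually missing.
grouped = defaultdict(list)
for key, val in pairs:
    grouped[key].append(val)

print(guarded == with_setdefault == grouped)  # True
```

All three produce the same mapping; the difference is only in how much of the loop body is spent on bookkeeping.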
Convert to a plain dict with dict(by_customer) once grouping is done if you want standard KeyError behaviour downstream — this prevents phantom keys from accidental lookups later.
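A short sketch of the phantom-key problem and the conversion, using placeholder values. Note that dict() copies whatever keys exist at that moment, so the snapshot is taken right after grouping.

```python
from collections import defaultdict

by_customer = defaultdict(list)
by_customer["alice"].append("order-1")  # placeholder value

# Snapshot taken as soon as grouping is finished.
plain = dict(by_customer)

# An accidental lookup on the defaultdict silently inserts the key.
by_customer["carol"]
print(sorted(by_customer))  # ['alice', 'carol']

# The same lookup on the plain dict fails loudly and inserts nothing.
try:
    plain["carol"]
except KeyError:
    print("KeyError for carol")
print(sorted(plain))  # ['alice']
```

The copy is shallow: plain and by_customer share the same list objects, which is usually fine once grouping has finished.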
The key insight
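defaultdict is really just a dict subclass that implements __missing__. When d[key] finds no entry, dict's lookup calls __missing__, which calls the factory, stores the result under that key, and returns it. Only d[key] triggers this; d.get(key) and key in d do not. It doesn't change asymptotic performance (both are O(1) on average per insertion); it just eliminates the boilerplate guard that clutters grouping loops.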
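The __missing__ hook is part of the dict subclass protocol, which a small sketch can show. The Loud class is hypothetical and exists only to expose the hook; unlike defaultdict, it does not store anything.

```python
from collections import defaultdict

class Loud(dict):
    """Hypothetical dict subclass used only to show the __missing__ hook."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent; nothing is stored.
        return f"no {key}"

loud = Loud()
print(loud["x"])    # no x
print("x" in loud)  # False

d = defaultdict(list)
print(d.get("a"))   # None: .get() bypasses __missing__
print("a" in d)     # False: membership tests bypass it too
d["a"]              # d[key] triggers __missing__, which stores list() under 'a'
print(d)            # defaultdict(<class 'list'>, {'a': []})
```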