Python | Easy | Asked at Airbnb, Lyft, Stripe

How do you group a list of items by a property using defaultdict, and when would you use it over a plain dict?

For: Data Analyst, Data Scientist, Data Engineer

The short answer

collections.defaultdict eliminates the boilerplate of checking whether a key exists before appending, making grouping logic cleaner and less error-prone. The grouping loop is a single pass over the data, so it runs in O(n) time, and it is the standard pattern to try before reaching for itertools.groupby or pandas groupby.

How to think about it

Grouping is a three-line loop: for each item, find its group key and append. The only friction with a plain dict is the “have I seen this key before?” check on the first item of each group. defaultdict removes that check: when a missing key is accessed, it calls a zero-argument factory (list, int, set, whatever fits) and stores the result under that key.
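To make the factory mechanism concrete, here is a minimal sketch of what happens on a missing key. The names groups and stock, and the lambda factory, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# The factory is the built-in list, so a missing key is filled with list()
groups = defaultdict(list)

# Accessing a missing key calls the factory, stores the result, and returns it
first_lookup = groups["fruit"]
first_lookup.append("apple")

# Any zero-argument callable works; this lambda is a hypothetical factory
stock = defaultdict(lambda: {"count": 0})
stock["apple"]["count"] += 1

print(dict(groups))  # {'fruit': ['apple']}
print(dict(stock))   # {'apple': {'count': 1}}
```

One consequence worth knowing: a bare lookup such as groups["pear"] also inserts the key. When you only want to test for membership, use the in operator or .get(), neither of which triggers the factory.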

A worked example

from collections import defaultdict

orders = [
    {"customer": "Alice", "amount": 30},
    {"customer": "Bob",   "amount": 50},
    {"customer": "Alice", "amount": 20},
    {"customer": "Bob",   "amount": 10},
    {"customer": "Carol", "amount": 75},
]

# Group: missing keys auto-init to [] — no guard needed
by_customer = defaultdict(list)
for o in orders:
    by_customer[o["customer"]].append(o["amount"])
print("Grouped:", dict(by_customer))

# Aggregate the groups
print("Totals :", {c: sum(a) for c, a in by_customer.items()})

# A different factory: defaultdict(int) auto-inits to 0 for counting
counts = defaultdict(int)
for o in orders:
    counts[o["customer"]] += 1
print("Counts :", dict(counts))

Output:

Grouped: {'Alice': [30, 20], 'Bob': [50, 10], 'Carol': [75]}
Totals : {'Alice': 50, 'Bob': 60, 'Carol': 75}
Counts : {'Alice': 2, 'Bob': 2, 'Carol': 1}

One idea, two shapes of grouping. defaultdict(list) collected each customer’s amounts, so Alice accrues [30, 20]; defaultdict(int) started every count at 0, so counts[...] += 1 works on the very first sighting with no initialisation. The same trick extends to defaultdict(set) when you want the unique values per group.
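The defaultdict(set) variant mentioned above might look like the following sketch. The purchases data is hypothetical and is not part of the orders example; it just includes a repeated pair so the deduplication is visible.

```python
from collections import defaultdict

# Hypothetical data: (customer, product) pairs, with one repeated purchase
purchases = [
    ("Alice", "tea"),
    ("Bob", "coffee"),
    ("Alice", "tea"),
    ("Alice", "mug"),
]

# Missing keys auto-init to an empty set, so .add() works on first sighting
products_by_customer = defaultdict(set)
for customer, product in purchases:
    products_by_customer[customer].add(product)

# Sets are unordered, so sort each one for stable printing
for customer, products in products_by_customer.items():
    print(customer, sorted(products))
```

Alice bought tea twice, but the set keeps it only once.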

Plain dict vs defaultdict

defaultdict(list):   d[key].append(val)   # one line, no guard
plain dict:          if key not in d:     # three lines
                         d[key] = []
                     d[key].append(val)

Performance is effectively the same, since both are O(1) on average per insertion, so the win is mainly readability: the guard that clutters every grouping loop disappears. (dict.setdefault(key, []).append(val) does the same on a plain dict, but it builds a new empty list on every call, even when the key already exists, and reads worse.)
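As a quick check that the three spellings really are interchangeable, this sketch runs each over the same input and compares the results. The pairs data and the variable names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (key, value) data
pairs = [("a", 1), ("b", 2), ("a", 3)]

# 1. Plain dict with an explicit guard
guarded = {}
for key, val in pairs:
    if key not in guarded:
        guarded[key] = []
    guarded[key].append(val)

# 2. Plain dict with setdefault (builds a fresh [] on every call)
with_setdefault = {}
for key, val in pairs:
    with_setdefault.setdefault(key, []).append(val)

# 3. defaultdict (the factory runs only when the key is missing)
with_default = defaultdict(list)
for key, val in pairs:
    with_default[key].append(val)

print(guarded == with_setdefault == dict(with_default))  # True
```

All three produce {'a': [1, 3], 'b': [2]}; they differ only in how much of the loop body is spent on bookkeeping.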
