How does Python's dict work internally, and what makes a good hash key?
CPython dicts are open-addressing hash tables. On lookup, Python calls __hash__ on the key to find a slot, then uses __eq__ to confirm the match. A valid dict key must be hashable — immutable by convention — and two objects that compare equal must have the same hash. Hash collisions are resolved by probing, which is why worst-case lookup degrades from O(1) to O(n).
How to think about it
How I think about this question
This one is really testing whether you understand the two-step lookup that underpins all of Python’s hash-based data structures. Get that mental model right and the rest follows naturally — valid key requirements, collision behavior, why mutable objects can’t be keys.
The lookup path
Every dict operation — d[key], key in d, d[key] = val — follows the same sequence:
slot = hash(key) % table_size— compute the initial slot index- If
table[slot]is empty → key not present - If
table[slot].key == key→ found - Otherwise → probe to the next slot (CPython uses perturbation probing) and repeat
CPython 3.6+ also keeps a separate compact insertion-order array alongside the hash table, which is why dicts preserve insertion order while still delivering O(1) lookups.
What makes a valid key
For a type to work as a dict key it must implement __hash__ (returns a stable integer) and __eq__ (compares for equality). Strings, ints, tuples of hashable items, and frozensets are all valid. Lists and dicts are not — they’re mutable, so their “identity” can change.
Try it yourself
The playground below lets you see which types are hashable, check that equal objects share a hash, and watch what happens when you define __eq__ without __hash__:
The key insight: why equal objects must share a hash
If a == b but hash(a) != hash(b), the lookup for b would probe a completely different slot from where a was stored. The dict would say “key not found” even though an equal key exists. Python cannot verify this invariant for you — it trusts that you’ve implemented __hash__ consistently with __eq__.
Average vs worst-case complexity
| Operation | Average | Worst case |
|---|---|---|
| Lookup / insert / delete | O(1) | O(n) — all keys collide |
Worst case is pathological. Python randomises hash seeds at startup (PYTHONHASHSEED) to prevent hash-flooding DoS attacks where an attacker crafts inputs that all land in the same slot.