datarekha
Python Medium Asked at GoogleAsked at AmazonAsked at MetaAsked at Microsoft

How does Python's dict work internally, and what makes a good hash key?

The short answer

CPython dicts are open-addressing hash tables. On lookup, Python calls __hash__ on the key to find a slot, then uses __eq__ to confirm the match. A valid dict key must be hashable — immutable by convention — and two objects that compare equal must have the same hash. Hash collisions are resolved by probing, which is why worst-case lookup degrades from O(1) to O(n).

How to think about it

How I think about this question

This one is really testing whether you understand the two-step lookup that underpins all of Python’s hash-based data structures. Get that mental model right and the rest follows naturally — valid key requirements, collision behavior, why mutable objects can’t be keys.

The lookup path

Every dict operation — d[key], key in d, d[key] = val — follows the same sequence:

  1. slot = hash(key) % table_size — compute the initial slot index
  2. If table[slot] is empty → key not present
  3. If table[slot].key == key → found
  4. Otherwise → probe to the next slot (CPython uses perturbation probing) and repeat

CPython 3.6+ also keeps a separate compact insertion-order array alongside the hash table, which is why dicts preserve insertion order while still delivering O(1) lookups.

What makes a valid key

For a type to work as a dict key it must implement __hash__ (returns a stable integer) and __eq__ (compares for equality). Strings, ints, tuples of hashable items, and frozensets are all valid. Lists and dicts are not — they’re mutable, so their “identity” can change.

Try it yourself

The playground below lets you see which types are hashable, check that equal objects share a hash, and watch what happens when you define __eq__ without __hash__:

The key insight: why equal objects must share a hash

If a == b but hash(a) != hash(b), the lookup for b would probe a completely different slot from where a was stored. The dict would say “key not found” even though an equal key exists. Python cannot verify this invariant for you — it trusts that you’ve implemented __hash__ consistently with __eq__.

Average vs worst-case complexity

OperationAverageWorst case
Lookup / insert / deleteO(1)O(n) — all keys collide

Worst case is pathological. Python randomises hash seeds at startup (PYTHONHASHSEED) to prevent hash-flooding DoS attacks where an attacker crafts inputs that all land in the same slot.

Learn it properly Dictionaries

Keep practising

All Python questions

Explore further

Skip to content