Implement binary search correctly — and explain the off-by-one traps.

Binary search halves the search space each iteration to find a target in O(log n). The tricky part is not the idea but the boundary conditions: closed vs. half-open intervals, how to update lo/hi, and when to use lo < hi vs. lo <= hi. One clean template eliminates all the classic bugs.

How does Python's dict work internally, and what makes a good hash key?

CPython dicts are open-addressing hash tables. On lookup, Python calls __hash__ on the key to find a slot, then uses __eq__ to confirm the match. A valid dict key must be hashable — immutable by convention — and two objects that compare equal must have the same hash. Hash collisions are resolved by probing, which is why worst-case lookup degrades from O(1) to O(n).

Group a list of words into anagrams.

Sort each word's characters to get a canonical key, then bucket words by that key using a hash map. This turns an O(n²) brute-force comparison into a clean O(n · k log k) single pass, where k is the max word length.

Find the length of the longest substring without repeating characters.

Use a variable-size sliding window with a hash map that records the most recent index of each character. When the right pointer hits a character already in the window, jump the left pointer to one past that character's last position — skipping over the repeat in one move rather than crawling one step at a time.

Hash Tables & Linear Probing — GATE DA

Q: How does Python's dict work internally, and what makes a good hash key?

CPython dicts are open-addressing hash tables. On lookup, Python calls __hash__ on the key to find a slot, then uses __eq__ to confirm the match. A valid dict key must be hashable — immutable by convention — and two objects that compare equal must have the same hash. Hash collisions are resolved by probing, which is why worst-case lookup degrades from O(1) to O(n).

Q: Group a list of words into anagrams.

Sort each word's characters to get a canonical key, then bucket words by that key using a hash map. This turns an O(n²) brute-force comparison into a clean O(n · k log k) single pass, where k is the max word length.

Q: Find the length of the longest substring without repeating characters.

Use a variable-size sliding window with a hash map that records the most recent index of each character. When the right pointer hits a character already in the window, jump the left pointer to one past that character's last position — skipping over the repeat in one move rather than crawling one step at a time.

What you'll learn

A hash table maps keys to slots via a hash function for average O(1) lookup

Collisions resolved two ways: chaining (a list per slot) vs open addressing

Linear probing tries slots h, h+1, h+2, … mod m until one is empty

Load factor alpha = n/m, and expected unsuccessful-search probes grow like 1/(1 - alpha)

The last lesson ended on a wish: a structure that does not search for a value but computes where it lives, jumping straight there in O(1). That structure exists, and the trick is almost suspiciously simple. A dictionary lookup does not search and does not loop — it computes the address. A hash table runs a hash function on the key to get a slot number, then goes straight there. Store and retrieve both cost one computation plus one slot read, so lookup is O(1) on average, whether the table holds ten keys or ten million. It is the engine under Python’s dict, every GROUP BY and hash join in a database, and the deduplication step in most data pipelines — which is why GATE keeps testing it.

The catch: two different keys can land on the same slot — a collision. How you resolve collisions is the whole subject, and it is exactly what GATE drills.

Hashing into slots

Pick a table of m slots and a hash function h(key) that returns an index in 0 … m−1. A typical exam hash is arithmetic, e.g. h(x) = 3x mod 10 for a size-10 table. Insert a key by computing its slot; look it up by recomputing the same slot. Deterministic and direct — the key tells you exactly where to go.

When h sends two keys to the same slot, you need a collision strategy. There are two classic families.

Chaining vs open addressing

Chaining keeps a list per slot; linear probing keeps everything in the array, walking forward to the next empty slot.

Chaining — each slot holds a list (linked list or small array) of all keys that hash there. Insert appends to the list; lookup hashes to the slot, then scans its list. Average lookup cost is O(1 + alpha) where alpha is the load factor below.
Open addressing — every key lives in the array itself. On a collision you probe for another slot. Linear probing tries the next slots in order: h, h+1, h+2, … (mod m) until an empty one appears. Lookup follows the same path until it finds the key or hits an empty slot.

The load factor is alpha = n / m (items over slots). Chaining tolerates alpha > 1; open addressing cannot exceed alpha = 1 (the array fills up) and slows sharply well before that.

How GATE asks this

The signature question is a NAT: a small table, an arithmetic hash, linear probing, and a fixed insertion order — “which slot does key k end up in?” You trace the probe sequence by hand. GATE DA has asked this in both 2024 and 2025. A second NAT flavour gives a load factor and asks for the expected number of probes, using the open-addressing estimate 1/(1 − alpha) (asked 2024).

Worked example — a real GATE DA 2025 question

Table size m = 10, hash h(x) = 3x mod 10, collisions resolved by linear probing. Insert 1, 4, 5, 6, 14, 15 in this order. Where do 14 and 15 land?

Compute each home slot, and when it is occupied, step forward +1 (mod 10) until empty:

h(1)  = 3·1  mod 10 = 3   → slot 3 empty   ⇒ slot 3
h(4)  = 3·4  mod 10 = 2   → slot 2 empty   ⇒ slot 2
h(5)  = 3·5  mod 10 = 5   → slot 5 empty   ⇒ slot 5
h(6)  = 3·6  mod 10 = 8   → slot 8 empty   ⇒ slot 8

h(14) = 3·14 mod 10 = 2   → slot 2 holds 4   (collision)
                          → probe 3 holds 1   (collision)
                          → probe 4 empty     ⇒ slot 4
h(15) = 3·15 mod 10 = 5   → slot 5 holds 5    (collision)
                          → probe 6 empty     ⇒ slot 6

So 14 lands in slot 4 and 15 lands in slot 6 — the answer to the 2025 question. The final table:

Linear probing fills the array; displaced keys sit just past their home slot.

For the expected-probe variant (2024): an insertion behaves like an unsuccessful search — it probes until it hits an empty slot — so its expected probe count is about 1/(1 − alpha). At alpha = 0.5 that is 1/(1 − 0.5) = 2 probes; at alpha = 0.9 it explodes to 10.

A question to carry forward

Hash tables buy that O(1) lookup with a quiet sacrifice: they throw away all sense of order. The hash function deliberately scatters keys, so the slots hold them in no meaningful sequence at all. Ask a hash table for its smallest key, or to list keys in sorted order, or for everything between 40 and 60, and it simply has no idea — it would have to scan every slot. Yet plenty of problems need fast lookup and a sense of order at once. Here is the thread onward: what shape organises data so that it branches hierarchically — keeping a structure you can search and walk in order — and how do you visit every element of it systematically?

In one breath

A hash table computes a slot with h(key) and goes straight there — O(1) average lookup, no searching. Engine behind dict, GROUP BY, dedup.
A collision (two keys, one slot) is resolved by chaining (a list per slot, tolerates alpha > 1) or open addressing (keys live in the array; alpha ≤ 1).
Linear probing: on a collision try h, h+1, h+2, … (mod m) until empty. GATE traces this by hand (h(x)=3x mod 10, insert order → which slot? — 2024 & 2025).
Load factor alpha = n/m; expected probes for an insertion / unsuccessful search ≈ 1/(1 − alpha) (2 at alpha=0.5, 10 at 0.9).
Primary clustering lengthens runs; use the unsuccessful-search probe figure for insertions, not the successful one.

Practice

Quick check

0/6

Q1Recall: which statements about open addressing with linear probing are TRUE? (select all that apply)select all that apply

Q2Recall: compared with open addressing, separate chaining:

Q3Trace: table size 10, h(x) = 3x mod 10, linear probing. After inserting 1, 4, 5, 6, 14 in order, which slot holds key 14? (slot index)numerical answer — type a number

Q4Trace: same table (size 10, h(x) = 3x mod 10, linear probing) with 1, 4, 5, 6, 14 already placed. Insert 15 next. Which slot does 15 occupy? (slot index)numerical answer — type a number

Q5Apply: a hash table uses linear probing at load factor alpha = 0.5. Using the estimate 1/(1 − alpha), the expected number of probes for an unsuccessful search is:numerical answer — type a number

Q6Create: a size-7 table uses h(x) = x mod 7 with linear probing. Insert 10, 17, 3 in order. Which slot holds 17? (slot index)numerical answer — type a number

Hash Tables & Linear Probing

What you'll learn

Before you start

Hashing into slots

Chaining vs open addressing

How GATE asks this

Worked example — a real GATE DA 2025 question

A question to carry forward

In one breath

Practice

Quick check

Sign in to track your progress

Practice this in an interview

Related lessons

Explore further