datarekha
SQL · Medium · Asked at Amazon · Asked at Google

What happens when a join key contains NULLs? Do NULL values ever match in a JOIN?

The short answer

NULL never equals NULL in SQL: NULL = NULL evaluates to UNKNOWN, not TRUE. Join conditions normally use equality, so rows where either key is NULL are silently dropped by an INNER JOIN and kept only as unmatched rows by an OUTER JOIN that preserves their side. If you need NULL-to-NULL matching, use IS NOT DISTINCT FROM or COALESCE the key to a sentinel value.

How to think about it

This is a deceptively subtle point — and one that bites data engineers in ETL pipelines more than anywhere else. NULL in a join key is not rare: optional foreign keys, improperly loaded dimension tables, and NULL-as-unknown patterns all produce it.

The one-sentence answer that wins the interview: “NULL never equals anything, including another NULL — so NULL-keyed rows are silently excluded from INNER JOINs.”
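To see the rule in isolation before any join is involved, here is a minimal sketch that evaluates three comparisons directly. It assumes Python's built-in sqlite3 module as a stand-in engine; other databases return the same truth values but may display them differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL and 1 = NULL are both UNKNOWN, which SQLite returns as NULL
# (None in Python). Only IS NULL yields a definite TRUE (1) for a NULL.
row = conn.execute("SELECT NULL = NULL, 1 = NULL, NULL IS NULL").fetchone()
print(row)  # (None, None, 1)
```

The first two columns are the reason an equality join cannot pair NULL keys: the comparison never comes back TRUE.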

The default behaviour — watch the row vanish

Take two small tables joined with an INNER JOIN on a.id = b.id. Table a has a NULL-keyed row ('y'), and table b has a NULL-keyed row ('beta'). They look like they should match, but they don't.

Result: only ('x', 'alpha'). The (NULL, 'y') row from a does not match (NULL, 'beta') from b because NULL = NULL evaluates to UNKNOWN, and a join keeps only the row pairs for which the condition is TRUE. UNKNOWN is rejected exactly as FALSE would be.
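A minimal way to reproduce this result, sketched with Python's built-in sqlite3 module. The column names (id, val, info) follow the queries used in this section; the matching key value 1 and the INTEGER/TEXT column types are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, val TEXT);
    CREATE TABLE b (id INTEGER, info TEXT);
    -- Key value 1 is an assumed example; each table also has one NULL-keyed row.
    INSERT INTO a VALUES (1, 'x'), (NULL, 'y');
    INSERT INTO b VALUES (1, 'alpha'), (NULL, 'beta');
""")

# Plain equality join: the two NULL-keyed rows never pair up.
inner_rows = conn.execute(
    "SELECT a.val, b.info FROM a JOIN b ON a.id = b.id"
).fetchall()
print(inner_rows)  # [('x', 'alpha')]
```

Neither 'y' nor 'beta' appears anywhere in the output, and no error or warning is raised.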

LEFT JOIN behaviour with NULL keys

SELECT a.val, b.info
FROM a
LEFT JOIN b ON a.id = b.id;

Result:

  • ('x', 'alpha') — matched
  • ('y', NULL) — left row kept; right side NULL because no match (NULL never matched)

The NULL-keyed row from a is retained by the LEFT JOIN, but its right-side columns are NULL for the same reason as any other unmatched row — not because the key was NULL specifically.
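The sketch below makes that last point visible by adding one extra row to a, (2, 'z'), whose key is not NULL but has no partner in b. That row, the key value 1, and the use of Python's sqlite3 module are assumptions for illustration, not part of the original tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, val TEXT);
    CREATE TABLE b (id INTEGER, info TEXT);
    -- (2, 'z') is a hypothetical extra row: non-NULL key, no match in b.
    INSERT INTO a VALUES (1, 'x'), (NULL, 'y'), (2, 'z');
    INSERT INTO b VALUES (1, 'alpha'), (NULL, 'beta');
""")

left_rows = conn.execute("""
    SELECT a.val, b.info
    FROM a
    LEFT JOIN b ON a.id = b.id
    ORDER BY a.val
""").fetchall()
print(left_rows)  # [('x', 'alpha'), ('y', None), ('z', None)]
```

The NULL-keyed row 'y' and the ordinary unmatched row 'z' come out identically: both are kept, both get NULL on the right side.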

Forcing NULL-to-NULL matching

PostgreSQL / BigQuery / DuckDB:

SELECT a.val, b.info
FROM a
JOIN b ON a.id IS NOT DISTINCT FROM b.id;

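IS NOT DISTINCT FROM treats NULL as equal to NULL and otherwise behaves like =. Use it deliberately: it changes which rows match, and some planners optimize it less well than plain = (for example, by not using an index on the key). Other engines spell the same operator differently: MySQL has <=>, and SQLite has IS.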
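A runnable sketch of NULL-safe matching, using Python's sqlite3 module. It uses SQLite's IS operator, which is that engine's long-standing spelling of NULL-safe equality, rather than IS NOT DISTINCT FROM, because only newer SQLite releases accept the latter. The key value 1 is an assumed example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, val TEXT);
    CREATE TABLE b (id INTEGER, info TEXT);
    INSERT INTO a VALUES (1, 'x'), (NULL, 'y');
    INSERT INTO b VALUES (1, 'alpha'), (NULL, 'beta');
""")

# In SQLite, `x IS y` is TRUE when both sides are NULL and otherwise
# behaves like `x = y`. PostgreSQL, BigQuery and DuckDB would write this
# condition as: a.id IS NOT DISTINCT FROM b.id
null_safe_rows = conn.execute("""
    SELECT a.val, b.info
    FROM a
    JOIN b ON a.id IS b.id
    ORDER BY a.val
""").fetchall()
print(null_safe_rows)  # [('x', 'alpha'), ('y', 'beta')]
```

The NULL-keyed rows now pair with each other, which is only correct if NULL genuinely means "the same missing value" in both tables.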

Portable workaround:

SELECT a.val, b.info
FROM a
JOIN b ON COALESCE(a.id, -1) = COALESCE(b.id, -1);

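Pick a sentinel of the same type as the key (-1 for a numeric key, '__null__' for a text key) that cannot appear as a real key value. If it can, NULL-keyed rows will join to those real rows. Wrapping the key in COALESCE can also keep the database from using an index on it. Document why you did it.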
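The sketch below runs the COALESCE join twice to show why the sentinel choice matters: once on clean data, and once after a hypothetical row with a real key of -1 is added to b. The key value 1, the (-1, 'gamma') row, and the use of Python's sqlite3 module are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, val TEXT);
    CREATE TABLE b (id INTEGER, info TEXT);
    INSERT INTO a VALUES (1, 'x'), (NULL, 'y');
    INSERT INTO b VALUES (1, 'alpha'), (NULL, 'beta');
""")

SENTINEL_JOIN = """
    SELECT a.val, b.info
    FROM a
    JOIN b ON COALESCE(a.id, -1) = COALESCE(b.id, -1)
    ORDER BY a.val, b.info
"""

# Clean data: -1 is not a real key, so NULL pairs only with NULL.
clean_rows = conn.execute(SENTINEL_JOIN).fetchall()
print(clean_rows)  # [('x', 'alpha'), ('y', 'beta')]

# Hypothetical collision: a real row in b uses -1 as its key.
conn.execute("INSERT INTO b VALUES (-1, 'gamma')")
collided_rows = conn.execute(SENTINEL_JOIN).fetchall()
print(collided_rows)  # [('x', 'alpha'), ('y', 'beta'), ('y', 'gamma')]
```

In the second run the NULL-keyed row 'y' wrongly picks up 'gamma', a row whose key was never NULL.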

The key insight

There are exactly three options when you encounter NULL join keys, and naming all three is what separates a good answer from a great one:

  1. Accept the data loss — if NULL-keyed rows are genuinely invalid, INNER JOIN is correct.
  2. Use IS NOT DISTINCT FROM — NULL-safe equality in PostgreSQL, BigQuery, DuckDB.
  3. COALESCE to a sentinel — portable workaround; pick a value that cannot be a real key.
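Choosing between the three options is easier once you know how many rows are at stake. One possible pre-join check, sketched here with Python's sqlite3 module and the same assumed tables (key value 1 is illustrative), relies on COUNT(id) skipping NULLs while COUNT(*) counts every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, val TEXT);
    CREATE TABLE b (id INTEGER, info TEXT);
    INSERT INTO a VALUES (1, 'x'), (NULL, 'y');
    INSERT INTO b VALUES (1, 'alpha'), (NULL, 'beta');
""")

# COUNT(*) counts all rows; COUNT(id) counts only rows with a non-NULL id.
# The difference is the number of rows an equality join can never match.
null_keys_a = conn.execute("SELECT COUNT(*) - COUNT(id) FROM a").fetchone()[0]
null_keys_b = conn.execute("SELECT COUNT(*) - COUNT(id) FROM b").fetchone()[0]
print(null_keys_a, null_keys_b)  # 1 1
```

A count of zero on both sides means the plain INNER JOIN loses nothing to NULL keys; anything else is a prompt to decide among the three options on purpose.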