datarekha
SQL Hard Asked at GoogleAsked at AmazonAsked at Microsoft

What is an anti-join, and when does NOT IN behave differently from NOT EXISTS?

The short answer

An anti-join returns rows from the left table that have no match in the right table. NOT IN and NOT EXISTS both express this, but NOT IN returns zero rows — not an error, just silently empty — when the subquery contains even a single NULL, because SQL cannot determine whether the outer value equals a NULL.

How to think about it

This is a Hard question because the interviewer wants to catch something very specific: do you know that NOT IN silently returns zero rows when NULLs are present? Most candidates know what an anti-join is. The ones who pass this question can explain why NOT IN breaks and demonstrate they reach for NOT EXISTS by default.

The three anti-join patterns

There are three ways to write an anti-join, and they are not equivalent when NULLs are involved:

-- Pattern 1: NOT IN (dangerous with NULLs)
SELECT id FROM orders
WHERE customer_id NOT IN (SELECT id FROM vip_customers);

-- Pattern 2: NOT EXISTS (NULL-safe, recommended)
SELECT o.id FROM orders o
WHERE NOT EXISTS (
  SELECT 1 FROM vip_customers v WHERE v.id = o.customer_id
);

-- Pattern 3: LEFT JOIN ... IS NULL (NULL-safe, often fastest)
SELECT o.id
FROM orders o
LEFT JOIN vip_customers v ON v.id = o.customer_id
WHERE v.id IS NULL;

Why NOT IN breaks on NULLs

SQL evaluates x NOT IN (1, 2, NULL) as x != 1 AND x != 2 AND x != NULL. The last comparison evaluates to UNKNOWN — not FALSE — because any comparison to NULL is unknown in SQL’s three-valued logic. UNKNOWN in a WHERE clause suppresses the row. If any row in the subquery has a NULL, the entire NOT IN returns empty.

This is the playground to make that concrete. The vip_customers table has a NULL id on the last row. Watch what happens:

The key insight: three-valued logic

-- This returns ZERO rows even though 5 is clearly not in the list
SELECT 1 WHERE 5 NOT IN (1, 2, NULL);

SQL doesn’t know whether 5 equals NULL — it returns UNKNOWN, and UNKNOWN rows are excluded. The moment one NULL creeps into the subquery, all of NOT IN’s results are suppressed.

When to use each

PatternUse when
NOT INSubquery column is declared NOT NULL and list is small/static
NOT EXISTSSubquery column is nullable; complex correlated condition
LEFT JOIN ... IS NULLYou also need columns from the right table, or prefer set-based style
A onlyB
Anti-join returns only the left-exclusive region — rows in A that have no counterpart in B.
Learn it properly Anti-joins

Keep practising

All SQL questions

Explore further

Skip to content