What is an anti-join, how do you implement one in SQL, and which implementation is most reliable?
An anti-join returns rows from the left table that have no matching row in the right table — the inverse of a semi-join. The three implementations are NOT EXISTS, NOT IN, and a LEFT JOIN with a NULL filter; NOT EXISTS is the most reliable because it is NULL-safe and communicates intent clearly.
How to think about it
The classic real-world question behind this is: “find customers who have never placed an order.” The anti-join is the pattern for any “find rows that have no counterpart” problem. The interviewer wants to see that you can write it and that you know which form is safe.
Pattern 1 — NOT EXISTS (recommended)
This is the go-to. It reads like English, short-circuits at the first match, and handles NULLs correctly because the correlation happens in the WHERE clause, not via equality comparison to NULL.
SELECT c.customer_id, c.name
FROM customers c
WHERE NOT EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_id = c.customer_id
);
SELECT 1 is a convention — any expression works; only the existence of a matching row is checked, not its value.
Pattern 2 — LEFT JOIN with IS NULL
Equivalent to NOT EXISTS, produces the same execution plan in most modern optimizers. Preferred when you want to visually inspect what’s missing or when you need other columns from the right table for debugging.
SELECT c.customer_id, c.name
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.customer_id IS NULL;
Pattern 3 — NOT IN (use with caution)
Safe only when the subquery column has a NOT NULL constraint. One NULL in the subquery result silently empties the entire output.
-- Fragile if orders.customer_id can be NULL
SELECT customer_id, name
FROM customers
WHERE customer_id NOT IN (SELECT customer_id FROM orders);