What are 1NF, 2NF, and 3NF, and when would you intentionally denormalize?
1NF eliminates repeating groups and requires atomic column values. 2NF further removes partial dependencies on a composite key. 3NF removes transitive dependencies — every non-key column must depend on the key, the whole key, and nothing but the key. Denormalization trades update anomalies for read performance, and is appropriate when the read path dominates and write correctness can be enforced at the application layer or with materialized views.
How to think about it
What the question is really testing
This is a design philosophy question. The interviewer wants to know whether you understand why normal forms exist — to eliminate redundancy and protect data integrity — and whether you can make a pragmatic call about when breaking those rules is the right trade-off.
The three normal forms
The one-line mnemonic that covers all three: “The key, the whole key, and nothing but the key.”
1NF — Atomic values, no repeating groups
Every cell holds exactly one value. No comma-separated lists, no JSON blobs pretending to be columns.
-- Violates 1NF: multiple tags crammed into one column
CREATE TABLE articles (
id INT PRIMARY KEY,
title TEXT,
tags TEXT -- 'sql,performance,indexing' — not atomic
);
-- 1NF compliant: one value per cell
CREATE TABLE article_tags (
article_id INT REFERENCES articles(id),
tag TEXT,
PRIMARY KEY (article_id, tag)
);
2NF — No partial dependencies (applies only to composite keys)
If the primary key is composite (two or more columns), every non-key column must depend on the whole key, not just part of it.
-- Violates 2NF: composite PK is (order_id, product_id)
-- but product_name depends only on product_id — a partial dependency
CREATE TABLE order_items (
order_id INT,
product_id INT,
product_name TEXT, -- partial dependency: only needs product_id
quantity INT,
PRIMARY KEY (order_id, product_id)
);
-- 2NF fix: move product_name to its own table
CREATE TABLE products (id INT PRIMARY KEY, name TEXT);
CREATE TABLE order_items (
order_id INT, product_id INT REFERENCES products(id),
quantity INT,
PRIMARY KEY (order_id, product_id)
);
3NF — No transitive dependencies
Non-key columns must depend directly on the primary key — not on another non-key column.
-- Violates 3NF: city depends on zip_code, not directly on employee id
CREATE TABLE employees (
id INT PRIMARY KEY, name TEXT, zip_code CHAR(5), city TEXT
);
-- 3NF fix: extract the transitive dependency
CREATE TABLE zip_codes (zip CHAR(5) PRIMARY KEY, city TEXT);
CREATE TABLE employees (
id INT PRIMARY KEY, name TEXT, zip_code CHAR(5) REFERENCES zip_codes(zip)
);
When to intentionally denormalize
Normalization protects write correctness. Denormalization optimizes read throughput. The trade-off is explicit:
| Scenario | Denormalization approach |
|---|---|
| Reporting / analytics (OLAP) | Flatten fact + dimension joins into a wide table or materialized view |
| High-read, low-write lookups | Store precomputed full_name = first_name || ' ' || last_name |
| Event streams | Embed a denormalized snapshot at write time to avoid joins at query time |
| Star schema | Dimension tables are intentionally denormalized (no 3NF) for query simplicity |
Interview framing
Lead with the mnemonic: “The key, the whole key, and nothing but the key.” Then show the trade-off: normalization protects write correctness, denormalization optimizes read throughput. Production systems routinely use 3NF for OLTP and intentional denormalization for OLAP — the skill is knowing which context you are in.