Rank, Nullity & Solution Sets
Rank counts independent rows; nullity counts the free directions. Their sum equals the number of columns — and that single identity classifies every solution set.
What you'll learn
- Rank = number of independent rows/columns = number of pivots
- Nullity = dimension of the null space (solutions of Ax = 0)
- The rank-nullity theorem: rank + nullity = number of columns (n)
- Classifying Ax = b as unique, infinitely many, or no solution via rank
Before you start
Last lesson, you had to crunch through elimination before you could say whether a system had one answer, infinitely many, or none. There’s a faster way: count. The rank of a matrix is the number of genuinely independent rows it has — equivalently, the number of pivots elimination will eventually produce. Anything that’s a combination of other rows is redundant and doesn’t add to the count. In data terms, the rank of your data matrix is how many features are genuinely independent — a rank below the column count is exactly the collinearity that destabilises a regression.
Its partner is nullity — the dimension of the null space, which is just
the set of x that solve Ax = 0. If only x = 0 works, the nullity is 0.
If there’s a direction you can slide along and stay at zero, that’s one unit of
nullity; another such direction adds one more. Nullity is simply how many free
variables the system has.
The rank-nullity theorem
For any m by n matrix A (that is, n columns), the two counts always add up to
the number of columns:
So rank + nullity = n, where n is the number of columns. Each of the n
columns is either a pivot column (counted by rank) or a free column (counted by
nullity) — there is no third option, which is why the identity is exact.
That single equation classifies Ax = b. Let r be the rank of A (the rank of the
coefficient matrix), and assume the system is consistent (b is reachable — the
augmented matrix [A | b] has the same rank as A, so no 0 = nonzero row):
- Unique solution when
r = n(full column rank) and consistent. No free variables, so nullity is 0 — one point. - Infinitely many when consistent with
r < n. There aren - rfree variables, a whole family of solutions. - No solution when inconsistent —
rank[A | b] > rank A(a contradiction row).
One consequence falls straight out: for a wide matrix (more columns than rows,
m < n), the rank can be at most m, so r <= m < n. That forces nullity
= n - r > 0, meaning Ax = 0 always has nonzero solutions. More unknowns than
equations can never pin down a single point.
How GATE asks this
This recurs as an MSQ — “select all true statements about the solution set” — and as
a NAT asking you to compute a rank or a nullity. Both 2024 and 2025 leaned on
classifying solution sets via rank. The drill: find rank A (count pivots), check
consistency against rank[A | b], then read off unique (r = n), infinite (r < n,
consistent), or none (inconsistent). For a NAT, nullity is almost always
n - rank straight from the theorem.
Worked example
A 3 by 3 matrix of rank 2. Take
A = [ 1 2 3 ]
[ 2 4 6 ] R2 = 2·R1, R3 = R1 + R2
[ 3 6 9 ]
Every row is a multiple of [1 2 3], so only one row is independent… but let’s be
careful: row-reduce and you get a single pivot.
[ 1 2 3 ] R2 -> R2 - 2R1, R3 -> R3 - 3R1 [ 1 2 3 ]
[ 2 4 6 ] -------------------------------> [ 0 0 0 ]
[ 3 6 9 ] [ 0 0 0 ]
That is rank 1, not 2. To get rank 2 we need a second independent row, e.g.
B = [ 1 2 3 ] -> echelon -> [ 1 2 3 ]
[ 0 1 4 ] [ 0 1 4 ]
[ 2 5 10] [ 0 0 0 ]
Now there are two pivots, so rank B = 2. With n = 3 columns, the theorem gives
nullity = n - rank = 3 - 2 = 1
So Bx = 0 has a 1-parameter family of solutions — set the one free variable to
t and every solution is a multiple of a single direction. Infinitely many, with one
degree of freedom. (This mirrors the 2024/2025 questions that hand you a matrix and ask
for the dimension of its solution set.)
Two equations, three unknowns (m < n). A system with 2 equations in x, y, z has
a 2 by 3 coefficient matrix, so rank <= 2 < 3 = n. Therefore nullity is at least
3 - 2 = 1: it can never have a unique solution — only infinitely many (if
consistent) or none (if not). More unknowns than equations rules out a single answer.
Quick check
Quick check
Practice this in an interview
All questionsROW_NUMBER assigns a unique sequential integer to every row regardless of ties. RANK assigns the same number to tied rows but skips subsequent positions. DENSE_RANK also assigns the same number to ties but never skips positions.
SQL uses three-valued logic: comparing any value to NULL yields UNKNOWN, not FALSE. NOT IN evaluates as NOT (a = v1 OR a = v2 OR ...), so a single NULL in the list makes the entire predicate UNKNOWN for every row, suppressing all results. Use NOT EXISTS or a LEFT JOIN anti-pattern instead.
COUNT(*) counts every row including those with NULLs. COUNT(column) counts only rows where that column is non-NULL. COUNT(DISTINCT column) counts unique non-NULL values in the column.
NULL represents an unknown value. Comparing anything to NULL with = produces NULL (not TRUE or FALSE), and WHERE only passes rows where the condition evaluates to TRUE. The correct syntax is IS NULL or IS NOT NULL.