What are the core assumptions of linear regression, and what breaks when each is violated?

OLS linear regression rests on five assumptions: linearity, independence of errors, homoscedasticity, normality of residuals, and no perfect multicollinearity. Violating any one of them degrades coefficient estimates, standard errors, or the validity of hypothesis tests.
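A few of these assumptions can be probed numerically. The following is a minimal sketch on synthetic data, not a formal diagnostic procedure: the data-generating coefficients, the variable names, and the split-half spread comparison are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical data: y is linear in two features plus constant-variance noise.
n = 200
X = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Fit OLS with an intercept column.
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
fitted = A @ beta
residuals = y - fitted

# No perfect multicollinearity: the design matrix has full column rank.
full_rank = np.linalg.matrix_rank(A) == A.shape[1]

# Adding a rescaled copy of a feature makes the columns perfectly collinear,
# so the rank stays at 3 even though there are now 4 columns.
A_bad = np.column_stack([A, 2.0 * X[:, 0]])
collinear_rank = np.linalg.matrix_rank(A_bad)

# Rough homoscedasticity check: residual spread for low vs. high fitted values.
order = np.argsort(fitted)
spread_low = residuals[order[: n // 2]].std()
spread_high = residuals[order[n // 2 :]].std()
spread_ratio = spread_low / spread_high
```

A spread ratio far from 1 would hint at heteroscedasticity, and a rank below the column count signals perfect collinearity, in which case the coefficients are not uniquely determined.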

How does PCA work, and how do you choose the number of components?

PCA finds orthogonal directions (principal components) of maximum variance by computing the eigenvectors of the covariance matrix, then projects data onto the top components. Choose the number of components by the cumulative explained variance ratio (e.g. enough to retain 95%), a scree-plot elbow, or downstream task performance. Center the features, and standardize them when they are measured on different scales: PCA is variance-driven, so large-scale features would otherwise dominate the components.
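The procedure can be sketched in a few lines of NumPy. This is an illustrative helper, not a library API: the function name `pca`, the `variance_to_keep` parameter, and the synthetic data are assumptions.

```python
import numpy as np

def pca(X, variance_to_keep=0.95):
    """PCA via eigendecomposition of the covariance matrix (illustrative helper)."""
    # Standardize: zero mean and unit variance per feature.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    cov = np.cov(Z, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Cumulative explained variance ratio.
    cum_ratio = np.cumsum(eigvals) / eigvals.sum()
    # Smallest k whose cumulative ratio reaches the threshold (capped at the feature count).
    k = min(int(np.searchsorted(cum_ratio, variance_to_keep)) + 1, len(eigvals))
    return Z @ eigvecs[:, :k], cum_ratio, k

# Hypothetical data: the third feature is almost a copy of a + b,
# so two components carry nearly all of the variance.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
X = np.column_stack([a, b, a + b + 0.01 * rng.normal(size=500)])

scores, cum_ratio, k = pca(X)
```

Here `k` is chosen by the cumulative-variance rule only; a scree plot or downstream validation score would replace that one line.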

What are the assumptions and limitations of PCA, and when would it hurt your model?

PCA assumes linear relationships, that variance equals importance, and that components should be orthogonal. It can hurt when the predictive signal lives in low-variance directions, when relationships are nonlinear, or when interpretability matters, since components mix original features. It's also sensitive to scaling and outliers and is unsupervised, so it ignores the target.
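The first failure mode, signal hiding in a low-variance direction, is easy to reproduce. A minimal sketch on synthetic data (the factors `s` and `d` and the target definition are assumptions chosen to make the effect visible):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
s = rng.normal(size=n)          # shared high-variance factor, unrelated to the target
d = 0.1 * rng.normal(size=n)    # low-variance factor that drives the target
X = np.column_stack([s + d, s - d])
y = d

# Standardize, then take the eigenvectors of the covariance matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))  # ascending eigenvalues

pc1 = Z @ eigvecs[:, -1]   # direction of largest variance
pc2 = Z @ eigvecs[:, 0]    # direction of smallest variance

explained_pc1 = eigvals[-1] / eigvals.sum()
corr_pc1 = abs(np.corrcoef(pc1, y)[0, 1])
corr_pc2 = abs(np.corrcoef(pc2, y)[0, 1])
```

In this setup the first component explains almost all of the variance yet is nearly uncorrelated with `y`, while the discarded second component tracks `y` closely. Keeping only the top component would remove the predictive signal.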

What is PCA, when should you use it, and what are its key limitations?

PCA finds the orthogonal directions of maximum variance in the data and projects onto a lower-dimensional subspace, reducing features while retaining most information. It is most useful before distance-based models or when training is bottlenecked by dimensionality. Its main limits are loss of interpretability, sensitivity to scale, and an assumption of linear structure.

Independence, Span, Basis & Dimension — GATE DA

The word that crossed over from probability — independence — now means arrows that each point in a genuinely new direction. From it fall span, basis, and dimension: how few independent arrows it takes to build a whole space, and how to spot one that is secretly redundant.

8 min read · Intermediate · GATE DA · Lesson 20 of 122

What you'll learn

Linear independence: the only solution to Σ cᵢvᵢ = 0 is all cᵢ = 0

Span, basis (an independent spanning set), and dimension (the count of basis vectors)

Orthonormal sets are mutually perpendicular unit vectors, and are automatically independent

A space has MANY valid bases — independence is weaker than orthogonality

Here is the word that crossed the border from probability — independence — now meaning something about arrows. Pick a handful of them. How few can you keep and still reach every point of a space by adding and scaling? An arrow that points in a genuinely new direction earns its place; one that is secretly a stretch of the others is dead weight. That single question opens four tightly linked ideas — independence, span, basis, dimension — and GATE tests them as a cluster.

Independence — no redundant directions

Vectors v₁, …, vₙ are linearly independent when no one of them is a combination of the others. The crisp algebraic test: the only way to make

c₁v₁ + c₂v₂ + … + cₙvₙ = 0

is to take every coefficient cᵢ = 0. If some non-zero choice of coefficients also gives zero, the vectors are dependent: at least one is redundant, because it can be written as a combination of the rest.
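Numerically, the same test can be phrased through rank: stack the vectors as columns, and they are independent exactly when the rank equals the number of vectors. A minimal sketch, where the helper name `is_independent` is an assumption and `np.linalg.matrix_rank` decides the rank up to a floating-point tolerance:

```python
import numpy as np

def is_independent(vectors):
    """Return True when the given vectors are linearly independent."""
    # Columns of A are the vectors; A c = 0 has only c = 0
    # exactly when rank(A) equals the number of columns.
    A = np.column_stack(vectors)
    return bool(np.linalg.matrix_rank(A) == len(vectors))

pair_new_direction = is_independent([(1, 0), (1, 1)])
pair_collinear = is_independent([(1, 2), (2, 4)])
three_in_plane = is_independent([(1, 0), (0, 1), (1, 1)])
```

The first pair is independent, the second is not because (2, 4) = 2·(1, 2), and three vectors in R² can never be independent because the rank of a matrix with two rows is at most 2.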

Independent vectors open up a plane; collinear (dependent) vectors stay stuck on one line.

Drag the two arrows below. Point them different ways and every combination fills out the plane — independence in action. Now lay one on top of (or opposite) the other and the combinations collapse onto a single line — the hallmark of dependence.

Vector playground (interactive figure): drag the arrow tips and watch the dot product change.

u = (3.0, 4.0), v = (4.0, 1.0): |u| = 5.00, |v| = 4.12, u·v = 16.00, cos θ = 0.776, angle 39°

The dot product spikes when the arrows point the same way and vanishes when they're perpendicular.
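The readout for u = (3, 4) and v = (4, 1) can be reproduced directly. A small sketch; the variable names are illustrative, and the 2×2 determinant at the end is added here as one possible collinearity check rather than something the playground displays:

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([4.0, 1.0])

dot = u @ v
norm_u = np.linalg.norm(u)
norm_v = np.linalg.norm(v)
cos_theta = dot / (norm_u * norm_v)
# Clip guards against tiny floating-point overshoot outside [-1, 1].
angle_deg = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# In 2D the arrows are collinear (dependent) exactly when this determinant is 0,
# which is the same as cos θ being +1 or -1.
det = u[0] * v[1] - u[1] * v[0]
```

A non-zero `det` confirms that these two arrows are independent, so their combinations fill the plane.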

Span, basis, dimension

The span of a set of vectors is everything you can build from them by adding and scaling — all their linear combinations. Two independent vectors in R² span the whole plane; one non-zero vector spans only a line.

A basis of a space is a set that is both independent and spanning: just enough arrows to reach everything, with none redundant. The dimension is simply the number of vectors in a basis (every basis of a given space has the same count). So R² has dimension 2, R³ has dimension 3. And a space has many valid bases: any n independent vectors in an n-dimensional space will do, because n independent vectors automatically span it. There is nothing special about the “standard” one.
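Because a basis reaches every point in exactly one way, finding the coefficients for a given point is a linear solve. A minimal sketch, assuming the basis vectors are stored as matrix columns and using a hypothetical target point (3, 2):

```python
import numpy as np

# Two bases of R^2, each written as the columns of a matrix.
standard = np.array([[1.0, 0.0],
                     [0.0, 1.0]])   # columns (1, 0) and (0, 1)
other = np.array([[1.0, 1.0],
                  [0.0, 1.0]])      # columns (1, 0) and (1, 1)

target = np.array([3.0, 2.0])

# Solving B c = target gives the coefficients that build target from basis B.
c_standard = np.linalg.solve(standard, target)
c_other = np.linalg.solve(other, target)

# The dimension of a span is the rank of the matrix whose columns are the vectors.
dim_line = np.linalg.matrix_rank(np.array([[1.0, 2.0],
                                           [2.0, 4.0]]))
dim_plane = np.linalg.matrix_rank(other)
```

The same point has coefficients (3, 2) in the standard basis and (1, 2) in the other basis, since 1·(1, 0) + 2·(1, 1) = (3, 2). `np.linalg.solve` raises `LinAlgError` for a singular matrix, which is what happens when the columns are dependent and therefore not a basis.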

Orthonormal — perpendicular and unit length

A set is orthonormal when its vectors are mutually perpendicular (every pair has dot product 0) and each has length 1. The standard basis (1,0), (0,1) is orthonormal. A fact GATE tests: an orthonormal set is automatically independent. To see why, take the dot product of c₁v₁ + … + cₙvₙ = 0 with any vⱼ: every term vanishes except cⱼ(vⱼ·vⱼ) = cⱼ, so cⱼ = 0. The converse fails: independent vectors need not be perpendicular, as (1,0) and (1,1) show.
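Orthonormality is convenient to check through the Gram matrix VᵀV, whose entries are all the pairwise dot products: the set is orthonormal exactly when that matrix is the identity. A minimal sketch, where the helper name `is_orthonormal` and the tolerance are assumptions:

```python
import numpy as np

def is_orthonormal(vectors, tol=1e-10):
    """Return True when the vectors are mutually perpendicular unit vectors."""
    V = np.column_stack(vectors).astype(float)
    gram = V.T @ V   # entry (i, j) is the dot product v_i · v_j
    return bool(np.allclose(gram, np.eye(V.shape[1]), atol=tol))

theta = np.pi / 6
rotated = [(np.cos(theta), np.sin(theta)), (-np.sin(theta), np.cos(theta))]

standard_ok = is_orthonormal([(1, 0), (0, 1)])
rotated_ok = is_orthonormal(rotated)
skew_ok = is_orthonormal([(1, 0), (1, 1)])

# (1, 0) and (1, 1) are still independent: the rank is 2.
skew_rank = np.linalg.matrix_rank(np.column_stack([(1, 0), (1, 1)]))
```

The rotated pair shows that the standard basis is not the only orthonormal one, and the last pair shows an independent set that fails the orthonormality check.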

A worked example — independent or dependent?

(1) Are (1, 0) and (1, 1) independent, and do they form a basis of R²? (2) Are (1, 2) and (2, 4) independent?

For two vectors in R², the quickest test is the determinant of the 2×2 matrix that has them as its columns (non-zero means independent):

det [1 1]  = 1·1 − 0·1 = 1  ≠ 0   →  (1,0),(1,1) independent → a basis of R²
    [0 1]

det [1 2]  = 1·4 − 2·2 = 0        →  (1,2),(2,4) dependent
    [2 4]

The first pair are independent, and two independent vectors in the 2-dimensional R² automatically span it, so they are a basis — a different basis from the standard (1,0), (0,1). The second pair are dependent because (2, 4) = 2·(1, 2): the second adds no new direction, so they span only a line.
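The two determinants can be checked in NumPy. A short sketch; comparing against zero with `np.isclose` rather than `==` is a design choice made here because floating-point determinants of dependent vectors are not always exactly 0:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # columns (1, 0) and (1, 1)
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # columns (1, 2) and (2, 4)

det_A = np.linalg.det(A)
det_B = np.linalg.det(B)

# Non-zero determinant means the columns are independent.
independent_A = not np.isclose(det_A, 0.0)
independent_B = not np.isclose(det_B, 0.0)
```

For larger or noisier matrices a rank computation is usually a more robust test than the determinant, whose magnitude depends on the scale of the entries.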

A question to carry forward

For a clutch of arrows we just asked “how many point in genuinely new directions?”. A matrix is exactly a stack of column arrows — so the same question applies to it. Here is the thread onward: how many independent directions does a matrix hold, what do we call that number, and what does it instantly tell us about the equation Ax = b?

In one breath

Independent ⇔ c₁v₁+…+cₙvₙ = 0 forces all cᵢ = 0 (no vector is a combination of the others); a non-zero solution means dependent (a redundant direction).
Span = all linear combinations; basis = independent and spanning; dimension = number of basis vectors (every basis has the same count; Rⁿ → n).
A space has many bases: any n independent vectors in an n-dimensional space form one.
Orthonormal (perpendicular + unit length) ⇒ automatically independent, but independent ⇏ orthogonal ((1,0),(1,1) are independent, not perpendicular).
Fast check for n vectors in Rⁿ: form a matrix — det ≠ 0 ⇔ independent (a basis). Any n+1 vectors in Rⁿ are always dependent.
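The last point can be made concrete by finding the non-zero coefficients that prove the dependence. A minimal sketch using the SVD, with three hypothetical vectors in R² chosen for illustration:

```python
import numpy as np

# Three non-zero vectors in R^2 stacked as columns (example values are assumptions).
V = np.array([[1.0, 1.0, 3.0],
              [0.0, 1.0, 2.0]])

# The rank is at most 2 because V has only two rows, so three columns must be dependent.
rank = np.linalg.matrix_rank(V)

# With rank 2 and three columns, the last right-singular vector spans the null space:
# it is a non-zero c with V @ c = 0.
_, _, vt = np.linalg.svd(V)
c = vt[-1]
residual = V @ c
```

Here `c` is a unit-length multiple of (1, 2, −1), reflecting that (3, 2) = 1·(1, 0) + 2·(1, 1).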

Practice

Quick check

Q1 Recall: vectors are linearly independent exactly when the equation c₁v₁ + … + cₙvₙ = 0 has…

Q2 Trace: how many vectors are in any basis of R³ (its dimension)? (numerical answer)

Q3 Trace: are (1, 2) and (3, 6) linearly independent?

Q4 Apply: which statements are TRUE? (select all that apply)

Q5 Apply: which sets form a basis of R²? (select all that apply)

Q6 Create: three non-zero vectors lie in R². What can you conclude about their independence, and why?
