GATE DA 2026 — Solved Walkthrough
A guided walk through representative solved problems from GATE DA 2026 — one per subject, each worked to its verified answer and linked to the lesson that teaches it.
The 2026 paper rewarded the same habit every GATE DA paper does: recognise the kind of question, reach for the one formula or rule that cracks it, and substitute carefully. Below, the problems are grouped by subject so you can see how each area was tested.
Probability & Statistics
A disease affects 30% of a population. A test detects it correctly 80% of the time (sensitivity), with a 10% false-positive rate on healthy people. A person tests positive — what is the probability they actually have the disease?
This is a textbook Bayes inversion: the number you can measure is P(+ | disease), but
the number you want is P(disease | +). Build the evidence denominator by total
probability, then divide.
P(D) = 0.30, P(+|D) = 0.80, P(+|¬D) = 0.10, P(¬D) = 0.70
P(D | +) = P(+|D)·P(D) / [ P(+|D)·P(D) + P(+|¬D)·P(¬D) ]
         = (0.80 · 0.30) / (0.80 · 0.30 + 0.10 · 0.70)
         = 0.24 / 0.31 ≈ 0.77
Answer: ≈ 0.77. Because the disease was common (30%) to begin with, a positive test is genuinely convincing — base rates dominate, and here the base rate is high.
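The same inversion can be checked numerically. This is a minimal sketch; the function name `posterior` and its parameter names are illustrative choices, not part of the exam problem.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prior                     # P(+|D) * P(D)
    false_pos = false_positive_rate * (1.0 - prior)    # P(+|not D) * P(not D)
    return true_pos / (true_pos + false_pos)           # divide by total P(+)

p = posterior(prior=0.30, sensitivity=0.80, false_positive_rate=0.10)
print(round(p, 2))  # 0.77
```

Lowering `prior` while keeping the other two arguments fixed shows how quickly the posterior falls when the disease is rare.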
→ Taught in Bayes’ Theorem
Linear Algebra
Let A be a symmetric matrix with eigenvalues 5 and 2. Find the maximum of the
quadratic form xᵀAx over all unit vectors (‖x‖ = 1).
Write x in A’s orthonormal eigenbasis with coordinates y₁, y₂ where
y₁² + y₂² = 1. Then xᵀAx = 5·y₁² + 2·y₂², a weighted average of the eigenvalues
with weights summing to 1. That average is largest when all the weight sits on the
biggest eigenvalue.
max over ‖x‖ = 1 of xᵀAx = largest eigenvalue = 5
(the minimum is the smallest eigenvalue, 2)
Answer: 5. The maximum of a quadratic form on the unit sphere is always the largest eigenvalue — exactly the variance PCA maximises along its first component.
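A quick numerical check of this result, under an assumption: the problem gives only the eigenvalues, so the 2×2 matrix `A` below is a hypothetical symmetric matrix chosen to have eigenvalues 5 and 2. The sketch sweeps unit vectors around the circle and records the extreme values of the quadratic form.

```python
import math

# Hypothetical symmetric matrix with eigenvalues 5 and 2
# (trace 7, determinant 10); not given in the original problem.
A = [[3.5, 1.5],
     [1.5, 3.5]]

def quad_form(x):
    """Return x^T A x for a 2-vector x."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

# Unit vectors x = (cos t, sin t); half a turn suffices because
# x and -x give the same value of the quadratic form.
steps = 3600
values = [
    quad_form((math.cos(math.pi * k / steps), math.sin(math.pi * k / steps)))
    for k in range(steps)
]

q_max, q_min = max(values), min(values)
print(round(q_max, 6), round(q_min, 6))  # approximately 5 and 2
```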
→ Taught in Quadratic Forms
Calculus & Optimization
Perform one step of gradient descent. The current weight is w = 10, the gradient of
the loss at this point is 10, and the learning rate is η = 0.1. What is the updated
weight?
The whole task is one substitution into the descent rule — subtract η times the
gradient.
w_new = w − η · (∂L/∂w)
= 10 − 0.1 × 10
= 10 − 1
= 9.0
Answer: 9.0. The minus sign is the trap: gradient descent subtracts the gradient (adding it is gradient ascent, which climbs the loss).
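The update rule above as a one-line function. This is a sketch; `gradient_descent_step` is an illustrative name.

```python
def gradient_descent_step(w, grad, lr):
    """One descent update: move against the gradient, scaled by the learning rate."""
    return w - lr * grad

w_new = gradient_descent_step(w=10.0, grad=10.0, lr=0.1)
print(w_new)  # 9.0
```

Flipping the sign to `w + lr * grad` gives 11.0, the gradient-ascent answer that usually appears among the distractors.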
→ Taught in Gradient Descent (One Step)
Programming & DSA
Consider this function. What do f(1), then f(2), then f(3, []) return?
def f(val, lst=[]):
    lst.append(val)
    return lst
The default list is created once, when the def executes — not fresh on each call.
So every call that omits lst shares the same list, and it accumulates:
f(1): appends 1 to the shared default → [1].
f(2): appends 2 to that same shared list, still holding 1 → [1, 2].
f(3, []): the caller passes its own fresh list, bypassing the default → [3].
Answer: [1], then [1, 2], then [3]. The jump from [1] to [1, 2] — with
nothing seeming to carry the 1 forward — is the mutable-default trap.
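The behaviour can be reproduced directly, and the usual fix is a `None` sentinel. In this sketch `f_safe` is an illustrative name for the repaired version; each result is copied with `list(...)` because `f` returns the shared list itself, which later calls keep mutating.

```python
def f(val, lst=[]):          # the default list is built once, at def time
    lst.append(val)
    return lst

def f_safe(val, lst=None):   # common idiom: create a fresh list per call
    if lst is None:
        lst = []
    lst.append(val)
    return lst

first = list(f(1))        # snapshot: [1]
second = list(f(2))       # snapshot: [1, 2]
third = list(f(3, []))    # snapshot: [3]
print(first, second, third)   # [1] [1, 2] [3]
print(f.__defaults__)         # ([1, 2],)  the shared default, still accumulating
print(f_safe(1), f_safe(2))   # [1] [2]
```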
→ Taught in Functions, Scope & the Mutable-Default Trap
Databases & Warehousing
A relation R is decomposed into fragments. One functional dependency now has its
left-hand side in one fragment and its right-hand side in another. What is the
consequence for enforcing that dependency?
When a single FD straddles two fragments, neither piece holds enough columns to check it on its own. To validate the dependency on every insert or update you must join the fragments back together first — the decomposition is not dependency-preserving.
Answer: the FD can no longer be checked on a single fragment; you must reconstruct the original relation by joining the pieces to enforce it. That join cost on every update is exactly why dependency preservation is a design goal, not a nicety.
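To make this concrete, here is a sketch of the standard dependency-preservation test. The exam question names no schema, so the relation R(A, B, C), its FDs A → B and B → C, and both decompositions are hypothetical examples; `closure` and `is_preserved` are illustrative names.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_preserved(fd, fragments, fds):
    """Can fd be enforced using only checks local to single fragments?"""
    lhs, rhs = fd
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for frag in fragments:
            # Attributes derivable while staying inside this fragment.
            gained = closure(z & frag, fds) & frag
            if not gained <= z:
                z |= gained
                changed = True
    return rhs <= z

# Hypothetical schema: R(A, B, C) with A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
split_bad = [{"A", "B"}, {"A", "C"}]    # B and C land in different fragments
split_good = [{"A", "B"}, {"B", "C"}]

print(is_preserved(fds[1], split_bad, fds))   # False: B -> C needs a join
print(is_preserved(fds[1], split_good, fds))  # True: B -> C lives in (B, C)
```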
→ Taught in Decomposition & Lossless Joins
Machine Learning
(1) PCA reduces 100 dimensions to 10. What is the angle (in degrees) between principal components PC1 and PC10?
Principal components are the eigenvectors of a symmetric covariance matrix, so they are mutually orthonormal — every distinct pair is perpendicular. No computation is needed.
Answer: 90°.
(2) A fully-connected MLP has architecture 30 → 4 → 3 → 1 with NO bias terms. How
many learnable parameters does it have?
Without bias, a layer from a units to b units contributes a·b weights. Sum over
the three layer transitions:
weights = 30·4 + 4·3 + 3·1
= 120 + 12 + 3
= 135
Answer: 135. Read the bias condition carefully: with biases, each layer would add one parameter per unit it feeds into, i.e. 4 + 3 + 1 = 8 more, for 143 in total.
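The count generalises to any layer list. A small sketch, with `count_parameters` as an illustrative name:

```python
def count_parameters(layer_sizes, bias=False):
    """Learnable parameters of a fully-connected net with the given layer widths."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out      # weight matrix between consecutive layers
        if bias:
            total += fan_out           # one bias per receiving unit
    return total

print(count_parameters([30, 4, 3, 1]))             # 135
print(count_parameters([30, 4, 3, 1], bias=True))  # 143
```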
(3) As the regularization coefficient λ in ridge regression increases, how do bias
and variance change?
Larger λ squeezes the weights toward zero, making the model simpler and stiffer. A
stiffer model varies less from one training sample to the next (variance down) but
systematically misses structure (bias up).
Answer: bias increases, variance decreases.
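The shrinkage behind this trade-off is easy to see in the simplest case. For one feature and no intercept, minimising Σ(y − w·x)² + λ·w² gives w = Σxy / (Σx² + λ). The sketch below uses made-up data, and it shows only the weight shrinking as λ grows, not a full bias-variance estimate.

```python
def ridge_weight(xs, ys, lam):
    """Closed-form ridge solution for a single feature with no intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Hypothetical data lying roughly on y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

weights = [ridge_weight(xs, ys, lam) for lam in (0.0, 1.0, 10.0, 100.0)]
for lam, w in zip((0.0, 1.0, 10.0, 100.0), weights):
    print(lam, round(w, 3))   # the weight moves toward zero as lam increases
```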
(4) Points P1 = (2, 3, −1), P2 = (3, 1, 1), P3 = (5, −2, 3), P4 = (3, 3, 3).
Using Manhattan (L1) distance, which pair merges first in agglomerative clustering?
With every point its own cluster, the first merge is simply the closest pair. Compute all six pairwise L1 distances:
d(P2, P4) = |3−3| + |1−3| + |1−3| = 0 + 2 + 2 = 4 ← smallest
d(P1, P2) = |2−3| + |3−1| + |−1−1| = 1 + 2 + 2 = 5
d(P1, P4) = |2−3| + |3−3| + |−1−3| = 1 + 0 + 4 = 5
d(P2, P3) = |3−5| + |1−(−2)| + |1−3| = 2 + 3 + 2 = 7
d(P3, P4) = |5−3| + |−2−3| + |3−3| = 2 + 5 + 0 = 7
d(P1, P3) = |2−5| + |3−(−2)| + |−1−3| = 3 + 5 + 4 = 12
Answer: P2 and P4 merge first (distance 4).
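The pairwise search can be automated. A short sketch using the four points from the question; the names `manhattan` and `first_merge` are illustrative.

```python
from itertools import combinations

points = {
    "P1": (2, 3, -1),
    "P2": (3, 1, 1),
    "P3": (5, -2, 3),
    "P4": (3, 3, 3),
}

def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(u - v) for u, v in zip(a, b))

# Distance for every unordered pair of points.
distances = {
    (p, q): manhattan(points[p], points[q])
    for p, q in combinations(points, 2)
}

first_merge = min(distances, key=distances.get)
print(first_merge, distances[first_merge])  # ('P2', 'P4') 4
```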
→ Taught in PCA & Dimensionality Reduction, Multi-Layer Perceptron & Activations, Ridge Regression & Regularization, and Hierarchical Clustering & Linkage
Artificial Intelligence
(1) MAX has three strategies. Each leads to a MIN node with three leaf utilities:
strategy 1 → [8, 6, −1], strategy 2 → [1, 5, 7], strategy 3 → [−4, −3, −12].
Which strategy should MAX play?
Back up MIN at each strategy node (MIN takes the smallest leaf), then MAX picks the largest of those.
strategy 1: min(8, 6, −1) = −1
strategy 2: min(1, 5, 7) = 1
strategy 3: min(−4, −3, −12) = −12
V(root) = max(−1, 1, −12) = 1 → strategy 2
Answer: strategy 2 (value 1). Strategy 1 has the highest single leaf (8), but MIN
will never let MAX reach it — it steers to the −1.
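The two-level backup takes a few lines of code. A sketch using the leaf utilities from the question; the variable names are illustrative.

```python
# Leaf utilities under each of MAX's strategies (from the question).
strategies = {
    1: [8, 6, -1],
    2: [1, 5, 7],
    3: [-4, -3, -12],
}

# MIN backs up the smallest leaf at each strategy node.
min_values = {s: min(leaves) for s, leaves in strategies.items()}

# MAX then picks the strategy with the largest backed-up value.
best = max(min_values, key=min_values.get)
print(min_values)              # {1: -1, 2: 1, 3: -12}
print(best, min_values[best])  # 2 1
```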
(2) Which first-order-logic formula correctly captures “Every king is a person”?
Under a universal quantifier, the connective is implication, not conjunction.
∀x King(x) ⇒ Person(x)
The conjunction version ∀x King(x) ∧ Person(x) wrongly claims every object is both
a king and a person.
Answer: ∀x King(x) ⇒ Person(x).
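Evaluating both formulas over a small finite domain shows the difference. The three-object domain below is a hypothetical example, not part of the question.

```python
# Hypothetical domain: one king, one non-royal person, one non-person.
domain = [
    {"name": "Arthur", "king": True, "person": True},
    {"name": "Alice", "king": False, "person": True},
    {"name": "rock", "king": False, "person": False},
]

# Forall x: King(x) => Person(x)   (implication written as: not A or B)
implication = all((not x["king"]) or x["person"] for x in domain)

# Forall x: King(x) and Person(x)  (demands that every object be a king)
conjunction = all(x["king"] and x["person"] for x in domain)

print(implication, conjunction)  # True False
```

Every king in this domain is a person, so the implication form is true, while the conjunction form is false as soon as a single non-king exists.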
(3) Over a non-empty domain, which quantifier-implications are valid (true under every
interpretation): (i) ∀x P(x) ⇒ ∃x P(x), (ii) ∃x P(x) ⇒ ∀x P(x), (iii)
∃x P(x) ⇔ ∀x P(x), (iv) ∀x P(x) ⇒ ∃x ¬P(x)?
Only (i) holds: if P is true for everything, pick any one object (the domain is
non-empty) to witness “there exists.” (ii) and (iii) fail when some but not all objects
satisfy P; (iv) fails precisely when ∀x P(x) is true, since then ∃x ¬P(x) is
false.
Answer: only (i) is valid.
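A brute-force search over small domains finds the counterexamples. This sketch enumerates every interpretation of P on domains of size 1 to 3; exhausting small domains can refute validity but cannot prove it, so the result for (i) is a consistency check rather than a proof.

```python
from itertools import product

def formulas(p_values):
    """Truth of (i)-(iv) when P takes the given values on the domain."""
    forall_p = all(p_values)
    exists_p = any(p_values)
    exists_not_p = any(not v for v in p_values)
    return {
        "i": (not forall_p) or exists_p,
        "ii": (not exists_p) or forall_p,
        "iii": exists_p == forall_p,
        "iv": (not forall_p) or exists_not_p,
    }

holds_everywhere = {name: True for name in ("i", "ii", "iii", "iv")}
for size in range(1, 4):                               # non-empty domains only
    for p_values in product([False, True], repeat=size):
        for name, truth in formulas(p_values).items():
            holds_everywhere[name] = holds_everywhere[name] and truth

print(holds_everywhere)
# {'i': True, 'ii': False, 'iii': False, 'iv': False}
```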
(4) Which statement is equivalent to “X entails Y” (X ⊨ Y)?
Entailment means every model of X is a model of Y — equivalently, no assignment
makes X true and Y false at once.
X ⊨ Y ⇔ X ∧ ¬Y is unsatisfiable ⇔ X → Y is valid
Answer: X ⊨ Y if and only if X ∧ ¬Y is unsatisfiable.
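For propositional formulas the equivalence can be checked by truth table. In this sketch the formulas X = p ∧ q and Y = p are hypothetical examples, and `entails` and `unsatisfiable` are illustrative helper names.

```python
from itertools import product

def assignments(variables):
    """Yield every truth assignment over the given variable names."""
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))

def entails(x, y, variables):
    """X entails Y: every model of X is also a model of Y."""
    return all(y(m) for m in assignments(variables) if x(m))

def unsatisfiable(formula, variables):
    """True when no assignment makes the formula true."""
    return not any(formula(m) for m in assignments(variables))

# Hypothetical example formulas.
X = lambda m: m["p"] and m["q"]
Y = lambda m: m["p"]
variables = ["p", "q"]

print(entails(X, Y, variables))                                # True
print(unsatisfiable(lambda m: X(m) and not Y(m), variables))   # True
print(entails(Y, X, variables))                                # False
```

The first two results agree, as the equivalence requires; the third shows that entailment does not run in the reverse direction for this pair.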
→ Taught in Adversarial Search: Minimax, First-Order & Predicate Logic, and Propositional Logic