What are MAP and NDCG, and when would you use each for evaluating a ranking system?
MAP (Mean Average Precision) is the mean, across queries, of Average Precision (AP): the average of the precision values at each rank where a relevant item appears, which approximates the area under the precision-recall curve. NDCG (Normalized Discounted Cumulative Gain) accounts for graded relevance and applies a position discount, so a relevant item at rank 1 is worth more than one at rank 10. Use MAP when relevance is binary and every relevant result matters equally; use NDCG when items have graded relevance or when top-of-list quality matters more than tail coverage.
How to think about it
Define both metrics from first principles, contrast their assumptions, and give a concrete worked example.
MAP: Mean Average Precision
Average Precision (AP) for a single query averages the precision measured at each rank where a relevant item appears, so rankings that place relevant items early score higher.
For a query with relevant items at ranks r_1, r_2, …, r_k out of N retrieved:
AP = (1 / R) * sum_{i=1}^{k} Precision@r_i
Where R is the total number of relevant items (including those not retrieved, so k ≤ R and each missed relevant item contributes zero to the sum), and Precision@r_i is the precision at the rank where the i-th relevant item appears.
MAP = average of AP across all queries.
Example. Ranked list for a query with 3 relevant items: [R, N, R, N, N, R] (R = relevant, N = not).
- Relevant at position 1: Precision@1 = 1/1 = 1.0
- Relevant at position 3: Precision@3 = 2/3 = 0.67
- Relevant at position 6: Precision@6 = 3/6 = 0.5
AP = (1.0 + 0.67 + 0.5) / 3 = 0.72
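The arithmetic above can be checked with a few lines of plain Python. This is a minimal sketch, not a reference implementation: the function names `average_precision` and `mean_average_precision`, the boolean-list input format, and the choice to return 0 for a query with no relevant items are assumptions made for illustration.

```python
from typing import Optional, Sequence


def average_precision(ranked: Sequence[bool], total_relevant: Optional[int] = None) -> float:
    """AP for one query; ranked[i] is True if the item at rank i + 1 is relevant."""
    # R defaults to the number of relevant items in the list. Pass it explicitly
    # when some relevant items were never retrieved.
    r = sum(ranked) if total_relevant is None else total_relevant
    if r == 0:
        return 0.0  # assumed convention for queries with no relevant items
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # Precision@rank at this relevant item
    return precision_sum / r


def mean_average_precision(queries: Sequence[Sequence[bool]]) -> float:
    """MAP: the unweighted mean of AP over all queries."""
    return sum(average_precision(q) for q in queries) / len(queries)


example = [True, False, True, False, False, True]  # [R, N, R, N, N, R]
ap = average_precision(example)
print(f"{ap:.2f}")  # 0.72
```

Passing `total_relevant=4` for the same list would lower the score, since a relevant item that was never retrieved adds nothing to the sum but still counts in R.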
NDCG: Normalized Discounted Cumulative Gain
NDCG handles graded relevance (e.g., scores 0, 1, 2, 3) and applies a logarithmic position discount, reflecting that users rarely scroll past the first few results.
DCG@k = sum_{i=1}^{k} (2^rel_i - 1) / log2(i + 1)
NDCG@k = DCG@k / IDCG@k
Where IDCG@k is the DCG@k of the ideal ranking (items sorted by descending grade). NDCG ranges from 0 to 1, and a value of 1 indicates a perfect ranking.
Example. Relevance grades: [3, 2, 3, 0, 1, 2] at positions 1–6.
| Position i | Grade | Gain 2^grade - 1 | Discount 1/log2(i+1) | Contribution (gain × discount) |
|---|---|---|---|---|
| 1 | 3 | 7 | 1.000 | 7.00 |
| 2 | 2 | 3 | 0.631 | 1.89 |
| 3 | 3 | 7 | 0.500 | 3.50 |
| 4 | 0 | 0 | 0.431 | 0.00 |
| 5 | 1 | 1 | 0.387 | 0.39 |
| 6 | 2 | 3 | 0.356 | 1.07 |
DCG@6 = 13.85. The ideal order is [3, 3, 2, 2, 1, 0], which gives IDCG@6 = 14.60, so NDCG@6 = 13.85 / 14.60 ≈ 0.949.
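The same numbers can be reproduced in code. The sketch below uses only the standard library and follows the exponential-gain formula given above. The names `dcg_at_k` and `ndcg_at_k` are illustrative, and it assumes the ranked list contains every judged item, so the ideal ranking is the same grades sorted in descending order.

```python
import math
from typing import Optional, Sequence


def dcg_at_k(grades: Sequence[int], k: Optional[int] = None) -> float:
    """DCG@k with gain 2^rel - 1 and discount 1 / log2(i + 1), i starting at 1."""
    top = grades if k is None else grades[:k]
    return sum((2 ** g - 1) / math.log2(i + 1) for i, g in enumerate(top, start=1))


def ndcg_at_k(grades: Sequence[int], k: Optional[int] = None) -> float:
    """NDCG@k: DCG@k divided by the DCG@k of the ideal ordering."""
    # Assumption: the ideal ordering is built from the same list of grades.
    ideal = sorted(grades, reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(grades, k) / idcg if idcg > 0 else 0.0


grades = [3, 2, 3, 0, 1, 2]
dcg = dcg_at_k(grades, k=6)
idcg = dcg_at_k(sorted(grades, reverse=True), k=6)
ndcg = ndcg_at_k(grades, k=6)
print(f"{dcg:.2f} {idcg:.2f} {ndcg:.3f}")  # 13.85 14.60 0.949
```

If relevant items exist outside the retrieved list, the ideal ranking should be built from all judged items for the query, not only the retrieved ones. Otherwise NDCG is overstated.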
When to use each
| Situation | Prefer |
|---|---|
| Binary relevance, every relevant item equally important | MAP |
| Graded relevance (star ratings, engagement scores) | NDCG |
| Top-k ranking quality more important than tail | NDCG@k (small k) |
| Retrieval tasks where recall matters and missed relevant items should be penalized | MAP |
| Recommendation systems and search ranking, where users see only the first few results | NDCG@10 or NDCG@5 |
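A small experiment illustrates the first two rows of the table. In this sketch, two hypothetical rankings contain the same graded items in a different order. Binarizing the grades (treating any grade above 0 as relevant, an assumption made here for illustration) gives both rankings the same AP, while NDCG separates them. The helper names are illustrative.

```python
import math
from typing import Sequence


def average_precision(relevant: Sequence[bool]) -> float:
    """AP over a binary relevance list, with R taken as the relevant items present."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevant, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def ndcg(grades: Sequence[int]) -> float:
    """NDCG over the full list with gain 2^rel - 1 and a log2(i + 1) discount."""
    def dcg(gs: Sequence[int]) -> float:
        return sum((2 ** g - 1) / math.log2(i + 1) for i, g in enumerate(gs, start=1))

    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal else 0.0


ranking_a = [3, 1, 0]  # best item first
ranking_b = [1, 3, 0]  # best item second

for name, grades in (("A", ranking_a), ("B", ranking_b)):
    binary = [g > 0 for g in grades]  # assumed binarization threshold
    print(name, f"AP={average_precision(binary):.2f}", f"NDCG={ndcg(grades):.2f}")
# A AP=1.00 NDCG=1.00
# B AP=1.00 NDCG=0.71
```

Once grades are collapsed to relevant or not, AP cannot tell the two orderings apart. That is the practical reason to prefer NDCG when graded judgments are available.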