datarekha
MLOps · Medium · Asked at Netflix, Uber, LinkedIn, Airbnb

What is the difference between data drift, concept drift, and label drift — and how do you detect each?

The short answer

Data drift is a change in the statistical distribution of model inputs; concept drift is a change in the relationship between inputs and the target; label drift is a shift in the marginal distribution of the target itself. They require different detectors and carry different business urgency.

How to think about it

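All three are real threats, but only concept drift invalidates the learned mapping itself. Data drift and label drift hurt indirectly: the first pushes the model into regions it rarely saw in training, the second miscalibrates thresholds. Knowing which you have determines what you do next.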

Data drift (covariate shift)

The input distribution P(X) changes. For example, your fraud model starts receiving more transactions from a new geography it rarely saw in training. The mapping from X to Y has not changed; the model would still be correct if it had seen enough of those inputs during training.

Detection: Run statistical tests on input features: the Population Stability Index (PSI), where a value above 0.2 is a common rule of thumb for serious drift; the Kolmogorov-Smirnov (KS) test on continuous features; and the chi-squared test on categorical features. Also track per-feature summary statistics (mean, p50, p95, null rate) in a time-series store and alert on deviations of more than 3 standard deviations from a rolling baseline.
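
As a rough illustration of the two numeric checks above, here is a minimal standard-library sketch. The function names `psi` and `ks_statistic`, the 10 quantile bins, and the `eps` floor for empty bins are assumptions made for this example, not a standard API. In practice a library routine such as `scipy.stats.ks_2samp` would also give you a p-value.

```python
import math
from bisect import bisect_right


def psi(baseline, recent, n_bins=10, eps=1e-6):
    """Population Stability Index for one numeric feature (hypothetical helper)."""
    ordered = sorted(baseline)
    # Interior bin edges at baseline quantiles, so each bin holds
    # roughly equal baseline mass.
    edges = [ordered[int(len(ordered) * i / n_bins)] for i in range(1, n_bins)]

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor empty bins at eps so the log ratio stays finite (an assumption).
        return [max(c / len(values), eps) for c in counts]

    p = bin_fractions(baseline)
    q = bin_fractions(recent)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        for v in a + b
    )
```

A possible use is to compute `psi(train_feature, last_week_feature)` per feature on a schedule and flag any feature above the 0.2 rule of thumb; `train_feature` and `last_week_feature` are placeholder names for two lists of values.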

Concept drift (posterior shift)

The mapping P(Y | X) changes — users’ definition of a “good recommendation” evolves after a product change. Your inputs look similar but the right answer has shifted. This is the one that silently destroys accuracy.

Detection: Requires labels. When labels arrive (even with delay), compute rolling accuracy, AUC, or F1 on a labeled window. Without labels, use proxy signals — click-through rate, downstream business KPIs, or disagreement between model versions on the same inputs.
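
The labeled-window idea can be sketched as a small rolling monitor. This is an illustrative sketch only: the class name `RollingAccuracy`, the window size, the baseline accuracy, and the tolerance are all assumed values you would set from your own validation results.

```python
from collections import deque


class RollingAccuracy:
    """Tracks accuracy over the most recent labeled predictions (hypothetical helper)."""

    def __init__(self, window=500, baseline=0.90, tolerance=0.05):
        self.outcomes = deque(maxlen=window)  # 1 = correct, 0 = wrong
        self.baseline = baseline              # accuracy measured at training time
        self.tolerance = tolerance            # allowed drop before alerting

    def update(self, predicted, actual):
        """Record one prediction once its (possibly delayed) label arrives."""
        self.outcomes.append(1 if predicted == actual else 0)

    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def drifted(self):
        # Only judge once the window is full, to avoid noisy early alerts.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.accuracy() < self.baseline - self.tolerance
```

The same structure works for AUC or F1 if you store scores and labels instead of a correct/wrong flag; accuracy is used here only because it keeps the sketch short.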

Label drift (prior shift)

The target marginal P(Y) changes — positive-class prevalence doubles because of a promotion. Threshold-based decision rules break first.

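Detection: Monitor the distribution of predicted scores (prediction drift) as a leading indicator, for example with a histogram distance between recent predictions and training-time label frequencies. Once labels arrive, compare the observed class frequencies directly against the training-time ones.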
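
One simple histogram distance for this comparison is total variation distance between class frequencies. The sketch below is an assumption-laden example: the helper names are made up, and total variation is one choice among several possible distances. Note that predicted-class frequencies also reflect model error, so this is a proxy until true labels arrive.

```python
from collections import Counter


def class_frequencies(labels, classes):
    """Fraction of `labels` falling in each class, in the order given by `classes`."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in classes]


def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


# Example: training data was 10% positive, recent predictions are 20% positive.
train_labels = [0] * 90 + [1] * 10
recent_predictions = [0] * 80 + [1] * 20
distance = total_variation(
    class_frequencies(train_labels, classes=[0, 1]),
    class_frequencies(recent_predictions, classes=[0, 1]),
)
```

What distance counts as alarming depends on the use case; a threshold would have to be calibrated on historical windows rather than taken from this example.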

[Figure: two panels. Data drift: P(X) at training time vs. P(X) at serving time. Concept drift: original vs. new decision boundary (same X, shifted Y|X boundary).]
Data drift shifts input mass; concept drift moves the decision boundary even when inputs look similar.
