When and how should you trigger model retraining — scheduled vs. event-driven?
Scheduled retraining is simple and predictable but wastes compute when nothing has shifted and reacts slowly when drift is sudden. Event-driven retraining ties compute to evidence — a drift alarm, a performance threshold breach, or a data volume trigger — and is more efficient at scale. Most mature systems combine both.
How to think about it
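The retraining question is really two questions: when to retrain, and how to validate the new model before it goes live. Neither can be skipped: a well-timed retrain is wasted if a worse model gets promoted unchecked, and a careful validation gate cannot help a model that is retrained too late.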
Scheduled retraining
Train nightly, weekly, or monthly on a fixed cadence regardless of drift signals. Simple to implement and easy to audit. Works well for models where the world changes slowly and training is cheap.
Downsides: wastes compute during stable periods; reacts with lag when drift is abrupt (breaking news, a competitor launch, a market shock). With a weekly cadence, a shift that lands just after a training run can go uncorrected for up to seven days.
Event-driven (triggered) retraining
Retrain when a monitored condition is met:
- Drift trigger: Population Stability Index (PSI) above 0.2 on a key feature (a common rule-of-thumb cutoff for a significant shift), or Jensen-Shannon divergence on the model's output distribution exceeding a threshold.
- Performance trigger: rolling accuracy or AUC (when labels arrive) drops below an acceptable floor.
- Data volume trigger: enough new labelled samples have accumulated to meaningfully shift the training distribution.
- Business trigger: an external event (product launch, seasonal spike, regulation change) is flagged by a human operator.
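As a rough illustration of the drift trigger, the sketch below computes PSI for one numeric feature and Jensen-Shannon divergence for two discrete distributions using only the standard library. The function names (`psi`, `js_divergence`, `drift_trigger`), the ten quantile bins, and the 0.1 Jensen-Shannon threshold are assumptions made for this example. Only the 0.2 PSI cutoff comes from the text above.

```python
import math
from bisect import bisect_right


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    ref_sorted = sorted(reference)
    # Interior bin edges sit at the reference quantiles (assumed binning scheme).
    edges = [ref_sorted[int(len(ref_sorted) * i / n_bins)] for i in range(1, n_bins)]

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    p_ref = bin_fractions(reference)
    p_cur = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(p_ref, p_cur))


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (range 0 to 1) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def drift_trigger(psi_value, js_value, psi_threshold=0.2, js_threshold=0.1):
    """Fire when either drift measure exceeds its threshold (js_threshold is illustrative)."""
    return psi_value > psi_threshold or js_value > js_threshold
```

In practice the reference sample would typically be the training window and the current sample a recent serving window. Thresholds are best tuned against past incidents rather than taken from this sketch.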
Event-driven retraining requires robust monitoring infrastructure: if your drift detectors are noisy, you’ll thrash with unnecessary retrains.
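One way to dampen a noisy detector is to require several consecutive breaches before firing, and to enforce a cooldown after each retrain. The sketch below is a minimal version of that idea. The class name `DebouncedTrigger` and its parameters (`consecutive`, `cooldown`) are hypothetical, and the defaults are placeholders rather than recommendations.

```python
class DebouncedTrigger:
    """Fire only after `consecutive` readings above `threshold`, then stay
    quiet for the next `cooldown` readings."""

    def __init__(self, threshold, consecutive=3, cooldown=5):
        self.threshold = threshold
        self.consecutive = consecutive
        self.cooldown = cooldown
        self._streak = 0
        self._since_fire = cooldown  # lets the first streak fire without waiting

    def update(self, value):
        """Feed one monitoring reading; return True if a retrain should start."""
        self._since_fire += 1
        # A single reading at or below the threshold resets the streak.
        self._streak = self._streak + 1 if value > self.threshold else 0
        if self._streak >= self.consecutive and self._since_fire > self.cooldown:
            self._streak = 0
            self._since_fire = 0
            return True
        return False


# Example: an isolated spike is ignored, and a sustained breach fires once.
trigger = DebouncedTrigger(threshold=0.2, consecutive=3, cooldown=5)
readings = [0.25, 0.10, 0.30, 0.30, 0.30, 0.30, 0.30, 0.30]
fired = [trigger.update(r) for r in readings]
```

The trade-off is latency: requiring more consecutive breaches means fewer spurious retrains but a slower reaction to genuine abrupt drift.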
Hybrid approach (production best practice)
Use a minimum scheduled cadence (e.g., monthly) to prevent stale models, plus event-driven triggers that can fire sooner. The schedule acts as a safety net against drift the monitors miss, while the triggers keep compute proportional to need.
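The hybrid policy can be sketched as a single decision function that checks the event-driven signals first and falls back to the maximum model age. Everything named here (`should_retrain`, the reason strings, the 10,000-label default) is an assumption for illustration. The 30-day `max_age` mirrors the monthly example above.

```python
from datetime import datetime, timedelta


def should_retrain(now, last_trained, drift_alarm, performance_breach, new_labels,
                   max_age=timedelta(days=30), min_new_labels=10_000):
    """Return (decision, reason) for one pass of the retraining policy."""
    # Event-driven triggers can fire at any time.
    if drift_alarm:
        return True, "drift"
    if performance_breach:
        return True, "performance"
    if new_labels >= min_new_labels:
        return True, "data_volume"
    # Scheduled fallback: never let the model get older than max_age.
    if now - last_trained >= max_age:
        return True, "scheduled"
    return False, "none"


# Example: no alarms, but the model is 46 days old, so the schedule fires.
decision = should_retrain(
    now=datetime(2024, 3, 1),
    last_trained=datetime(2024, 1, 15),
    drift_alarm=False,
    performance_breach=False,
    new_labels=1_200,
)
```

Returning the reason alongside the decision keeps the audit trail that makes purely scheduled retraining attractive.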
Validating the retrained model before deployment
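- Evaluate the new model on a held-out recent window (not the same window used to detect drift, which would bias the comparison toward the data that triggered the retrain).
- Shadow deploy: route live traffic to both models and compare their predictions without serving the new model’s output.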
- Canary or A/B test: serve a small traffic slice to the new model, gate on business KPI improvement before full rollout.
- Automated champion/challenger: promote only if the challenger exceeds the champion on the evaluation metric by a statistically significant margin.
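A champion/challenger gate could be sketched with a paired bootstrap on per-example correctness: both models are scored on the same held-out examples, and the challenger is promoted only if the lower end of a 95% percentile interval for the accuracy difference is above zero. The paired bootstrap is one reasonable choice of significance test, not the only one. The function name `should_promote` and its defaults are assumptions for this example.

```python
import random


def should_promote(champion_correct, challenger_correct, n_boot=2000, alpha=0.05, seed=0):
    """Paired bootstrap on per-example correctness (1 = correct, 0 = wrong).

    Returns (promote, lower_bound), where lower_bound is the lower end of the
    (1 - alpha) percentile interval for challenger accuracy minus champion accuracy.
    """
    if len(champion_correct) != len(challenger_correct):
        raise ValueError("both models must be scored on the same examples")
    # Per-example difference: +1 challenger-only correct, -1 champion-only correct.
    diffs = [b - a for a, b in zip(champion_correct, challenger_correct)]
    n = len(diffs)
    rng = random.Random(seed)  # fixed seed so the gate is reproducible
    boot_means = sorted(sum(rng.choices(diffs, k=n)) / n for _ in range(n_boot))
    lower = boot_means[int(n_boot * alpha / 2)]
    return lower > 0, lower


# Example: the challenger fixes 20 of the champion's 30 errors on 100 examples.
champion = [1] * 70 + [0] * 30
challenger = [1] * 90 + [0] * 10
promote, lower_bound = should_promote(champion, challenger)
```

Resampling paired differences, rather than each model's scores separately, accounts for the two models being evaluated on the same examples.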