MLOps · Medium · Asked at Google, Netflix, Airbnb, LinkedIn, Meta

What is the difference between shadow deployment and canary deployment for ML models, and when do you use each?

For: MLOps Engineer, ML Engineer, AI / LLM Engineer

The short answer

Shadow deployment mirrors live traffic to the new model and logs its predictions without ever serving them, so you can evaluate performance and load without any user impact. Canary deployment routes a small slice of real traffic to the new model and serves its predictions, so real user impact is possible but limited and monitored.

How to think about it

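Shadow deployment duplicates every incoming request to both the current champion model and the new challenger. The challenger’s response is logged but never returned to the user. This lets you measure prediction distributions, latency, error rates, and resource consumption under real production load without changing what any user sees, provided the challenger runs off the request path so that its failures or slowness cannot delay the champion’s response. Because users never see the challenger’s output, shadow mode cannot measure business metrics that depend on user reaction, such as click-through; those need a canary. Shadow is the standard step before a canary when the new model is a significant change: a different architecture, a different feature set, or the first deployment of any model.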
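
The mirroring described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pattern: `ShadowRouter`, its `records` list (a stand-in for a real log sink), and the assumption that both models are plain callables are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class ShadowRouter:
    """Serve the champion; mirror each request to the challenger off the hot path."""

    def __init__(self, champion, challenger, max_workers=4):
        self.champion = champion        # callable: features -> prediction
        self.challenger = challenger    # callable: features -> prediction
        self.executor = ThreadPoolExecutor(max_workers=max_workers)
        self.records = []               # hypothetical stand-in for a log sink

    def _shadow(self, features, champion_pred):
        # Challenger failures are recorded, never propagated to the user.
        try:
            pred, error = self.challenger(features), None
        except Exception as exc:
            pred, error = None, repr(exc)
        self.records.append({
            "features": features,
            "champion": champion_pred,
            "challenger": pred,
            "error": error,
        })

    def predict(self, features):
        champion_pred = self.champion(features)
        # Fire-and-forget: the response does not wait on the challenger.
        self.executor.submit(self._shadow, features, champion_pred)
        return champion_pred  # only the champion's output is ever returned
```

The key design choice is that `predict` returns before the challenger finishes, so a slow or crashing challenger shows up in the log rather than in user-facing latency or errors.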

Canary deployment routes a small percentage of real traffic (typically 1–5 %) to the new model and serves its predictions to actual users. The remaining traffic continues to the champion. Business metrics (click-through, conversion, revenue) and technical metrics (p99 latency, error rate) are monitored on both slices and compared. If the canary slice degrades, its traffic weight is set back to 0 % immediately, which is a routing change rather than a code change.
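
One common way to implement the traffic split is to hash a stable identifier into a bucket, so that a given user consistently sees the same model and rollback is just a change to one number. The sketch below assumes requests carry a user ID; `bucket` and `route` are hypothetical names, and in practice the split is often done by a load balancer or service mesh rather than in application code.

```python
import hashlib


def bucket(user_id: str, buckets: int = 10_000) -> int:
    """Map a user ID to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def route(user_id: str, canary_percent: float) -> str:
    """Return 'canary' for roughly canary_percent % of users, else 'champion'."""
    # With 10,000 buckets, each bucket is 0.01 % of traffic.
    return "canary" if bucket(user_id) < canary_percent * 100 else "champion"
```

Setting `canary_percent` to 0 sends every user to the champion, and raising it only adds users to the canary slice: anyone routed to the canary at 1 % stays there at 5 %.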

Shadow mirrors traffic silently; canary routes a real slice and affects real users.

Typical promotion path: Shadow → Canary (1 %) → Canary (10 %) → Full rollout.
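
The promotion path can be read as a small state machine in which each stage has a gate: pass and move forward, fail and return all traffic to the champion. The following is a sketch only; the stage names and the `advance` helper are hypothetical, and what counts as a passed gate is left to the caller.

```python
# Stages mirror the promotion path in the text; the names are illustrative.
STAGES = ["shadow", "canary-1%", "canary-10%", "full"]
ROLLED_BACK = "rolled-back"


def advance(stage: str, gate_passed: bool) -> str:
    """Move one stage forward if the gate passed; otherwise roll back."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    if not gate_passed:
        return ROLLED_BACK
    index = STAGES.index(stage)
    # The final stage is terminal: advancing from "full" stays at "full".
    return STAGES[min(index + 1, len(STAGES) - 1)]
```

For example, `advance("shadow", True)` yields `"canary-1%"`, while a failed gate at any stage yields `"rolled-back"`.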

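# Argo Rollouts canary strategy (fragment of a Rollout spec; weights are illustrative)
strategy:
  canary:
    steps:
      - setWeight: 5              # send 5 % of traffic to the new model
      - pause: {duration: 30m}    # hold for 30 minutes while metrics accumulate
      - setWeight: 25
      - pause: {duration: 1h}
      - setWeight: 100            # full rollout
    analysis:                     # background analysis that runs during the rollout
      templates:
        - templateName: model-error-rate   # name of an AnalysisTemplate defined separately
      args:
        - name: threshold         # argument passed to the template
          value: "0.01"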
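
The decision that an error-rate analysis encodes can be written out in plain Python. This is a conceptual sketch, not how Argo Rollouts evaluates an AnalysisTemplate (that is done by querying a metrics provider); `canary_verdict` and the `min_requests` guard are assumptions added for illustration, with the 0.01 threshold taken from the config above.

```python
def canary_verdict(errors: int, requests: int,
                   threshold: float = 0.01, min_requests: int = 500) -> str:
    """Classify the canary slice as 'pass', 'fail', or 'inconclusive'."""
    # Too few requests: the observed rate is mostly noise, so do not decide yet.
    if requests < min_requests:
        return "inconclusive"
    error_rate = errors / requests
    return "pass" if error_rate <= threshold else "fail"
```

The sample-size guard matters at low canary weights: at 1 % of traffic, a handful of errors in the first few minutes can swing the observed rate well past the threshold without reflecting a real regression.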

