What's the difference between experiment tracking and a model registry, and why do you need both?
Experiment tracking logs every run, its parameters, metrics, and artifacts, so you can compare and reproduce experiments during development. A model registry is the curated, governed catalog of the few models you actually intend to deploy, with versioning, stage or alias management, approvals, and lineage. You need both because tracking gives breadth for exploration while the registry gives the controlled, auditable path to production.
How to think about it
The short answer
Experiment tracking records every run — params, metrics, artifacts — so you can compare and reproduce during R&D. A model registry is the curated, governed catalog of the few models you actually promote, with versioning, alias/stage management, approvals, and lineage. Tracking is breadth for exploration; the registry is the controlled path to production.
Why both
During development you might run hundreds of experiments. You want all of them logged so you can answer “which hyperparameters won?” That’s tracking (MLflow Tracking, Weights & Biases). But you do not want hundreds of candidates fighting for production — you want a small, vetted set with clear ownership, approval state, and a deployable reference. That’s the registry. As the MLflow docs describe, the registry adds versioning, aliases, and collaboration on top of logged runs.
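To make the tracking side concrete, here is a minimal sketch of a tracking store. It is a toy in-memory stand-in, not the API of MLflow or Weights & Biases; the `Run` and `TrackingStore` names, the hyperparameters, and the AUC values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One logged experiment: its parameters, metrics, and artifact paths."""
    run_id: str
    params: dict
    metrics: dict
    artifacts: list = field(default_factory=list)


class TrackingStore:
    """Toy stand-in for a tracking server: it keeps every run, good or bad."""

    def __init__(self):
        self.runs = {}

    def log_run(self, run):
        self.runs[run.run_id] = run

    def best_run(self, metric, higher_is_better=True):
        # Only consider runs that actually logged the metric.
        candidates = [r for r in self.runs.values() if metric in r.metrics]
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda r: r.metrics[metric])


store = TrackingStore()
# Hypothetical sweep: (learning rate, tree depth, resulting AUC).
sweep = [(0.1, 4, 0.91), (0.05, 6, 0.94), (0.01, 8, 0.89)]
for i, (lr, depth, auc) in enumerate(sweep):
    store.log_run(Run(f"run{i}", {"lr": lr, "max_depth": depth}, {"auc": auc}))

winner = store.best_run("auc")
print(winner.run_id, winner.params)
```

Every run stays in the store, including the losers; answering "which hyperparameters won?" is just a query over that history. Nothing here says which run is allowed into production, which is the gap the registry fills.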
Concrete example
A data scientist logs 200 runs to the tracking server. The best one, run abc123, is registered as fraud-model v23, and the alias champion is pointed at that version. Serving infra now references models:/fraud-model@champion, ops can see who approved it, and you can trace v23 back to run abc123, its data hash, and its git SHA. The other 199 runs stay in tracking history but never touch prod.
How they connect
The link is the run ID: a registered model version points back to the exact tracking run that produced it. That gives you end-to-end lineage from a deployed prediction back to the experiment and, as long as the run logged them, to the data and code as well.
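The lineage walk for the fraud-model example can be sketched as a pair of lookups. This is a toy model using plain dictionaries, not a real registry client; the `Run`, `ModelVersion`, and `trace` names, the second run def456, the reviewer name, and the placeholder hash values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    run_id: str
    data_hash: str   # fingerprint of the training data, logged with the run
    git_sha: str     # commit of the training code, logged with the run


@dataclass(frozen=True)
class ModelVersion:
    name: str
    version: int
    run_id: str      # the link back to tracking
    approved_by: str


# Tracking side: every run is kept. Hash values are placeholders.
runs = {
    "abc123": Run("abc123", data_hash="<data-hash>", git_sha="<git-sha>"),
    "def456": Run("def456", data_hash="<data-hash>", git_sha="<other-sha>"),
}

# Registry side: only promoted versions, plus the aliases that serving resolves.
versions = {
    ("fraud-model", 23): ModelVersion("fraud-model", 23, "abc123", "ops-reviewer"),
}
aliases = {("fraud-model", "champion"): 23}


def trace(name, alias):
    """Walk alias -> model version -> run to recover what produced the live model."""
    version = versions[(name, aliases[(name, alias)])]
    run = runs[version.run_id]
    return {
        "version": version.version,
        "approved_by": version.approved_by,
        "run_id": run.run_id,
        "data_hash": run.data_hash,
        "git_sha": run.git_sha,
    }


lineage = trace("fraud-model", "champion")
print(lineage)
```

The only field joining the two stores is `run_id`. If a run never recorded its data hash or git SHA, the walk would still reach the run but stop there.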
Common follow-up / trap
Interviewers ask: “Can’t you just use tracking and pick the best run at deploy time?” You can for a toy project, but you lose governance — no approval gates, no stable alias for rollback, no clear audit of what’s live. The trap is treating them as interchangeable. The crisp framing: tracking optimizes for discovery, the registry optimizes for control and accountability.
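What a tracking-only setup gives up can be shown with a small sketch of an approval gate and an alias-based rollback. It is a toy in-memory registry, not a real product's API; the `Registry` class, its method names, and the earlier version 22 are hypothetical.

```python
class Registry:
    """Toy registry: versions carry an approval flag; aliases point at versions."""

    def __init__(self):
        self.approved = {}   # version number -> bool
        self.aliases = {}    # alias -> version number

    def register(self, version, approved=False):
        self.approved[version] = approved

    def approve(self, version):
        self.approved[version] = True

    def set_alias(self, alias, version):
        # Approval gate: an unapproved version can never become the live alias.
        if not self.approved.get(version, False):
            raise PermissionError(f"version {version} is not approved")
        self.aliases[alias] = version


registry = Registry()
registry.register(22, approved=True)   # hypothetical previous production version
registry.register(23)                  # new candidate, not yet approved

blocked = None
try:
    registry.set_alias("champion", 23)   # blocked: v23 has not been approved
except PermissionError as err:
    blocked = str(err)

registry.approve(23)
registry.set_alias("champion", 23)       # promote
registry.set_alias("champion", 22)       # roll back by repointing the alias
print(registry.aliases["champion"])
```

Serving code keeps resolving the same alias throughout, so promotion and rollback are one-line registry changes rather than redeployments, and the alias table is the record of what is live.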