MLOps interview questions
23 of the most common MLOps questions for data and AI interviews — each with a worked answer, the trap to avoid, and a link to learn it properly. Serving, monitoring, drift, CI/CD, reproducibility.
- What does experiment tracking solve, and how do MLflow and Weights and Biases differ in practice? Easy ·OpenAI·Cohere·Databricks
- Walk me through the full ML lifecycle from problem definition to model retirement. Easy ·Google·Amazon·Microsoft
- How does autoscaling work for ML inference services, and what metrics should drive it? Medium ·Google·Amazon·Uber
- What are the differences between batch, online, and streaming inference, and when should you use each? Medium ·Netflix·Uber·LinkedIn
- How does CI/CD for ML differ from standard software CI/CD, and what stages should an ML pipeline include? Medium ·Google·Spotify·Airbnb
- What is the difference between data drift, concept drift, and label drift — and how do you detect each? Medium ·Netflix·Uber·LinkedIn
- How do Docker and ONNX complement each other for packaging and deploying ML models portably? Medium ·Microsoft·Amazon·Meta
- What is a model registry, and how does model versioning work in production ML systems? Medium ·Databricks·Netflix·Airbnb
- How do you safely roll back a model in production, and what triggers a rollback? Medium ·Netflix·Lyft·Twitter
- What are the security and compatibility risks of using pickle for model serialization, and what are the safer alternatives? Medium ·Amazon·Google·Hugging Face
- What metrics should you monitor for a production ML model, and at what layer? Medium ·Uber·Lyft·Spotify
- How do you achieve reproducibility in ML training pipelines — covering seeds, environment, and data versioning? Medium ·DeepMind·OpenAI·Meta
- When would you choose gRPC over REST for model serving, and what are the practical trade-offs? Medium ·Google·Netflix·Lyft
- What is the difference between shadow deployment and canary deployment for ML models, and when do you use each? Medium ·Google·Netflix·Airbnb
- What is a feature store and why is it critical for production ML systems? Medium ·Uber·LinkedIn·Airbnb
- When and how should you trigger model retraining — scheduled vs. event-driven? Medium ·Airbnb·DoorDash·Netflix
- Why does a model that performed well in offline evaluation degrade in production? Medium ·Meta·Google·Stripe
- How do you optimize GPU utilization for model serving, and what role does dynamic batching play? Hard ·Nvidia·Google·OpenAI
- How do you monitor a model when ground-truth labels are delayed or never arrive? Hard ·Stripe·Affirm·Klarna
- How do you balance latency and throughput trade-offs when designing a model serving system? Hard ·Google·Amazon·OpenAI
- How does LLMOps differ from classical MLOps, and what new operational challenges do LLMs introduce? Hard ·OpenAI·Anthropic·Cohere
- A model is live and you cannot get labels quickly. How do you set up alerting to catch performance problems early? Hard ·Stripe·Affirm·Uber
- What is train/serve skew and how do you prevent it? Hard ·Google·Meta·Stripe