Q: Walk me through deploying a model to production.
Train → validate → register → deploy to staging → A/B test → promote to prod → monitor.
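
One way to make the pipeline above concrete is to model it as ordered stages with a gate between each pair. This is a minimal sketch, not a real deployment tool: the `Stage` names and the `advance` helper are hypothetical, and the `gate_passed` flag stands in for whatever check applies at that step (validation metrics, staging smoke tests, A/B results).

```python
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names mirroring the pipeline in the answer above.
    TRAINED = 1
    VALIDATED = 2
    REGISTERED = 3
    STAGING = 4
    AB_TEST = 5
    PRODUCTION = 6


# Enum members iterate in definition order, which is the pipeline order.
PIPELINE = list(Stage)


def advance(current: Stage, gate_passed: bool) -> Stage:
    """Return the next stage if the current gate passed, else stay put."""
    if not gate_passed:
        return current
    idx = PIPELINE.index(current)
    # PRODUCTION is terminal; monitoring continues from there.
    return PIPELINE[min(idx + 1, len(PIPELINE) - 1)]
```

The point to make in an interview is that promotion is never automatic: a model that fails a gate stays where it is.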
Q: How do you handle model failures in production?
Fall back to the previous version. Alert the on-call engineer. Investigate the root cause. Automate rollback ahead of time.
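
The fallback step can be sketched as a thin wrapper around the serving call. This is an illustration under assumptions: `predict_with_fallback`, the `Predictor` type, and the logger name are hypothetical, and the logged exception stands in for a real on-call alert.

```python
import logging
from typing import Callable, Sequence

logger = logging.getLogger("model_serving")  # hypothetical logger name

# Assumption: a model is any callable mapping a feature vector to a score.
Predictor = Callable[[Sequence[float]], float]


def predict_with_fallback(
    features: Sequence[float],
    current: Predictor,
    previous: Predictor,
) -> float:
    """Serve from the current model; on failure, use the previous version."""
    try:
        return current(features)
    except Exception:
        # logger.exception records the traceback, which helps the later
        # root-cause investigation. A real system would also page on-call.
        logger.exception("current model failed; falling back to previous version")
        return previous(features)
```

This covers a single failed request. A sustained failure rate should trigger the automated rollback instead of relying on per-request fallback.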
Q: What's the hardest production ML problem you've faced?
Prepare a real story. Include: problem, investigation, solution, what you learned.
Q: How do you decide when to retrain?
Monitor data drift and model performance. Retrain on a schedule or when metrics drop below an agreed threshold. Balance freshness against stability.
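
The retrain decision can be written down as an explicit rule. A minimal sketch, assuming a single performance metric where higher is better and a precomputed drift score; `RetrainPolicy`, `should_retrain`, and every threshold value here are hypothetical placeholders, not recommended settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrainPolicy:
    # Illustrative thresholds only; tune them per model and business need.
    max_age_days: int = 30          # scheduled retrain interval
    max_metric_drop: float = 0.02   # tolerated drop from the baseline metric
    max_drift_score: float = 0.2    # tolerated input drift score


def should_retrain(
    age_days: int,
    baseline_metric: float,
    current_metric: float,
    drift_score: float,
    policy: RetrainPolicy = RetrainPolicy(),
) -> bool:
    """Retrain if the model is stale, has degraded, or its inputs have drifted."""
    too_old = age_days >= policy.max_age_days
    degraded = (baseline_metric - current_metric) > policy.max_metric_drop
    drifted = drift_score > policy.max_drift_score
    return too_old or degraded or drifted
```

Tighter thresholds favor freshness; looser ones favor stability and fewer redeployments.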