MLOps for LLMs
A ~36-minute advanced course for ML and platform engineers. It covers 5 differences between LLMs and classical ML, 3 drift types, 4 cost patterns, 60-second rollback discipline, 4 incident types, and the eval-case-it post-mortem. Voice: Andrew.
- Chapters: 8
- Duration: ~36 min
- Level: Advanced
- Certification: No
Who this is for
For ML engineers, platform engineers, SREs, and on-call leads operating LLM systems in production.
How this course works
- 8 audio-narrated slide chapters · ~36 min of focused content
- Narrated by Andrew (Azure neural voice)
- Capstone with an interactive Markdown runbook builder whose output you take to your team
- Trust trip-wires on every play — what not to cross
- Free verifiable certificate on completion
What you'll walk out with
Specific outcomes from this course — no fluff.
- Five differences that make LLMOps its own discipline — non-determinism, generative outputs, third-party black boxes, per-request cost variability, adversarial users
- 3-tier production evaluation (light + targeted + gold sampling), LLM-as-judge done right (different family, structured rubric, human on disagreement), PII discipline
- Three distinct drift types — input, output, behavioral — with regression-set replay to catch silent vendor model updates
- Four cost optimization patterns (compress + cache + route + smaller-plus-retrieval) and three anti-patterns that break quality or correctness
- One-artifact versioning (prompt + model + eval) with shadow/canary/feature-flag rollouts and 60-second rollback as a tested capability
- OpenTelemetry GenAI semantic conventions + MLflow vs LangSmith for trace search, replay, and eval linking
- Four incident types (quality, drift, abuse, vendor outage) with 5-question triage and the eval-case-it post-mortem that produces durable regression tests
- 5-section LLMOps runbook + 4 trust trip-wires + interactive runbook builder for the platform team
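As a rough illustration of the one-artifact versioning outcome above, the sketch below pins prompt, model, and eval set into a single release whose version changes when any of the three changes, and treats rollback as a pointer move to the previous release. It is a minimal sketch, not the course's implementation: `Release`, `ReleaseRegistry`, and all field names are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Release:
    """One deployable unit: prompt, model, and eval set pinned together."""
    prompt_template: str
    model_id: str          # hypothetical pinned model identifier
    eval_set_version: str  # hypothetical version tag of the eval set

    @property
    def version(self) -> str:
        # Hash all three fields so changing any one yields a new version.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


class ReleaseRegistry:
    """Keeps release history so rollback restores the prior release as-is."""

    def __init__(self) -> None:
        self._history = []

    def deploy(self, release: Release) -> str:
        self._history.append(release)
        return release.version

    def rollback(self) -> Release:
        # Drop the active release and return the one before it.
        if len(self._history) < 2:
            raise RuntimeError("no previous release to roll back to")
        self._history.pop()
        return self._history[-1]

    @property
    def active(self) -> Release:
        return self._history[-1]
```

Because the version is derived from the whole triple, a prompt tweak or a model swap cannot ship under an old version label, and a rollback restores the prompt, model, and eval set together.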
Course content
8 chapters · ~36 min
Welcome
A 1-minute orientation — what the course covers, how to navigate, and what you walk out with. No audio on this screen.
LLMOps vs MLOps
5 differences (non-determinism · generative · third-party · cost variability · adversarial) · what carries over from MLOps · what doesn't.
Model evaluation in production
3-tier sampling (light · targeted · gold) · LLM-as-judge done right · PII discipline for regulated contexts.
Drift detection
3 drift types (input · output · behavioral) · regression-set replay · actionable vs cosmetic alerts.
Cost optimization patterns
The cost surprise · 4 patterns that work (compress · cache · route · smaller+retrieval) · 3 anti-patterns · cost telemetry.
Versioning prompts and models together
One-artifact discipline (prompt + model + eval triple) · 3 rollout patterns (shadow · canary · feature flag) · 60-sec rollback.
Observability with traces
OpenTelemetry GenAI conventions · MLflow vs LangSmith · A/B testing on traces.
Incident response for LLM failures
4 incident types (quality · drift · abuse · vendor outage) · 5-question triage · the eval-case-it post-mortem.
Capstone — your LLMOps runbook
5-section runbook · 4 trust trip-wires · interactive runbook builder for your platform team.
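The regression-set replay mentioned in the drift chapter can be sketched in a few lines: rerun a fixed set of prompts against the current model and compare the pass rate with a stored baseline. This is a minimal sketch under stated assumptions, not the course's method: `RegressionCase`, `replay`, `behavioral_drift`, the substring check, and the 5-point tolerance are all hypothetical, and `call_model` stands in for whatever client your stack uses.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class RegressionCase:
    prompt: str
    must_contain: str  # hypothetical check: a substring the answer must include


def replay(cases: Sequence[RegressionCase],
           call_model: Callable[[str], str]) -> float:
    """Return the fraction of regression cases the model currently passes."""
    if not cases:
        raise ValueError("regression set is empty")
    passed = sum(
        1 for case in cases
        if case.must_contain.lower() in call_model(case.prompt).lower()
    )
    return passed / len(cases)


def behavioral_drift(baseline_pass_rate: float,
                     current_pass_rate: float,
                     tolerance: float = 0.05) -> bool:
    """Flag drift when the pass rate drops by more than the tolerance."""
    return (baseline_pass_rate - current_pass_rate) > tolerance
```

Run on a schedule against a vendor-hosted model, a drop in pass rate with no change on your side is a signal that the model behind the endpoint may have changed.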
Want this delivered inside your organisation?
The course is the starting point. The same content powers a 4-week pilot, an org-wide rollout, or a continuous build engagement — set up on your data, with your team, by Gennoor Tech.