Open-Source LLMs for Enterprise
A ~36-minute decision guide for senior engineers and tech directors evaluating or operating open-source LLMs at enterprise scale. Voice: Andrew.
Chapters: 8
Duration: ~36 min
Level: Intermediate
Certification: No
Who this is for
For senior engineers, ML platform leads, and tech directors evaluating or operating open-source LLMs at scale.
How this course works
- 8 audio-narrated slide chapters · ~36 min of focused content
- Narrated by Andrew (Azure neural voice)
- Capstone with an interactive Markdown decision-pack builder whose output you take to your team
- Trust trip-wires on every play — what not to cross
- Free verifiable certificate on completion
What you'll walk out with
Specific outcomes from this course — no fluff.
- Open-source isn't free; it's a different cost curve. Three real reasons to choose it (sovereignty + residency, cost at scale, roadmap control) and three cases where a hosted API still wins
- Four model families compared (Llama, Mistral, Phi, Qwen) with the license traps that catch teams and the size discipline that resists "bigger is better"
- Self-hosting with Ollama vs vLLM vs TGI vs Azure ML — the gap between demo-grade benchmarks and production-grade concurrency
- Fine-tuning approaches (LoRA · QLoRA · full) and the "try retrieval first" discipline — when fine-tuning is the wrong shortcut
- 5-component TCO with honest breakeven math: real breakeven sits at 20-50M tokens/day and is set by the full TCO, not the headline GPU price
- Three sovereign deployment tiers and what regulators + CISOs actually accept across India, GCC, US, EU
- Operational discipline across model upgrades, security patches, and 3-layer observability (GPU · model · request)
- A 4-section decision pack + 4 trust trip-wires + interactive decision-pack builder for the steering committee
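
The breakeven idea in the TCO outcome above can be illustrated with a minimal sketch. It assumes all self-hosting costs are rolled into one fixed daily figure (the five TCO components are not itemised on this page), and every number, function name, and price here is a hypothetical placeholder rather than a figure from the course. It also applies the 25-50% real-concurrency derating to a benchmark throughput to check whether one deployment could even serve the breakeven volume.

```python
SECONDS_PER_DAY = 86_400


def breakeven_tokens_per_day(fixed_cost_per_day: float,
                             api_price_per_million: float,
                             self_host_variable_per_million: float = 0.0) -> float:
    """Daily token volume at which self-hosting costs the same as a hosted API.

    fixed_cost_per_day: assumed all-in daily cost of the self-hosted stack.
    api_price_per_million: assumed hosted API price per 1M tokens.
    self_host_variable_per_million: assumed marginal self-hosting cost per 1M tokens.
    """
    saving_per_million = api_price_per_million - self_host_variable_per_million
    if saving_per_million <= 0:
        raise ValueError("self-hosting never breaks even at these prices")
    return fixed_cost_per_day / saving_per_million * 1_000_000


def effective_daily_capacity(benchmark_tokens_per_sec: float, derate: float) -> float:
    """Tokens/day a deployment can serve once benchmark throughput is derated."""
    if not 0 < derate <= 1:
        raise ValueError("derate must be in (0, 1]")
    return benchmark_tokens_per_sec * derate * SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical inputs: $300/day all-in, $10 per 1M hosted tokens.
    needed = breakeven_tokens_per_day(300.0, 10.0)      # 30,000,000 tokens/day
    # Hypothetical 2,000 tok/s benchmark, derated to 25% under real concurrency.
    capacity = effective_daily_capacity(2000.0, 0.25)   # 43,200,000 tokens/day
    print(f"breakeven: {needed:,.0f} tokens/day")
    print(f"capacity:  {capacity:,.0f} tokens/day")
    print("breakeven is reachable" if capacity >= needed else "add capacity first")
```

With these placeholder inputs the breakeven lands at 30M tokens/day. Treat that as an artifact of the chosen numbers, not as confirmation of the range quoted above.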
Course content
8 chapters · ~36 min
Welcome
A 1-minute orientation — what the course covers, how to navigate, and what you walk out with. No audio on this screen.
Why open-source and when not
3 real reasons (sovereignty + cost-at-scale + roadmap control) · 3 cases where a hosted API wins · honest framing.
Model selection
4 families (Llama · Mistral · Phi · Qwen) · 3 license traps · size discipline · match-to-task.
Self-hosting
4 serving stacks · production-grade vs demo-grade · vLLM sizing (memory · quant · concurrency · P99).
Fine-tuning approaches
Don't-fine-tune trap (retrieval first) · 3 approaches (LoRA · QLoRA · full) · when each fits · when not at all.
Cost and performance tradeoffs
5-component TCO · breakeven math · real-concurrency reality (25-50% of benchmark).
Air-gapped and sovereign deployment
3 deployment tiers · 4 architectural requirements · regional regulator acceptance (India · GCC · US · EU).
Operational considerations
3 upgrade types (model · serving stack · quantization) · 3 observability layers (GPU · model · request) · patch discipline.
Capstone — OSS LLM decision pack
4-section decision pack · 4 trust trip-wires · interactive decision-pack builder for steering committee.
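
The sizing dimensions named in the Self-hosting chapter (memory, quantization, concurrency) can be sketched with back-of-envelope arithmetic. This is a rough estimate under stated assumptions, not the course's method or vLLM's actual allocator: weights take roughly parameters times bytes per parameter, each token of KV cache takes 2 (key and value) × layers × KV heads × head dimension × bytes per value, and the `headroom` fraction and all function names are hypothetical. The example figures use a 7B model shape with 32 layers, 32 KV heads, and head dimension 128.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate weight memory in GB (weights only, no activations)."""
    return params_billion * bits_per_param / 8


def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """Bytes of KV cache per token: one key and one value vector per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def max_concurrent_sequences(gpu_gb: float, weights_gb: float,
                             kv_bytes_per_token: int, context_tokens: int,
                             headroom: float = 0.9) -> int:
    """Upper bound on full-length sequences whose KV cache fits beside the weights.

    headroom is an assumed usable fraction of GPU memory.
    """
    usable_gb = gpu_gb * headroom - weights_gb
    if usable_gb <= 0:
        return 0
    per_sequence_gb = kv_bytes_per_token * context_tokens / 1e9
    return int(usable_gb // per_sequence_gb)


if __name__ == "__main__":
    fp16_weights = weight_memory_gb(7, 16)   # 14.0 GB
    int4_weights = weight_memory_gb(7, 4)    # 3.5 GB
    kv_per_token = kv_cache_bytes_per_token(layers=32, kv_heads=32, head_dim=128)
    print(f"fp16 weights: {fp16_weights} GB, 4-bit weights: {int4_weights} GB")
    print(f"KV cache per token: {kv_per_token} bytes")
    # Hypothetical 80 GB GPU, 4,096-token contexts, fp16 weights.
    print(max_concurrent_sequences(80, fp16_weights, kv_per_token, 4096))
```

The result is a ceiling for worst-case, full-context sequences. Production concurrency and P99 latency still have to be measured under real traffic, which is the gap between demo-grade and production-grade numbers that the chapter describes.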
Want this delivered inside your organisation?
The course is the starting point. The same content powers a 4-week pilot, an org-wide rollout, or a continuous build engagement — set up on your data, with your team, by Gennoor Tech.