Builder · Intermediate

Open-Source LLMs for Enterprise

A 65-minute decision guide — Llama, Mistral, Phi, Qwen; Ollama, vLLM, and Azure ML; fine-tuning, cost, and sovereign deployment patterns.

8 chapters · Technical practitioner · Director · Free

Last updated: 2026-05-19


What you'll learn

This course covers:

Why open-source — and the honest cases where a hosted API still wins
Model selection across Llama, Mistral, Phi, and Qwen families
Self-hosting paths: Ollama, vLLM, TGI, Azure ML private endpoints
Fine-tuning approaches — LoRA, QLoRA, full fine-tunes, and when each fits
Cost and performance tradeoffs at realistic enterprise concurrency
Air-gapped and sovereign deployment patterns for regulated environments

Who this is for

Senior engineers, ML platform leads, and tech directors evaluating or operating open-source LLMs at enterprise scale. Especially relevant for teams with sovereignty, residency, or per-token cost pressure — and for regulated sectors (BFSI, healthcare, government) where data simply cannot leave a controlled boundary.

Curriculum

8 chapters · 2 hands-on exercises · capstone challenge

Each chapter ends with the learning objectives ticked off. Quizzes are auto-graded with feedback; exercises are open-ended and produce artifacts you can take to your team.

1. Why open-source — and when not

Frame the open-source decision around sovereignty, cost curve, and roadmap risk
Recognize the 3 cases where a hosted API is still the right answer

2. Model selection — Llama, Mistral, Phi, Qwen

QUIZ

Compare model families on capability, licensing, and multilingual coverage
Match model size to task — and resist the “bigger is better” default

3. Self-hosting — Ollama, vLLM, TGI, Azure ML

EXERCISE

Pick the right serving stack for your latency, throughput, and ops profile
Configure vLLM and TGI for production concurrency, not laptop demos
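
As a rough illustration of why production concurrency differs from a laptop demo, the sketch below estimates how many sequences fit in a given KV-cache budget. The model shape (32 layers, 8 KV heads, head dimension 128, 16-bit cache) and the 40 GiB budget are assumptions chosen for illustration, not values from the course. The estimate is worst-case: it assumes every sequence fills its full context, whereas paged allocators such as vLLM's allocate cache blocks on demand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    """Attention shape of a decoder-only model (illustrative, assumed values)."""
    num_layers: int
    num_kv_heads: int
    head_dim: int
    bytes_per_value: int = 2  # 16-bit KV cache (fp16 or bf16)


def kv_bytes_per_token(shape: ModelShape) -> int:
    # Every layer stores one key vector and one value vector per KV head.
    return 2 * shape.num_layers * shape.num_kv_heads * shape.head_dim * shape.bytes_per_value


def max_concurrent_sequences(shape: ModelShape, kv_budget_gib: float, tokens_per_sequence: int) -> int:
    # Worst case: each sequence occupies its full context length in the cache.
    budget_bytes = int(kv_budget_gib * 1024**3)
    per_sequence_bytes = kv_bytes_per_token(shape) * tokens_per_sequence
    return budget_bytes // per_sequence_bytes


# Hypothetical 8B-class model with grouped-query attention.
shape = ModelShape(num_layers=32, num_kv_heads=8, head_dim=128)
per_token = kv_bytes_per_token(shape)
slots = max_concurrent_sequences(shape, kv_budget_gib=40, tokens_per_sequence=4096)

print(per_token)  # 131072 bytes, i.e. 128 KiB per token
print(slots)      # 80 full-length sequences
```

Under these assumptions, GPU memory left over after loading the weights, rather than raw compute, sets the ceiling on simultaneous long-context requests.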

4. Fine-tuning approaches

QUIZ

Choose between LoRA, QLoRA, and full fine-tuning based on data and goal
Avoid fine-tuning for problems that retrieval or prompting would solve more cheaply
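
To make the LoRA versus full fine-tuning choice concrete, here is a minimal sketch comparing trainable parameter counts for a single weight matrix. The 4096 x 4096 projection and rank 16 are assumed values for illustration. LoRA freezes the base weight and trains two low-rank factors; QLoRA uses the same adapters on top of a quantized base model, so the adapter count below applies to it as well.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    # Full fine-tuning updates every entry of the d_out x d_in weight matrix.
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    # LoRA trains A (rank x d_in) and B (d_out x rank); the base weight stays frozen.
    return rank * (d_in + d_out)


# Hypothetical attention projection, 4096 x 4096, adapted at rank 16.
d_model = 4096
rank = 16

full = full_finetune_params(d_model, d_model)
lora = lora_trainable_params(d_model, d_model, rank)
fraction = lora / full

print(full)      # 16777216
print(lora)      # 131072
print(fraction)  # 0.0078125, i.e. 1/128 of the full matrix
```

The ratio grows linearly with rank, so doubling the rank doubles the adapter size while the frozen base stays unchanged.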

5. Cost and performance tradeoffs

Model the total cost of ownership — GPU hours, ops, evals, and incident load
Compare per-token economics against hosted APIs at your real concurrency
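
One way to frame the per-token comparison is a break-even calculation. The sketch below is a simplified model under stated assumptions: the GPU price, overhead, throughput, and hosted API price are all hypothetical placeholders, and the `overhead_hourly_usd` parameter stands in for ops, evals, and incident load.

```python
def self_hosted_usd_per_million_tokens(
    gpu_hourly_usd: float,
    tokens_per_second: float,
    utilization: float,
    overhead_hourly_usd: float = 0.0,
) -> float:
    # Tokens actually served per hour at the given average utilization (0..1).
    tokens_per_hour = tokens_per_second * 3600 * utilization
    hourly_cost = gpu_hourly_usd + overhead_hourly_usd
    return hourly_cost / tokens_per_hour * 1_000_000


def breakeven_utilization(
    gpu_hourly_usd: float,
    tokens_per_second: float,
    api_usd_per_million: float,
    overhead_hourly_usd: float = 0.0,
) -> float:
    # Utilization at which self-hosted cost per token equals the hosted API price.
    hourly_cost = gpu_hourly_usd + overhead_hourly_usd
    return hourly_cost * 1_000_000 / (tokens_per_second * 3600 * api_usd_per_million)


# All figures below are hypothetical placeholders, not quotes from any vendor.
cost = self_hosted_usd_per_million_tokens(gpu_hourly_usd=4.0, tokens_per_second=2000, utilization=0.5)
threshold = breakeven_utilization(gpu_hourly_usd=4.0, tokens_per_second=2000, api_usd_per_million=2.0)

print(round(cost, 2))       # 1.11 USD per million tokens
print(round(threshold, 3))  # 0.278, so below about 28% utilization the API is cheaper
```

With these placeholder numbers, self-hosting only wins once the GPU is busy more than roughly 28% of the time, which is why the comparison needs your real concurrency rather than peak benchmark throughput.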

6. Air-gapped and sovereign deployment

EXERCISE

Architect deployments that satisfy data-residency and air-gap requirements
Apply the patterns that actually pass regulator and CISO review

7. Operational considerations

QUIZ

Plan for model upgrades, security patches, and quantization rollouts
Design observability across GPU, model, and request layers

Capstone: Your open-source LLM decision pack

Document the model, serving, and fine-tuning decision for a real workload
Produce the TCO and sovereignty case your steering committee will accept

Capstone deliverable: Every learner who completes this course produces «Your Open-Source LLM Decision Pack» — a tangible artifact you take back to your organization.

Curriculum live · full chapter content rolling out through 2026.

The outline, learning objectives, references, and capstone deliverable are published. Full chapter content (video, narration, exercises) ships progressively. Get notified when each chapter goes live.


References & sources

Built on cited sources — not vibes.

Every course is researched fresh against vendor documentation, regulatory sources, and peer-reviewed work. Sources used in this course:

Meta Llama — Models and Licensing

Meta

Mistral AI Documentation

Mistral AI

vLLM Documentation

vLLM Project

Ollama — Local Model Runtime

Ollama

Hugging Face Documentation

Hugging Face

Course details

Track: Builder
Level: Intermediate
Audience: Technical practitioner, Director
Function: IT & Engineering
Industry: Cross-Industry
Stack: Open-source
Paired Gennoor Way phase: Innovate, Build
Format: Video, hands-on, interactive

You finished the course. Now what?

From course to outcome.

Reading this course is step one. The next step is applying it where you work. Here's how Gennoor helps — without the deck, without the pitch.

Run this for your team

A 2-day workshop or virtual cohort for up to 25 of your people, with exercises run on your data and a 30-day adoption plan.

From $5k · 2 weeks · function-specific

Talk to us about a workshop

Apply this to your data

A 4–6 week pilot that takes what you learned and ships a working system inside your environment. Fixed scope, fixed price, code transferred day one.

From $25k · 6 weeks · production-grade

Scope a pilot

Just want to talk?

Free 30-minute call. No deck, no pitch. We listen to your situation and tell you honestly what makes sense — even if it isn't us.

Free · no commitment · 30 minutes

Book a call

Or just keep learning. We recommend next:

builder

RAG Architectures — Foundations

60 min

builder

MLOps for LLMs

65 min

builder

Azure AI Foundry Essentials

55 min

