Builder · Intermediate

RAG Architectures — Foundations

A 60-minute engineer’s tour of retrieval-augmented generation — embeddings, vector stores, hybrid search, re-ranking, and the failure modes nobody puts in the slides.

8 chapters · Technical practitioner · Free

Last updated: 2026-05-19

What you'll learn

By the end of this course you'll be able to:

Decide when RAG is the right pattern, and recognize the three cases where it isn’t
Explain how embeddings actually work, and why “more tokens” isn’t always better
Compare vector stores: pgvector, Pinecone, Azure AI Search, Weaviate
Build hybrid search done right: BM25 + vectors + semantic ranking
Weigh re-ranking strategies (cross-encoders, LLM rerankers, hybrid) against their cost curves
Evaluate retrieval quality and answer quality, and measure both

Who this is for

Backend engineers, ML engineers, and applied scientists building retrieval-augmented systems. Especially valuable for engineers shipping their first production RAG system — or inheriting a flaky one and trying to figure out why retrieval quality looks great on the slide deck and terrible on real user queries.

Curriculum

8 chapters · 2 hands-on exercises · capstone challenge

Each chapter ends with the learning objectives ticked off. Quizzes are auto-graded with feedback; exercises are open-ended and produce artifacts you can take to your team.

1. Why RAG — and when not to use it

Distinguish RAG from long-context, fine-tuning, and tool-use approaches
Recognize the 3 use cases where RAG is the wrong architecture

2. Embeddings explained

QUIZ

Build a working mental model of dense vector representations
Pick embedding models with the right context length, dimension, and language coverage

3. Vector stores

EXERCISE

Compare pgvector, Pinecone, Azure AI Search, and Weaviate on latency, cost, and ops
Choose the right store for your scale and operational footprint

4. Hybrid search

QUIZ

Combine BM25 keyword search with vector search and semantic ranking
Tune fusion strategies (RRF, weighted) for your query distribution
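
As a taste of the fusion objective above, here is a minimal sketch of reciprocal rank fusion (RRF), assuming each retriever returns an ordered list of document IDs. The function name, the document IDs, and the constant k=60 (a commonly used default) are illustrative assumptions, not part of the course material.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword (BM25) retriever and a vector retriever.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that both retrievers rank highly rise to the top, and because only ranks are used, the two retrievers' raw scores never need to be put on a common scale. A weighted variant would multiply each list's contribution by a per-retriever weight.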

5. Re-ranking strategies

Apply cross-encoder and LLM-based rerankers where they earn their cost
Decide when re-ranking adds quality vs only adds latency

6. Evaluation — retrieval and answer quality

EXERCISE

Measure retrieval quality with hit rate, MRR, and recall@k
Measure answer quality with groundedness, faithfulness, and answer relevance
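
The retrieval metrics named above can be sketched in a few lines, assuming you have, per query, a ranked list of retrieved document IDs and a non-empty set of IDs judged relevant. The function names, query IDs, and sample data below are hypothetical, and this is one common way to define each metric rather than the course's official harness.

```python
def hit_rate_at_k(results, relevant, k):
    """Fraction of queries with at least one relevant doc in the top k."""
    hits = sum(1 for q, ranked in results.items() if set(ranked[:k]) & relevant[q])
    return hits / len(results)


def mean_reciprocal_rank(results, relevant):
    """Mean of 1 / rank of the first relevant doc (0 if none is retrieved)."""
    total = 0.0
    for q, ranked in results.items():
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in relevant[q]:
                total += 1.0 / rank
                break
    return total / len(results)


def recall_at_k(results, relevant, k):
    """Mean fraction of each query's relevant docs found in the top k."""
    per_query = [
        len(set(ranked[:k]) & relevant[q]) / len(relevant[q])
        for q, ranked in results.items()
    ]
    return sum(per_query) / len(per_query)


# Hypothetical retrieval output and relevance judgments for two queries.
results = {"q1": ["d1", "d2", "d3"], "q2": ["d9", "d4", "d5"]}
relevant = {"q1": {"d2", "d7"}, "q2": {"d5"}}

print(hit_rate_at_k(results, relevant, k=2))    # 0.5
print(mean_reciprocal_rank(results, relevant))  # (1/2 + 1/3) / 2, about 0.4167
print(recall_at_k(results, relevant, k=3))      # 0.75
```

Answer-quality measures such as groundedness and faithfulness usually need a judge (human or model) and are not covered by this sketch.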

7. Common RAG failures and fixes

QUIZ

Diagnose the 6 most common RAG failure modes from query traces
Apply targeted fixes — chunking, query rewriting, metadata filters

Capstone: Ship an evaluated RAG pipeline

Design an end-to-end RAG pipeline with hybrid retrieval and re-ranking
Document the eval harness and acceptance criteria for production cutover

Capstone deliverable: Every learner who completes this course produces «Your Evaluated RAG Pipeline» — a tangible artifact you take back to your organization.

Curriculum live · full chapter content rolling out through 2026.

The outline, learning objectives, references, and capstone deliverable are published. Full chapter content (video, narration, exercises) ships progressively. Get notified when each chapter goes live.

References & sources

Built on cited sources — not vibes.

Every course is researched fresh against vendor documentation, regulatory sources, and peer-reviewed work. Sources used in this course:

Azure AI Search Documentation

Microsoft Learn

Pinecone Learning Hub: RAG Patterns

Pinecone

Azure AI Foundry: RAG Reference Architectures

Microsoft Learn

Hugging Face: Embedding Models

Hugging Face

OWASP Top 10 for LLM Applications

OWASP Foundation

Course details

Track: Builder
Level: Intermediate
Audience: Technical practitioner
Function: IT & Engineering
Industry: Cross-Industry
Stack: Microsoft, Open-source, Stack-agnostic
Paired Gennoor Way phase: innovate, build
Format: video, hands-on, interactive

You finished the course. Now what?

From course to outcome.

Reading this course is step one. The next step is applying it where you work. Here's how Gennoor helps — without the deck, without the pitch.

Run this for your team

A 2-day workshop or virtual cohort for up to 25 of your people, with exercises run on your data and a 30-day adoption plan.

From $5k · 2 weeks · function-specific

Apply this to your data

A 4–6 week pilot that takes what you learned and ships a working system inside your environment. Fixed scope, fixed price, code transferred day one.

From $25k · 6 weeks · production-grade

Just want to talk?

Free 30-minute call. No deck, no pitch. We listen to your situation and tell you honestly what makes sense — even if it isn't us.

Free · no commitment · 30 minutes

Or just keep learning. We recommend next:

builder

Azure AI Foundry Essentials

55 min

builder

MLOps for LLMs

65 min
