RAG Architectures — Foundations
A ~36-minute engineer's tour of retrieval-augmented generation. Embeddings, vector stores, hybrid search, re-ranking, evaluation, 6 failure modes, 5-component production architecture. Voice: Emma.
Chapters: 8
Duration: ~36 min
Level: Intermediate
Certification: No
Who this is for
For backend engineers, ML engineers, applied scientists, and AI architects building retrieval-augmented systems.
How this course works
- 8 audio-narrated slide chapters · ~36 min of focused content
- Narrated by Emma (Azure neural voice)
- Capstone with an interactive Markdown builder whose output you take to your team
- Trust trip-wires on every play — what not to cross
- Free verifiable certificate on completion
What you'll walk out with
Specific outcomes from this course — no fluff.
- When RAG fits — large/changing KB · citations + grounding · long-tail facts — and 3 cases when it's the wrong architecture
- Embeddings as dense vectors — what they encode well (topic, paraphrase, semantic relations) and miss (IDs, negation, numerics, recency)
- 4 vector stores compared — pgvector, Pinecone, Azure AI Search, Weaviate — with the scale guide and cost traps to avoid
- Hybrid search (BM25 + vector + semantic ranking) with RRF/weighted/cascade fusion — where retrieval quality actually lives
- 3 re-ranker types and the cost-quality curve — re-ranking the first 50 candidates lifts quality 15-25%; beyond that, returns diminish
- Evaluation in two layers — retrieval (hit rate, MRR, recall@k) vs answer (RAGAS: groundedness, faithfulness, relevance) — never conflated
- 6 production failure modes (bad chunking, query intent mismatch, OOD queries, context overflow, metadata blind spots, stale knowledge) + targeted fixes
- 5-component production architecture + 4 trust trip-wires + interactive architecture-decision-record builder
Course content
8 chapters · ~36 min
Welcome
A 1-minute orientation — what the course covers, how to navigate, and what you walk out with. No audio on this screen.
Why RAG and when not to use it
When RAG fits (large/changing KB · citations · long-tail facts) · 3 cases when it's wrong · RAG vs long-context vs fine-tuning.
Embeddings explained
What they are · what they encode well (topic · paraphrase · semantic) · what they miss (IDs · negation · numerics · recency) · 4 model-selection factors.
Vector stores
4 stores compared (pgvector · Pinecone · Azure AI Search · Weaviate) · scale guide · 3 evaluation dimensions · cost traps.
Hybrid search
Why pure vector fails · 3 fusion strategies (RRF · weighted · cascade) · semantic ranking as third layer · honest scoring.
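As a rough illustration of the RRF fusion strategy named above, here is a minimal sketch. It assumes each retriever returns a ranked list of document IDs and uses the commonly cited smoothing constant k = 60. The function name `rrf_fuse` and the sample IDs are hypothetical and are not taken from the course.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc_id so the order is deterministic
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical ranked results from a keyword (BM25) retriever and a vector retriever
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

RRF uses only rank positions, so it avoids having to normalise BM25 scores against vector similarity scores. Weighted fusion, by contrast, needs the two score scales made comparable first.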
Re-ranking strategies
Why re-ranking · 3 types (cross-encoder · managed API · LLM) · cost-quality curve · re-rank a small top-N.
Evaluation
2 distinct layers (retrieval vs answer) · 3 retrieval metrics (hit rate · MRR · recall@k) · 3 answer metrics (RAGAS) · 3 eval-set options.
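The three retrieval-layer metrics listed above can be sketched in a few lines. This is a minimal illustration, not the course's evaluation harness. It assumes each query has a ranked list of retrieved document IDs and a non-empty set of known-relevant IDs. All function names and sample data are hypothetical.

```python
def hit_rate(results, relevant, k):
    """Fraction of queries with at least one relevant doc in the top k."""
    hits = sum(1 for res, rel in zip(results, relevant) if set(res[:k]) & rel)
    return hits / len(results)

def mrr(results, relevant):
    """Mean reciprocal rank of the first relevant doc (0 if none is retrieved)."""
    total = 0.0
    for res, rel in zip(results, relevant):
        for rank, doc_id in enumerate(res, start=1):
            if doc_id in rel:
                total += 1.0 / rank
                break
    return total / len(results)

def recall_at_k(results, relevant, k):
    """Mean share of each query's relevant docs that appear in the top k."""
    per_query = [len(set(res[:k]) & rel) / len(rel)
                 for res, rel in zip(results, relevant)]
    return sum(per_query) / len(per_query)

# Hypothetical eval set: two queries, their ranked results, and labelled relevant docs
results = [["d1", "d2", "d3"], ["d4", "d5", "d6"]]
relevant = [{"d2"}, {"d6", "d9"}]

print(hit_rate(results, relevant, k=2))     # 0.5
print(round(mrr(results, relevant), 4))     # 0.4167
print(recall_at_k(results, relevant, k=3))  # 0.75
```

These functions score retrieval only. Answer-layer metrics such as groundedness and faithfulness need the generated answer and are measured separately.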
Common RAG failures and fixes
6 production failure modes (chunking · query intent · OOD · context overflow · metadata blind spots · stale knowledge) with targeted fixes.
Capstone — production RAG
5-component production architecture · 4 trust trip-wires · interactive architecture-decision-record builder.
Want this delivered inside your organisation?
The course is the starting point. The same content powers a 4-week pilot, an org-wide rollout, or a continuous build engagement — set up on your data, with your team, by Gennoor Tech.