Week 04: RAG Pipelines — Basics to Agentic
What You'll Learn
RAG is table stakes. Agentic RAG — where retrieval triggers actions, multi-hop reasoning, and re-ranking — is what production systems actually need.
Session Schedule
| Day | Time | Focus |
|---|---|---|
| Saturday | 8:00 - 11:00 PM WAT | RAG Architecture & Retrieval |
| Sunday | 8:00 - 11:00 PM WAT | Hybrid Search & Evaluation Build |
Prerequisites
- Weeks 01-03 completed
- PGVector running in Docker
- Understanding of embeddings and vector search
Topics Covered
Chunking Strategies & Embedding Models
Fixed-size, semantic, and recursive chunking; OpenAI vs. Hugging Face embedding models. How you chunk determines how well you retrieve — get this wrong and nothing downstream matters.
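As a rough illustration of the simplest of these strategies, a fixed-size character chunker with overlap might look like the sketch below (the size and overlap values are illustrative defaults, not the course's settings):

```python
def chunk_fixed(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap, so content
    cut at a chunk boundary still appears whole in at least one chunk."""
    chunks = []
    step = size - overlap  # each chunk starts `step` chars after the last
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the final chunk already reaches the end of the text
    return chunks
```

The overlap is what keeps a sentence that straddles a boundary retrievable: the tail of one chunk is repeated at the head of the next.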
Hybrid Search: Dense + Sparse
BM25 for keyword, vector for semantic, Reciprocal Rank Fusion. Why neither approach works alone and how to combine them for production-grade retrieval.
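Reciprocal Rank Fusion itself is only a few lines. A sketch of the standard formula — each list contributes 1/(k + rank) per document, with k conventionally set to 60:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse multiple ranked lists (e.g. BM25 and vector results) by
    summing reciprocal-rank scores 1 / (k + rank) per document."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # highest fused score first
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF only looks at ranks, it sidesteps the problem that BM25 scores and cosine similarities live on incomparable scales — documents that appear near the top of both lists float to the top of the fused list.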
Re-ranking & MMR
Cross-encoder re-ranking, Maximal Marginal Relevance, diversity vs relevance. The post-retrieval step that separates good RAG from great RAG.
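The relevance-vs-diversity trade-off in MMR can be sketched as a greedy loop over precomputed similarity scores. Everything here is illustrative — `query_sim` and `pair_sim` stand in for cosine similarities your embedding model would produce:

```python
def mmr_select(candidates, query_sim, pair_sim, lam=0.7, k=3):
    """Greedy Maximal Marginal Relevance: at each step pick the candidate
    maximizing lam * relevance - (1 - lam) * (max similarity to anything
    already selected), trading relevance against redundancy."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(d):
            redundancy = max(
                (pair_sim[frozenset((d, s))] for s in selected), default=0.0
            )
            return lam * query_sim[d] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With lam = 1.0 this degenerates to plain relevance ranking; lowering lam increasingly penalizes near-duplicate chunks, which is exactly the behavior you want when several chunks repeat the same passage.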
Multi-hop Retrieval
Query decomposition, iterative retrieval, chain-of-retrieval patterns. When one retrieval pass isn't enough to answer complex questions.
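A toy skeleton of the iterative pattern is below. The decomposed sub-questions and the `retrieve` callable are stand-ins (in practice an LLM produces the decomposition and your hybrid search backend does the retrieval); the point is the loop shape, where each hop sees the context accumulated by earlier hops:

```python
def multi_hop_retrieve(sub_questions, retrieve, max_hops=3):
    """Iterative retrieval: run each sub-question in turn, passing the
    context gathered so far so later hops can build on earlier ones."""
    context = []
    for sub_q in sub_questions[:max_hops]:
        for doc in retrieve(sub_q, context):
            if doc not in context:  # dedupe across hops
                context.append(doc)
    return context

# Toy corpus keyed by sub-question, standing in for a real retriever.
corpus = {
    "Who founded Acme?": ["Acme was founded by Ada Li."],
    "Where was Ada Li born?": ["Ada Li was born in Lagos."],
}
ctx = multi_hop_retrieve(
    ["Who founded Acme?", "Where was Ada Li born?"],
    lambda q, _ctx: corpus.get(q, []),
)
```

The example shows why one pass fails on "Where was Acme's founder born?": no single chunk contains the answer, but two chained retrievals do.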
Evaluation with RAGAS
Faithfulness, answer relevancy, context precision, context recall metrics. You can't improve what you can't measure — build evaluation into your pipeline from day one.
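RAGAS computes these metrics with LLM judges, but the retrieval-side pair has a simple intuition you can sanity-check by hand. A rough approximation against hand-labeled relevant chunks (not the RAGAS implementation, just the underlying idea):

```python
def context_precision(retrieved: list[str], relevant: set[str]) -> float:
    """Fraction of the retrieved chunks that are actually relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for d in retrieved if d in relevant) / len(retrieved)


def context_recall(retrieved: list[str], relevant: set[str]) -> float:
    """Fraction of the relevant chunks that made it into the results."""
    if not relevant:
        return 0.0
    return sum(1 for d in relevant if d in retrieved) / len(relevant)
```

Precision drops when you retrieve junk; recall drops when you miss evidence. Chunking changes usually move these two in opposite directions, which is why you track both.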
Weekly Build: Enterprise Document Q&A
Build a full RAG pipeline with 7 layers: ingestion, chunking, embedding, storage, query understanding, access control, and hybrid search.
Architecture
Documents (PDF/DOCX/TXT)
|
v
Layer 1: Ingestion (document_loader.py)
|
v
Layer 2: Chunking (semantic + recursive)
|
v
Layer 3: Embedding (OpenAI text-embedding-3-small)
|
v
Layer 4: Storage (PGVector)
|
v
User Query → Layer 5: Query Understanding (expansion + intent)
|
v
Layer 6: Access Control (RBAC tier filter)
|
v
Layer 7: Hybrid Search (BM25 + Vector + RRF)
|
v
Grounded LLM Answer + P@5 Score
Key Files
| File | Purpose |
|---|---|
| `rag/pipeline.py` | Main pipeline orchestration |
| `rag/ingestion/` | Document loaders (PDF, DOCX, TXT) |
| `rag/chunking.py` | Semantic & recursive chunking |
| `rag/embeddings.py` | Embedding generation |
| `rag/vector_store.py` | PGVector CRUD operations |
| `rag/query_understanding.py` | Query expansion & intent detection |
| `rag/access_control.py` | RBAC tier filtering |
| `rag/hybrid_search.py` | BM25 + Vector + RRF fusion |
| `rag/evaluation.py` | RAGAS metrics & scoring |
Resources
Required Reading
- LangChain RAG documentation
- RAGAS evaluation framework docs
- "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks" paper
Code Repository
Switch to the week-04 branch:
git checkout week-04
Session Recording
Recording will be available within 24 hours after the live session. Check the WhatsApp group for the link.
Homework
Due before CORTEX project kickoff session.
- Complete the enterprise Q&A build — push your code to the bootcamp repo
- Run RAGAS evaluation — generate faithfulness, relevancy, and precision scores on at least 20 test questions
- Compare chunking strategies — benchmark fixed-size vs semantic vs recursive on the same document set
- Write a 1-page reflection on retrieval quality — what surprised you about how chunking affects answers? Share in the WhatsApp group