Week 04: RAG Pipelines — Basics to Agentic

Build: Enterprise document Q&A system with hybrid retrieval
What You'll Learn

RAG is table stakes. Agentic RAG — where retrieval triggers actions, multi-hop reasoning, and re-ranking — is what production systems actually need.

Session Schedule

Day        Time                  Focus
Saturday   8:00 - 11:00 PM WAT   RAG Architecture & Retrieval
Sunday     8:00 - 11:00 PM WAT   Hybrid Search & Evaluation Build

Pre-Requisites

  • Weeks 01-03 completed
  • PGVector running in Docker
  • Understanding of embeddings and vector search

Topics Covered

Chunking Strategies & Embedding Models

Fixed-size, semantic, and recursive chunking; OpenAI vs. HuggingFace embedding models. How you chunk determines how well you retrieve — get this wrong and nothing downstream matters.
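A minimal sketch of fixed-size vs. recursive chunking in plain Python (chunk sizes, overlap, and separators are illustrative defaults; in the build you'd likely reach for a library splitter):

```python
def chunk_fixed(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Fixed-size chunking with overlap: simple and fast, but splits mid-sentence."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def chunk_recursive(text: str, size: int = 200,
                    seps: tuple[str, ...] = ("\n\n", "\n", ". ", " ")) -> list[str]:
    """Recursive chunking: split on the coarsest separator first (paragraphs,
    then lines, then sentences, then words) so chunks respect structure."""
    if len(text) <= size:
        return [text] if text.strip() else []
    for sep in seps:
        parts = text.split(sep)
        if len(parts) > 1:
            chunks, buf = [], ""
            for part in parts:
                candidate = buf + sep + part if buf else part
                if len(candidate) <= size:
                    buf = candidate
                else:
                    if buf:
                        chunks.append(buf)
                    buf = part
            if buf:
                chunks.append(buf)
            # Recurse into any piece that is still too large for a finer split
            return [c for chunk in chunks for c in chunk_recursive(chunk, size, seps)]
    # No separator found: fall back to a hard fixed-size split
    return [text[i:i + size] for i in range(0, len(text), size)]
```

The recursive variant keeps paragraphs and sentences intact where it can, which is exactly why it usually retrieves better than a blind fixed-size cut.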

Hybrid Search: Dense + Sparse

BM25 for keyword matching, vector search for semantic similarity, Reciprocal Rank Fusion to merge the two ranked lists. Why neither approach works alone and how to combine them for production-grade retrieval.
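RRF itself is only a few lines: each ranker contributes 1/(k + rank) per document, and the sums are sorted. A sketch (doc IDs are placeholders; k = 60 is the commonly used constant from the original RRF paper):

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over rankers of 1 / (k + rank).
    Documents that rank well in *either* list float to the top; the constant
    k damps the advantage of a single #1 placement."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF only consumes ranks, not raw scores, it sidesteps the problem that BM25 scores and cosine similarities live on incomparable scales.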

Re-ranking & MMR

Cross-encoder re-ranking, Maximal Marginal Relevance, diversity vs relevance. The post-retrieval step that separates good RAG from great RAG.
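A greedy MMR sketch over precomputed similarities (the names, the dict-based similarity lookup, and the 0.7 default for lam are illustrative assumptions, not a library API):

```python
def mmr_select(query_sim: dict[str, float],
               pair_sim: dict[tuple[str, str], float],
               lam: float = 0.7, top_k: int = 3) -> list[str]:
    """Maximal Marginal Relevance: greedily pick the document that maximizes
    lam * sim(query, d) - (1 - lam) * max sim(d, already-selected).
    lam=1.0 is pure relevance; lower values force diversity."""
    selected: list[str] = []
    candidates = set(query_sim)
    while candidates and len(selected) < top_k:
        def score(d: str) -> float:
            redundancy = max((pair_sim.get((d, s), pair_sim.get((s, d), 0.0))
                              for s in selected), default=0.0)
            return lam * query_sim[d] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Note how a highly relevant but near-duplicate chunk loses to a less relevant, more novel one — that is the diversity-vs-relevance trade-off in one line of arithmetic.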

Multi-hop Retrieval

Query decomposition, iterative retrieval, chain-of-retrieval patterns. When one retrieval pass isn't enough to answer complex questions.
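The chain-of-retrieval loop can be sketched with injected `retrieve` and `llm` callables. The ANSWER/NEXT protocol below is an assumption for illustration, not a standard API:

```python
def multi_hop_answer(question: str, retrieve, llm, max_hops: int = 3) -> str:
    """Iterative retrieval: after each hop, ask the LLM either to answer from
    the accumulated context or to emit the next sub-query (decomposition)."""
    context, query = [], question
    for _ in range(max_hops):
        context.extend(retrieve(query))
        verdict = llm(
            f"Context: {context}\nQuestion: {question}\n"
            "Reply 'ANSWER: <answer>' if the context suffices, "
            "else 'NEXT: <follow-up query>'."
        )
        if verdict.startswith("ANSWER:"):
            return verdict.removeprefix("ANSWER:").strip()
        query = verdict.removeprefix("NEXT:").strip()
    # Hop budget exhausted: answer with whatever context was gathered
    return llm(f"Context: {context}\nAnswer: {question}")
```

The `max_hops` cap matters in production: without it, a confused decomposition loop can burn retrieval and token budget indefinitely.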

Evaluation with RAGAS

Faithfulness, answer relevancy, context precision, context recall metrics. You can't improve what you can't measure — build evaluation into your pipeline from day one.
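RAGAS computes these metrics with LLM judges, but the rank-weighted intuition behind context precision can be shown with labeled chunks (a toy approximation of the metric, not the RAGAS implementation):

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """P@k: fraction of the top-k retrieved chunks that are actually relevant."""
    top = retrieved[:k]
    return sum(1 for d in top if d in relevant) / len(top) if top else 0.0

def context_precision(retrieved: list[str], relevant: set[str]) -> float:
    """Rank-weighted context precision: average P@i over each rank i that
    holds a relevant chunk, so relevant chunks buried low in the list hurt."""
    hits = [i for i, d in enumerate(retrieved, 1) if d in relevant]
    if not hits:
        return 0.0
    return sum(precision_at_k(retrieved, relevant, i) for i in hits) / len(hits)
```

The same P@5 number also feeds the final layer of the weekly build, which is why it is worth understanding before wiring in the full RAGAS suite.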

Weekly Build: Enterprise Document Q&A

Build a full RAG pipeline with 7 layers: ingestion, chunking, embedding, storage, query understanding, access control, and hybrid search.

Architecture

Documents (PDF/DOCX/TXT)
    |
    v
Layer 1: Ingestion (document_loader.py)
    |
    v
Layer 2: Chunking (semantic + recursive)
    |
    v
Layer 3: Embedding (OpenAI text-embedding-3-small)
    |
    v
Layer 4: Storage (PGVector)
    |
    v
User Query → Layer 5: Query Understanding (expansion + intent)
    |
    v
Layer 6: Access Control (RBAC tier filter)
    |
    v
Layer 7: Hybrid Search (BM25 + Vector + RRF)
    |
    v
Grounded LLM Answer + P@5 Score

Key Files

File                         Purpose
rag/pipeline.py              Main pipeline orchestration
rag/ingestion/               Document loaders (PDF, DOCX, TXT)
rag/chunking.py              Semantic & recursive chunking
rag/embeddings.py            Embedding generation
rag/vector_store.py          PGVector CRUD operations
rag/query_understanding.py   Query expansion & intent detection
rag/access_control.py        RBAC tier filtering
rag/hybrid_search.py         BM25 + Vector + RRF fusion
rag/evaluation.py            RAGAS metrics & scoring
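The glue in rag/pipeline.py for the query-time layers (5-7) might look like the following sketch. Every signature here is an assumption; the layers are injected as callables so each one can be swapped or unit-tested alone:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class RAGPipeline:
    """Query-time orchestration sketch: layers 5-7 of the architecture plus
    answer generation, wired together by dependency injection."""
    expand_query: Callable[[str], str]             # Layer 5: expansion + intent
    allowed_tiers: Callable[[str], set[str]]       # Layer 6: RBAC tier lookup
    hybrid_search: Callable[[str, set[str]], list[dict]]  # Layer 7: BM25 + vector + RRF
    generate: Callable[[str, list[dict]], str]     # grounded LLM answer

    def answer(self, user: str, question: str) -> str:
        query = self.expand_query(question)
        tiers = self.allowed_tiers(user)
        chunks = self.hybrid_search(query, tiers)
        return self.generate(question, chunks)
```

Passing the RBAC tiers *into* hybrid search (rather than filtering afterwards) is the important design choice: post-filtering can silently return fewer than k chunks, or worse, leak restricted text into intermediate logs.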

Resources

Required Reading

  • LangChain RAG documentation
  • RAGAS evaluation framework docs
  • "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks" paper

Code Repository

Switch to the week-04 branch:

git checkout week-04

Session Recording

Recording will be available within 24 hours after the live session. Check the WhatsApp group for the link.

Homework

Due before CORTEX project kickoff session.

  1. Complete the enterprise Q&A build — push your code to the bootcamp repo
  2. Run RAGAS evaluation — generate faithfulness, relevancy, and precision scores on at least 20 test questions
  3. Compare chunking strategies — benchmark fixed-size vs semantic vs recursive on the same document set
  4. Write a 1-page reflection on retrieval quality — what surprised you about how chunking affects answers? Share in the WhatsApp group