Agenda — Advanced RAG

Basic RAG (chunk, embed, retrieve, generate) gets you to roughly 70% quality, as a rule of thumb. The remaining 30% comes from addressing specific failure modes: the retrieved chunks are irrelevant, the query is worded differently from the documents and so lands far from them in embedding space, or the context window is filled with partially relevant text. Each advanced RAG technique in this session targets one of these failure modes.

Learning objectives

By the end of this session you will be able to:

  • Apply cross-encoder reranking to improve retrieval precision
  • Use HyDE to bridge the query-document embedding gap
  • Implement multi-query retrieval for better recall on complex questions
  • Apply contextual compression to reduce noise in retrieved context

Schedule

Time          Topic                                         File
0:00 – 0:30   Reranking: cross-encoders and Cohere Rerank   01-reranking
0:30 – 1:00   HyDE: hypothetical document embeddings        02-hyde
1:00 – 1:30   Multi-query retrieval                         03-multi-query
1:30 – 2:00   Contextual compression                        04-contextual-compression
2:00 – 2:30   Advanced RAG patterns combined                05-advanced-rag-patterns
2:30 – 3:00   Practice exercises                            06-practice-exercises

Setup

pip install langchain langchain-openai langchain-cohere sentence-transformers
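
To confirm the install worked before the session, you can check that each package is importable. The following is a minimal sketch: the import names are assumed to match the pip package names above with hyphens replaced by underscores, and `missing_packages` is a hypothetical helper, not part of any library.

```python
import importlib.util

# Import names assumed to correspond to the pip packages listed above.
REQUIRED = [
    "langchain",
    "langchain_openai",
    "langchain_cohere",
    "sentence_transformers",
]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All packages found.")
```

This only checks that the packages can be located; it does not import them or verify API keys for OpenAI or Cohere.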

When to apply advanced RAG

Each technique adds latency and complexity. Apply a technique only when you have evidence of the specific failure it fixes: low faithfulness → better grounding prompt; low context precision → reranking; low context recall → multi-query or HyDE. Don't add advanced techniques preemptively.
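
The mapping from symptom to fix can be written down as a small lookup, which is handy when you review evaluation results. This is a sketch under assumptions: `RagScores` and `suggest_techniques` are hypothetical names, the scores are assumed to lie in [0, 1], and the 0.7 threshold is an arbitrary placeholder that you should replace with a value calibrated on your own evaluation set.

```python
from dataclasses import dataclass


@dataclass
class RagScores:
    """Evaluation scores in [0, 1]; higher is better (assumed convention)."""
    faithfulness: float
    context_precision: float
    context_recall: float


def suggest_techniques(scores: RagScores, threshold: float = 0.7) -> list[str]:
    """Suggest a fix for each metric that falls below the threshold.

    The 0.7 default is an illustrative assumption, not a recommended value.
    """
    suggestions = []
    if scores.faithfulness < threshold:
        suggestions.append("better grounding prompt")
    if scores.context_precision < threshold:
        suggestions.append("reranking")
    if scores.context_recall < threshold:
        suggestions.append("multi-query or HyDE")
    return suggestions
```

For example, `suggest_techniques(RagScores(0.9, 0.5, 0.8))` returns `["reranking"]`, and a pipeline with all three scores above the threshold gets an empty list, meaning no advanced technique is indicated.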
