RAG debugging follows a systematic process: check each stage of the pipeline in order.
Step 1: Check retrieval
Are the relevant chunks retrieved for the query? If not, the issue is in chunking, embedding, or indexing.
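
One way to make the retrieval check measurable is a hit rate at k over a small labeled sample. This is a minimal sketch, not a prescribed method: the function name `hit_rate_at_k`, the chunk ids, and the sample data are all hypothetical, and it assumes you have hand-labeled which chunk ids are relevant for each query.

```python
def hit_rate_at_k(results, k=5):
    """Fraction of queries with at least one relevant chunk in the top k.

    `results` is a list of (retrieved_ids, relevant_ids) pairs, one per query.
    `retrieved_ids` is ranked best-first; `relevant_ids` is a hand-labeled set.
    """
    if not results:
        return 0.0
    hits = 0
    for retrieved_ids, relevant_ids in results:
        top_k = retrieved_ids[:k]
        if any(chunk_id in relevant_ids for chunk_id in top_k):
            hits += 1
    return hits / len(results)


# Hypothetical labeled samples: (ranked chunk ids, set of relevant ids).
samples = [
    (["c7", "c2", "c9"], {"c2"}),  # relevant chunk found at rank 2
    (["c1", "c4", "c5"], {"c8"}),  # relevant chunk never retrieved
]
print(hit_rate_at_k(samples, k=3))
```

A low hit rate points at chunking, embedding, or indexing before any prompt work is worth doing.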
Step 2: Check context
Does the retrieved content actually contain the answer? If not, there is a data gap: the information is missing from the indexed data.
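
The context check can be roughly automated when a short expected answer is known. The sketch below assumes a loose, whole-word string match is good enough for a first pass; `normalize` and `context_contains_answer` are illustrative names, and paraphrased answers will be missed, so treat a negative result as a prompt to read the chunks yourself.

```python
import re


def normalize(text):
    """Lowercase and reduce text to space-separated alphanumeric words."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def context_contains_answer(chunks, expected_answer):
    """True if any retrieved chunk contains the expected answer as whole words."""
    needle = f" {normalize(expected_answer)} "
    # Padding with spaces prevents partial-word matches such as "art" in "party".
    return any(needle in f" {normalize(chunk)} " for chunk in chunks)


# Hypothetical retrieved chunks for the query "What is the capital of France?"
chunks = ["France is in Western Europe.", "The capital of France is Paris."]
print(context_contains_answer(chunks, "Paris"))
```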
Step 3: Check generation
Does the LLM use the context correctly? If not, it is a prompt issue.
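
A crude proxy for the generation check is how much of the answer's vocabulary appears in the supplied context. The sketch below is an assumption-laden heuristic, not a faithfulness metric: `grounded_fraction` is a hypothetical helper, and token overlap says nothing about whether the answer is correct. A very low score merely suggests the model ignored the context.

```python
import re


def tokens(text):
    """Set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounded_fraction(answer, context):
    """Share of distinct answer tokens that also occur in the context."""
    answer_tokens = tokens(answer)
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & tokens(context)) / len(answer_tokens)


context = "The capital of France is Paris."
print(grounded_fraction("Paris is the capital", context))  # fully covered by the context
print(grounded_fraction("Berlin", context))                # not supported by the context
```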
Common problems:
- Chunks too small (missing context)
- Chunks too large (irrelevant noise)
- Wrong embedding model for domain
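
To test whether chunk size is the culprit, it helps to re-chunk the same text at several sizes and re-measure retrieval for each. The word-window splitter below is a minimal sketch under simple assumptions (whitespace tokenization, fixed overlap); `chunk_words` is a hypothetical helper, not a library function.

```python
def chunk_words(text, size, overlap=0):
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


text = "a b c d e"
for size, overlap in [(2, 0), (3, 1)]:
    print(size, overlap, chunk_words(text, size, overlap))
```

Overlap is one common way to soften the "too small" problem, since a sentence cut at a boundary still appears whole in one of the neighboring chunks.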
Interview question: "How do you debug a RAG system?"
Isolate the stages. Check retrieval first, then context, then generation. Use human evaluation on a sample of queries.
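
Stage isolation can be sketched as a small triage over human-labeled samples: for each query, a reviewer records three yes/no judgments, and the first failing check names the stage to fix. The function `diagnose`, the label names, and the sample data are illustrative assumptions.

```python
from collections import Counter


def diagnose(relevant_retrieved, context_has_answer, answer_correct):
    """Return the first failing stage, checking retrieval before anything else."""
    if not relevant_retrieved:
        return "retrieval"   # chunking, embedding, or indexing issue
    if not context_has_answer:
        return "data gap"    # the answer is not in the retrieved content
    if not answer_correct:
        return "generation"  # prompt issue
    return "ok"


# Hypothetical human labels for four sampled queries.
labels = [
    (True, True, False),
    (False, False, False),
    (True, False, False),
    (True, True, True),
]
summary = Counter(diagnose(*row) for row in labels)
print(summary)
```

Counting failures per stage shows where most of the errors come from, which is usually where debugging effort should go first.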