RAG Stack
Chat grounded in your own documents — retrieval-augmented generation with citations, not hallucinations.
What it is
RAG (retrieval-augmented generation) finds the relevant chunks of your data first, then asks the model to answer using only those chunks. Every answer comes with a source you can check, which is what makes it easier to trust.
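To make "answer using only those" concrete, here is a minimal sketch of how the retrieved chunks might be packed into a prompt. The `Chunk` type, the `build_grounded_prompt` helper, and the instruction wording are all assumptions for illustration, not part of any library. Numbering the chunks is one simple way to give the model something to cite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source: str  # hypothetical identifier, e.g. a file name or URL
    text: str


def build_grounded_prompt(question: str, chunks: list[Chunk]) -> str:
    """Build a prompt that limits the model to the retrieved chunks."""
    # Number each chunk so the model can cite it as [1], [2], ...
    context = "\n\n".join(
        f"[{i}] (source: {c.source})\n{c.text}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using ONLY the numbered context below. "
        "Cite the chunks you used as [n]. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up example data, purely for illustration.
chunks = [
    Chunk("handbook.md", "Refunds are available within 30 days of purchase."),
    Chunk("faq.md", "Support is open on weekdays."),
]
prompt = build_grounded_prompt("What is the refund window?", chunks)
print(prompt)
```

The explicit "say you don't know" instruction matters as much as the context itself: without it, a model may fall back on its training data when retrieval comes up empty.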
How it fits together
Question
│
▼
Embed ──► Qdrant (find the top-k relevant chunks)
│
▼
Claude API (answer using ONLY those chunks) ──► answer + citation
Documents are chunked and embedded into Qdrant ahead of time.
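The flow above can be sketched in plain Python to show the two phases: indexing ahead of time, then top-k search at question time. This is a toy stand-in, not the real stack: `embed` is a hashed bag-of-words placeholder for an actual embedding model, and the in-memory list stands in for a Qdrant collection. All function names here are hypothetical.

```python
import math
import re
import zlib

DIM = 1024  # toy vector size; a real embedding model defines its own dimension


def embed(text: str) -> list[float]:
    """Toy embedding: hashed bag of words, L2-normalised."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def chunk(text: str, max_words: int = 40) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_index(docs: dict[str, str]) -> list[tuple[str, str, list[float]]]:
    """Ahead of time: chunk each document and embed every chunk."""
    return [
        (name, piece, embed(piece))
        for name, text in docs.items()
        for piece in chunk(text)
    ]


def top_k(index, question: str, k: int = 2) -> list[tuple[float, str, str]]:
    """At question time: embed the question and rank chunks by cosine similarity."""
    q = embed(question)
    # Vectors are unit length, so the dot product equals cosine similarity.
    scored = [
        (sum(a * b for a, b in zip(q, vec)), name, piece)
        for name, piece, vec in index
    ]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]


# Made-up documents, purely for illustration.
docs = {
    "refunds.md": "Refunds are available within 30 days of purchase.",
    "support.md": "Support is open on weekdays from nine to five.",
}
index = build_index(docs)
hits = top_k(index, "When are refunds available?", k=1)
```

In the real stack, `build_index` would become an upsert into Qdrant and `top_k` a vector search against it. The shape of the result (score, source, text) is what the model call needs in order to cite.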
The pieces
- Qdrant — vector search over your chunks.
- Claude API — the answer, constrained to retrieved context.
- Supabase — store documents, users, and chat history.
- LangGraph (optional) — when retrieval becomes multi-step or agentic.
When to use it
Any time the AI must answer from your content, not its training data.
Build it
Start from the chatbot workflow, then add the retrieval step before the model call.
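A minimal sketch of that change, assuming the chatbot workflow boils down to "send the question to the model". The `Retriever` and `Model` signatures are hypothetical, and the stubs at the bottom replace the real Qdrant search and Claude API call so the example runs offline.

```python
from typing import Callable

# Hypothetical signatures: swap in your real retriever and model client.
Retriever = Callable[[str, int], list[tuple[str, str]]]  # (question, k) -> [(source, text)]
Model = Callable[[str], str]  # prompt -> answer


def chatbot(question: str, call_model: Model) -> str:
    """The starting point: the question goes straight to the model."""
    return call_model(question)


def rag_chatbot(question: str, retrieve: Retriever, call_model: Model, k: int = 3) -> dict:
    """The same chatbot with a retrieval step before the model call."""
    hits = retrieve(question, k)
    if not hits:
        # Nothing retrieved: refuse rather than let the model guess.
        return {"answer": "I couldn't find that in the documents.", "sources": []}
    context = "\n".join(f"[{i}] {text}" for i, (_, text) in enumerate(hits, start=1))
    prompt = (
        "Answer using ONLY this context and cite as [n].\n"
        f"{context}\n\nQuestion: {question}"
    )
    return {"answer": call_model(prompt), "sources": [source for source, _ in hits]}


# Stubs so the sketch runs without network access.
def fake_retrieve(question: str, k: int) -> list[tuple[str, str]]:
    return [("refunds.md", "Refunds are available within 30 days.")][:k]


def fake_model(prompt: str) -> str:
    return "Within 30 days [1]." if "30 days" in prompt else "I don't know."


result = rag_chatbot("What is the refund window?", fake_retrieve, fake_model)
empty = rag_chatbot("Anything?", lambda q, k: [], fake_model)
```

Passing the retriever and model in as callables keeps the workflow testable and lets you replace either piece later, for example when retrieval becomes multi-step.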