Legal Q&A RAG Chatbot

AI Engineer

2025–2026

Production-ready RAG chatbot for UK legal queries, processing 131,253+ chunks with sub-3s latency, hybrid retrieval (BM25 + FAISS + RRF), enterprise auth, and guardrails.

Problem

Legal teams need fast, reliable answers grounded in legislation and contracts. Generic LLMs hallucinate and cannot safely reason over large legal corpora without strong retrieval, filtering, and citation enforcement.

Solution

Built a custom RAG pipeline from scratch (no frameworks) around hybrid retrieval: BM25 keyword search and FAISS semantic vector search, combined using Reciprocal Rank Fusion (RRF). Adding cross-encoder reranking (ms-marco-MiniLM-L-6-v2) improved retrieval accuracy by 15–20%.

Developed a FastAPI backend with dual-mode responses: Solicitor Mode for technical output and Public Mode for plain-language explanations. Added guardrails for domain filtering, citation enforcement, and PII redaction. Implemented enterprise authentication (JWT plus OAuth2 via Google, GitHub, and Microsoft) with role-based access control (RBAC).

Built a private document corpus system for user uploads, with public and private retrieval results combined using RRF.

Deployed with Docker, PostgreSQL with Alembic migrations, a Streamlit frontend with protected routes, structured logging, health checks, and metrics collection. Achieved 680+ embeddings/sec throughput through batched embedding requests and delivered 108+ end-to-end tests.
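
The fusion step described above can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not the project's actual code: the function name, the document IDs, and the smoothing constant k = 60 (a commonly used default for RRF) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k = 60 is an assumed default, not a value
    taken from the project.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical chunk IDs returned by the two retrievers.
bm25_hits = ["s12", "s3", "s7"]
faiss_hits = ["s3", "s9", "s12"]

fused = reciprocal_rank_fusion([bm25_hits, faiss_hits])
print(fused)  # ['s3', 's12', 's9', 's7']
```

Because RRF only looks at ranks, BM25 scores and vector similarities never need to be normalised against each other. The same function would also cover the public/private corpus fusion by passing one ranked list per corpus.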

Key Results

  • 131,253+ document chunks indexed
  • Sub-3-second average response latency
  • 15–20% retrieval accuracy improvement via RRF + cross-encoder reranking
  • 680+ embeddings/second throughput via batch processing optimization
  • 40% reduction in hallucinations via citation enforcement + guardrails
  • Enterprise auth: JWT + OAuth2 (Google/GitHub/Microsoft) + RBAC
  • Private document corpus + public/private fusion with RRF
  • PostgreSQL + Alembic migrations; 108+ end-to-end tests
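
As an illustration of the citation-enforcement guardrail listed above, a check of roughly this shape could reject answers that are not grounded in retrieved chunks. The bracketed `[n]` citation format and all function names here are hypothetical; the project's real guardrail may work differently.

```python
import re
from typing import List, Set

# Assumed citation format: "[1]", "[2]", ... referring to retrieved chunks (1-based).
CITATION_PATTERN = re.compile(r"\[(\d+)\]")


def cited_indices(answer: str) -> Set[int]:
    """Collect every chunk index cited in the answer text."""
    return {int(match) for match in CITATION_PATTERN.findall(answer)}


def passes_citation_check(answer: str, retrieved_chunks: List[str]) -> bool:
    """Accept an answer only if it cites at least one chunk and every
    cited index points at a chunk that was actually retrieved."""
    cited = cited_indices(answer)
    if not cited:
        return False  # uncited answers are rejected outright
    return all(1 <= index <= len(retrieved_chunks) for index in cited)


chunks = ["Section 1 text ...", "Section 2 text ..."]  # placeholder chunk texts
print(passes_citation_check("Notice must be given in writing [1].", chunks))  # True
print(passes_citation_check("Notice must be given in writing [3].", chunks))  # False
print(passes_citation_check("Notice must be given in writing.", chunks))      # False
```

A check like this only confirms that citations exist and resolve to retrieved chunks. It does not verify that the cited text supports the claim, which would need a separate entailment or overlap check.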

Tech Stack

Python, FastAPI, Docker, PostgreSQL, Alembic, Streamlit, FAISS, BM25, RRF, Cross-encoder reranking (ms-marco-MiniLM-L-6-v2), OpenAI embeddings/LLM, JWT, OAuth2, RBAC, Structured logging, Health checks, Metrics/monitoring