LlamaIndex legal-kb Reference App Implements Agentic Retrieval Harness with Filesystem-Style Tools
LlamaIndex released legal-kb, a reference application demonstrating a Retrieval Harness for agentic workflows using LlamaIndex Index v2. The system gives agents filesystem-style tools such as grep and read so they can navigate documents step by step rather than relying on naive single-shot Retrieval-Augmented Generation (RAG).
Why it matters
Moving from naive RAG to multi-step agentic retrieval lets LLMs inspect, verify, and cite large documents with higher precision, which can reduce hallucinations in critical domains like legal and finance.
TL;DR
- LlamaIndex released legal-kb, demonstrating a filesystem-style Retrieval Harness for agents.
- The system prompt enforces a rigorous protocol: findFiles, retrieve, and then readFile/grepFile.
- The stack uses TanStack Start, Vercel AI SDK 6, Prisma, WorkOS, and PostgreSQL.
Key facts
- Stack: TanStack Start, AI SDK 6, Prisma, WorkOS, PostgreSQL
- Search Modes: Hybrid semantic, keyword, regex grep, file search
- API Interface: LlamaIndex Index v2 (LlamaCloud)
File-System Operations for Agents
Instead of issuing a single-shot vector query, the legal-kb harness equips agents with four tools: retrieve, findFiles, readFile, and grepFile. Because the tools mimic terminal commands, the agent can search document hierarchies, read a file at a specific offset using readFile (via beta.retrieval.read), or scan a file for a pattern using the regex-based grepFile (backed by beta.retrieval.grep).
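To make the four-tool surface concrete, here is a minimal TypeScript sketch that runs against an in-memory corpus. It is an illustration only: it does not call LlamaCloud or the beta.retrieval endpoints, and the function signatures, the sample files, and the keyword-overlap ranking used for retrieve are all assumptions standing in for the real hybrid search.

```typescript
// Hypothetical in-memory stand-ins for the four tools; nothing here calls LlamaCloud.
type Corpus = Map<string, string>;

interface GrepHit {
  path: string;
  line: number; // 1-based line number within the file
  text: string;
}

// Sample documents invented for this sketch.
const corpus: Corpus = new Map<string, string>([
  ["contracts/msa.txt", "Master Services Agreement\nTermination requires 30 days notice.\nGoverning law: Delaware."],
  ["contracts/nda.txt", "Mutual NDA\nConfidential information excludes public data.\nTerm: 2 years."],
]);

// findFiles: list paths whose name contains the given substring, sorted.
function findFiles(files: Corpus, nameContains: string = ""): string[] {
  return Array.from(files.keys()).filter((p) => p.includes(nameContains)).sort();
}

// readFile: return the character window [offset, offset + length) of one file.
function readFile(files: Corpus, path: string, offset: number = 0, length: number = 200): string {
  const content = files.get(path);
  if (content === undefined) throw new Error(`unknown file: ${path}`);
  return content.slice(offset, offset + length);
}

// grepFile: case-insensitive regex match, evaluated line by line.
function grepFile(files: Corpus, path: string, pattern: string): GrepHit[] {
  const content = files.get(path);
  if (content === undefined) throw new Error(`unknown file: ${path}`);
  const re = new RegExp(pattern, "i");
  return content
    .split("\n")
    .map((text, i) => ({ path, line: i + 1, text }))
    .filter((hit) => re.test(hit.text));
}

// retrieve: naive keyword-overlap ranking, a placeholder for hybrid semantic search.
function retrieve(files: Corpus, query: string, topK: number = 3): string[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return Array.from(files.entries())
    .map(([path, content]) => {
      const lower = content.toLowerCase();
      return { path, score: terms.filter((t) => lower.includes(t)).length };
    })
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((r) => r.path);
}
```

An agent loop would call findFiles to see what exists, retrieve to narrow the candidates, and then grepFile or readFile to confirm exact wording in the files that ranked highest.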
Structured Search Workflows
The reference application uses system instructions to enforce a rigorous retrieval protocol. The agent is directed to first list available files via findFiles to establish an inventory. It then narrows the scope with semantic retrieval and verifies exact wording with readFile and grepFile before generating citations. On the ingestion side, document bytes are pushed to LlamaCloud while corresponding records are written to PostgreSQL through Prisma, and background processes keep index versions in sync.
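The ordering in this protocol can also be expressed as a check over a recorded tool-call trace. The sketch below is hypothetical: legal-kb is described as enforcing the order through its system prompt, not through code like this, and the checkTrace function and its result shape are invented for illustration.

```typescript
// Tool names taken from the article; everything else in this sketch is an assumption.
type ToolName = "findFiles" | "retrieve" | "readFile" | "grepFile";

interface ProtocolCheck {
  ok: boolean;
  reason: string;
}

// Verify that a trace follows: findFiles first, then retrieve,
// then at least one readFile or grepFile call after the first retrieve.
function checkTrace(trace: ToolName[]): ProtocolCheck {
  if (trace.indexOf("findFiles") !== 0) {
    return { ok: false, reason: "trace must start with findFiles" };
  }
  const firstRetrieve = trace.indexOf("retrieve");
  if (firstRetrieve === -1) {
    return { ok: false, reason: "retrieve was never called" };
  }
  const verified = trace.some(
    (tool, i) => i > firstRetrieve && (tool === "readFile" || tool === "grepFile"),
  );
  if (!verified) {
    return { ok: false, reason: "no readFile or grepFile after retrieve" };
  }
  return { ok: true, reason: "protocol satisfied" };
}
```

A check like this could run in an evaluation suite to flag answers whose citations were produced without a verification step, which a prompt alone cannot guarantee.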
Developer Stack
The implementation is built with TanStack Start, Vercel AI SDK 6, Prisma, WorkOS, and PostgreSQL, and stores keys encrypted per user. The app uses the ToolLoopAgent from the Vercel AI SDK, which lets developers swap between OpenAI models (run with medium reasoning effort) and Anthropic models (run with extended thinking) depending on performance and budget needs.
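One way to picture the provider swap is a small function that returns per-provider settings. This is a sketch under stated assumptions, not the app's code: it does not import the AI SDK, the settingsFor helper and ModelSettings type are hypothetical, the model IDs are caller-supplied placeholders, and the thinking budget of 8000 tokens is an arbitrary illustrative value. The option shapes mirror the providerOptions conventions of the AI SDK's OpenAI and Anthropic providers as best understood here, so they should be checked against the SDK version in use.

```typescript
type Provider = "openai" | "anthropic";

// Hypothetical settings object; in a real app these values would be passed to the agent.
interface ModelSettings {
  provider: Provider;
  modelId: string; // placeholder, supplied by the caller
  providerOptions: Record<string, Record<string, unknown>>;
}

// Build settings matching the article: medium reasoning effort for OpenAI,
// extended thinking for Anthropic.
function settingsFor(provider: Provider, modelId: string): ModelSettings {
  if (provider === "openai") {
    return {
      provider,
      modelId,
      providerOptions: { openai: { reasoningEffort: "medium" } },
    };
  }
  return {
    provider,
    modelId,
    // budgetTokens is an assumed value chosen for illustration.
    providerOptions: { anthropic: { thinking: { type: "enabled", budgetTokens: 8000 } } },
  };
}
```

Keeping the choice behind one function means the rest of the agent code does not need to know which provider is active.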