LanceDB MCP server
A maintained MCP server for agentic RAG over a local LanceDB index: hybrid search across a document catalog and its chunks.
This LanceDB MCP server gives an AI agent retrieval-augmented generation (RAG) and hybrid search over documents stored in a local LanceDB index. It is built around a two-level data model: a catalog of document-level summaries and a chunk store of the underlying passages. The agent gets three focused tools: one searches the catalog for relevant documents, one finds the relevant chunks of a specific document, and one finds relevant chunks across all known documents. It can therefore locate the right source first and then drill into the exact passages to ground its answer. Because LanceDB is an embedded, on-disk vector store and embeddings are produced locally, document content never leaves your machine, which makes the server a good fit for private knowledge bases and offline RAG.
The server is published to npm and runs locally over stdio. You launch it with npx (npx lance-mcp PATH_TO_LOCAL_INDEX_DIR), passing the directory that holds your LanceDB index. It uses Ollama for local embedding and summarization (the README pulls snowflake-arctic-embed2 and llama3.1:8b), so Ollama must be running with those models available. You build the index ahead of time with the project's seed command, which embeds a folder of documents into the catalog and chunk tables; from then on the agent queries that index. The project is MIT-licensed and actively maintained.
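Because the server depends on two Ollama models being available, a small preflight check can save a confusing first run. The sketch below is illustrative and not part of the project: it assumes you have already fetched and parsed the response of Ollama's GET /api/tags endpoint (by default at http://localhost:11434/api/tags), and the helper names missingModels and withTag are hypothetical.

```typescript
// Shape of the JSON returned by Ollama's GET /api/tags (only the field used here).
interface OllamaTags {
  models: { name: string }[];
}

// Models the README pulls; adjust if your index was seeded with different ones.
const REQUIRED_MODELS: string[] = ["snowflake-arctic-embed2", "llama3.1:8b"];

// Ollama lists models with an explicit tag (for example "name:latest"),
// so a requirement written without a tag is compared against ":latest".
function withTag(name: string): string {
  return name.includes(":") ? name : `${name}:latest`;
}

// Returns the required models that do not appear in the tags response.
function missingModels(
  tags: OllamaTags,
  required: string[] = REQUIRED_MODELS,
): string[] {
  const installed = new Set(tags.models.map((m) => withTag(m.name)));
  return required.filter((name) => !installed.has(withTag(name)));
}

// Hand-written sample response; in practice, pass the parsed body of /api/tags.
const sampleTags: OllamaTags = {
  models: [{ name: "snowflake-arctic-embed2:latest" }],
};

// Reports llama3.1:8b as missing for the sample above.
console.log(missingModels(sampleTags));
```

If the returned list is non-empty, pull the listed models with Ollama before starting the server.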
Quick install
Copy-paste configs are provided for all 8 supported clients.
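For clients that read an mcpServers-style JSON file, the entry is just the npx command from above split into a command and its arguments. The following is a minimal sketch that generates such an entry; the server label "lancedb" and the helper name buildClientConfig are assumptions for illustration, and the exact file location and schema depend on your client.

```typescript
// Entry for one stdio MCP server in an "mcpServers"-style client config.
interface StdioServerEntry {
  command: string;
  args: string[];
}

interface ClientConfig {
  mcpServers: Record<string, StdioServerEntry>;
}

// Builds the config fragment for a given index directory.
// "lancedb" is an arbitrary label chosen for this sketch.
function buildClientConfig(indexDir: string): ClientConfig {
  return {
    mcpServers: {
      lancedb: {
        command: "npx",
        args: ["lance-mcp", indexDir],
      },
    },
  };
}

// PATH_TO_LOCAL_INDEX_DIR is the placeholder from the text; substitute your own path.
console.log(JSON.stringify(buildClientConfig("PATH_TO_LOCAL_INDEX_DIR"), null, 2));
```

Paste the printed JSON into your client's MCP configuration, merging it with any servers already listed there.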
Available tools
| Tool | Description |
|---|---|
| catalog_search | Searches for relevant documents in the catalog of document-level summaries. |
| chunks_search | Finds relevant chunks within a specific document from the catalog. |
| all_chunks_search | Finds relevant chunks across all known documents. |
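Under the hood, an MCP client invokes any of these tools by sending a JSON-RPC tools/call request over the stdio transport, one JSON message per line. The sketch below builds such a request to show the shape. The tool names come from the table above, but the argument key "text" is an assumption; the authoritative input schema is whatever the server reports in response to tools/list.

```typescript
// Tool names exposed by the server, as listed in the table above.
type ToolName = "catalog_search" | "chunks_search" | "all_chunks_search";

// JSON-RPC 2.0 request shape used by MCP for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

// Builds a tools/call request; the caller chooses a unique id per request.
function toolCallRequest(
  id: number,
  name: ToolName,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// "text" is an assumed argument name; check the schema from tools/list.
const catalogRequest = toolCallRequest(1, "catalog_search", {
  text: "data retention policy",
});

// Serialized on a single line, as the stdio transport expects.
console.log(JSON.stringify(catalogRequest));
```

In normal use your MCP client constructs these messages for you; the sketch is only meant to make the wire format concrete.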
What you can do with it
Offline RAG over private documents
Seed a local LanceDB index from a folder of documents, then let the agent search the catalog and pull the relevant chunks to ground its answers. Everything stays on disk, and embeddings are computed locally via Ollama, so nothing leaves your machine.
Two-stage retrieval
The agent first finds the right document with catalog_search, then narrows to the exact passages with chunks_search, or searches every document at once with all_chunks_search for broader questions.
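The two-stage flow can be sketched as a small routine. In practice the agent decides which tool to call, so this hard-coded version is only an illustration. CallTool is a hypothetical stand-in for an MCP client, and the argument keys "text" and "source", as well as the assumption that results come back as plain strings, are not confirmed by the source.

```typescript
// Hypothetical stand-in for an MCP client: tool name plus arguments in, text results out.
type CallTool = (name: string, args: Record<string, string>) => Promise<string[]>;

interface Retrieval {
  stage: "document" | "all";
  passages: string[];
}

async function retrieve(callTool: CallTool, question: string): Promise<Retrieval> {
  // Stage 1: look for candidate documents in the catalog.
  const docs = await callTool("catalog_search", { text: question });
  const top = docs[0];
  if (top === undefined) {
    // No clear source document: fall back to searching every chunk.
    const passages = await callTool("all_chunks_search", { text: question });
    return { stage: "all", passages };
  }
  // Stage 2: drill into the top-ranked document only.
  const passages = await callTool("chunks_search", { text: question, source: top });
  return { stage: "document", passages };
}

// Fake client so the sketch runs without a server.
const fakeCallTool: CallTool = async (name, args) => {
  if (name === "catalog_search") return ["handbook.pdf"];
  if (name === "chunks_search") return [`passage from ${args["source"]}`];
  return ["passage from any document"];
};

retrieve(fakeCallTool, "What is the vacation policy?").then((result) => {
  console.log(result.stage, result.passages);
});
```

Falling back to all_chunks_search when the catalog returns nothing is one possible policy; an agent may equally choose the broad search up front for questions that span several documents.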
FAQ
- Is it free?
- Yes. The server is open source under the MIT license, LanceDB is an embedded open-source vector store, and the local Ollama models are free to run. There are no hosted costs unless you choose to add them.
- Does it support remote/OAuth?
- No. It runs locally over stdio (via npx) against an on-disk LanceDB index and uses local Ollama models for embeddings, so there is no hosted endpoint and nothing to authenticate.
- Do I need to prepare the index first?
- Yes. You build the index ahead of time with the project's seed command, which embeds a directory of documents into the catalog and chunk tables. You also need Ollama running with the embedding and summarization models the README specifies.