Best MCP servers for vector search & RAG

Retrieval-augmented generation depends on a vector database: you embed your documents, store the vectors, and at query time pull back the most semantically relevant chunks to ground the model's answer. A vector MCP server lets an agent write to and query that database directly, whether you want a tiny semantic-memory layer or full control over collections, metadata filters, and reranking. The servers below cover three leading vector stores, and each is the vendor's official server. The right pick depends on whether you are running managed cloud, self-hosting, or want an embedded local database. Match the server to where your vectors live; each ships a verified, current install config.
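To make the embed, store, and retrieve loop concrete, here is a minimal sketch in plain Python. It is not any vendor's API: `embed`, `cosine`, and `ToyVectorStore` are hypothetical names, and the "embedding" is a bag-of-words count standing in for a real embedding model. It only illustrates the shape of the work that a vector MCP server hands off to a real database.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store; a real deployment would use a vector database."""

    def __init__(self) -> None:
        self._rows = []  # list of (vector, original chunk)

    def store(self, chunk: str) -> None:
        self._rows.append((embed(chunk), chunk))

    def retrieve(self, query: str, k: int = 2) -> list:
        q = embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


store = ToyVectorStore()
store.store("Qdrant stores vectors and payloads")
store.store("Bananas are rich in potassium")
store.store("Vector databases return the nearest vectors to a query")

# The retrieved chunks would be pasted into the prompt to ground the answer.
top = store.retrieve("which databases store vectors", k=2)
print(top)
```

With a real embedding model the ranking reflects meaning rather than shared words, but the store-then-retrieve shape stays the same.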

Top pick

Qdrant

Official

Qdrant's official MCP server: a semantic memory layer that stores and retrieves information from a Qdrant vector database.

vector-search

Qdrant's official server is a deliberately small semantic-memory layer: two tools that store and retrieve information in a Qdrant database. It is ideal as a drop-in agent memory or a simple RAG back end.

Qdrant for vector search & RAG →

Pick 2

Pinecone

Official

Pinecone's official developer MCP server: search indexes, manage records, rerank results, and look up Pinecone docs from your agent.

vector-search

Pinecone's official developer server searches indexes, manages records, and reranks results. It is the choice when your RAG pipeline runs on Pinecone's managed cloud.

Pinecone for vector search & RAG →

Pick 3

Chroma

Official

Chroma's official MCP server: manage collections and run semantic, metadata, and full-text search over a Chroma vector database.

vector-search

Chroma's official server exposes the full database surface: it manages collections and runs semantic, metadata, and full-text search. It is well suited to self-hosted or embedded local RAG.

Chroma for vector search & RAG →
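The difference between semantic, metadata, and full-text search is easiest to see when all three are combined in one query. The sketch below is a plain-Python illustration, not Chroma's API: `Record` and `search` are hypothetical names, and the "semantic" score is a simple term-overlap count standing in for vector similarity. Metadata and full-text conditions act as filters, and the similarity score then orders whatever survives.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    text: str
    metadata: dict = field(default_factory=dict)


def search(records, query_terms, where=None, contains=None, k=3):
    """Filter by metadata equality and substring match, then rank by term overlap."""
    where = where or {}
    candidates = [
        r
        for r in records
        # Metadata filter: every requested key must match exactly.
        if all(r.metadata.get(key) == value for key, value in where.items())
        # Full-text filter: case-insensitive substring match on the document text.
        and (contains is None or contains.lower() in r.text.lower())
    ]

    def score(r):
        # Toy stand-in for semantic similarity: number of shared words.
        return len(set(r.text.lower().split()) & set(query_terms))

    return sorted(candidates, key=score, reverse=True)[:k]


records = [
    Record("Reset your password from the account page", {"source": "faq"}),
    Record("Password rotation policy for admins", {"source": "policy"}),
    Record("Shipping times for international orders", {"source": "faq"}),
]

hits = search(
    records,
    {"password", "reset"},
    where={"source": "faq"},
    contains="password",
)
print([h.text for h in hits])
```

Dropping `contains` widens the candidate set to every FAQ record, and the ranking step then decides which of them comes first.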