What is approximate nearest neighbor (ANN)?

Approximate nearest neighbor is a class of algorithms that find the vectors most similar to a query quickly by trading a little accuracy for huge speed gains, making vector search practical at scale.

Approximate nearest neighbor (ANN) search is how vector databases query billions of embeddings quickly. An exact nearest-neighbor search compares the query against every stored vector, which is accurate but linear in the dataset size, and far too slow when a large corpus has to be searched in milliseconds. ANN algorithms build an index that lets the system find almost all of the truly closest vectors while examining only a small fraction of the data, accepting that it may occasionally miss a true neighbor in exchange for orders-of-magnitude speedups. One of the most widely used approaches is HNSW (Hierarchical Navigable Small World), a layered graph that is traversed greedily toward the query; other families include IVF (inverted file partitioning), which searches only the partitions nearest the query, and product quantization, which compresses vectors to reduce memory use. Each exposes a recall-versus-speed knob: tune it for higher recall and you pay in latency and memory, or give up some recall to get faster, cheaper queries. This is the core machinery beneath semantic search and RAG retrieval, and it is what dedicated vector stores like Pinecone, Qdrant, and Weaviate are built around, so an agent calling them through an MCP tool gets fast similarity lookups without managing the index itself.
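
To make the recall-versus-speed trade-off concrete, here is a minimal sketch in plain Python that compares an exact brute-force search with a toy IVF-style index. It is an illustration under stated assumptions, not how any particular vector database implements IVF: the centroids are simply randomly chosen stored vectors (real systems typically train them with k-means), the data is random, and the names `build_ivf`, `ivf_knn`, `n_lists`, and `nprobe` are chosen for this example only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_knn(query, vectors, k):
    """Brute force: compare the query against every stored vector."""
    order = sorted(range(len(vectors)), key=lambda i: sq_dist(query, vectors[i]))
    return order[:k]


def build_ivf(vectors, n_lists, rng):
    """Toy IVF index. Assumption: centroids are random stored vectors,
    not k-means centroids. Each vector goes into its nearest centroid's list."""
    centroids = rng.sample(vectors, n_lists)
    lists = [[] for _ in range(n_lists)]
    for i, v in enumerate(vectors):
        nearest = min(range(n_lists), key=lambda c: sq_dist(v, centroids[c]))
        lists[nearest].append(i)
    return centroids, lists


def ivf_knn(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe partitions whose centroids are closest to the query.
    Returns the top-k ids found and the number of vectors actually scanned."""
    by_centroid = sorted(range(len(centroids)),
                         key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in by_centroid[:nprobe] for i in lists[c]]
    candidates.sort(key=lambda i: sq_dist(query, vectors[i]))
    return candidates[:k], len(candidates)


def recall(approx_ids, exact_ids):
    """Fraction of the true nearest neighbors that the approximate search found."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


rng = random.Random(0)  # fixed seed so runs are repeatable
vectors = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(2000)]
query = [rng.gauss(0, 1) for _ in range(8)]

centroids, lists = build_ivf(vectors, n_lists=20, rng=rng)
truth = exact_knn(query, vectors, k=10)

# nprobe is the knob: more partitions scanned means more work and higher recall.
for nprobe in (1, 5, 20):
    found, scanned = ivf_knn(query, vectors, centroids, lists, k=10, nprobe=nprobe)
    print(f"nprobe={nprobe:2d} scanned={scanned:4d} recall={recall(found, truth):.2f}")
```

With `nprobe` equal to `n_lists` every partition is scanned, so the search degenerates to the exact one and recall is 1.0; smaller values scan fewer vectors and may miss some true neighbors. HNSW exposes the same kind of trade-off through its own search parameters rather than a partition count.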