What is an embedding?

An embedding is a vector of numbers that captures the meaning of a piece of text or other data, positioning semantically similar items close together so software can compare them by similarity.

An embedding is the output of a model that maps a piece of content (a sentence, a document chunk, an image) to a fixed-length vector of numbers, often hundreds or thousands of dimensions long. The key property is geometric: items with similar meaning land near each other in this high-dimensional space and dissimilar items land far apart, with closeness usually measured by cosine similarity. This is what makes meaning computable.

To build semantic search or retrieval-augmented generation, you embed your documents once and store the vectors. At query time you embed the question with the same model and find the stored vectors nearest to it, which surfaces the most relevant passages even when no keywords match. Embeddings are produced by dedicated embedding models (separate from the chat model) and are the unit a vector database is built to store and index. Because comparisons only make sense between vectors from the same model, you must embed queries with the same model you used for your corpus; switching models means re-embedding the corpus. In agent systems, embeddings underpin how a memory or documentation store decides which snippets are relevant to pull back into context.
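
As a rough illustration of the query-time lookup described above, the sketch below ranks stored vectors by cosine similarity to a query vector. Everything in it is an assumption made for illustration: the three-dimensional vectors are hand-written stand-ins for real model output (which would have hundreds or thousands of dimensions), and the names `cosine_similarity`, `nearest`, `STORE`, and `QUERY` are hypothetical rather than part of any library. A real system would get its vectors from an embedding model and would usually delegate the search to a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        # Different lengths usually mean the vectors came from different models.
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def nearest(query_vec, store, k=2):
    """Return the k (text, score) pairs most similar to query_vec."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors, written by hand; a real embedding model would produce these.
STORE = {
    "How to reset your password": [0.9, 0.1, 0.0],
    "Steps to recover account access": [0.8, 0.3, 0.1],
    "Quarterly revenue report": [0.0, 0.2, 0.9],
}

# Stand-in for the embedding of a question such as "I forgot my login".
QUERY = [0.85, 0.2, 0.05]

if __name__ == "__main__":
    for text, score in nearest(QUERY, STORE):
        print(f"{score:.3f}  {text}")
```

With these toy values, the two account-related entries rank above the revenue report even though the example question shares no keywords with them, which is the behavior the paragraph describes. The brute-force scan over every stored vector is only workable for small collections; indexing for larger ones is the job of a vector database.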