MCP glossary

Plain-English definitions of the terms you'll meet across the Model Context Protocol ecosystem.

Agent memory

Agent memory is persistent context that an AI agent can write to and read back across sessions, so it remembers facts, decisions, and preferences instead of starting cold every conversation.

Agent orchestration

Agent orchestration is the coordination of multiple AI agents or steps toward a goal, deciding which agent or tool runs when, how results pass between them, and how shared state and memory are kept in sync.

Agentic workflow

An agentic workflow is a multi-step process driven by an AI agent that chains tool calls, decisions, and intermediate results to accomplish a task, rather than relying on a single model response.

AI agent

An AI agent is a system built around a language model that can pursue a goal over multiple steps, deciding which tools to call, observing results, and adjusting, rather than producing a single one-shot answer.

Bearer token

A bearer token is a credential that grants access to whoever holds it, sent in the HTTP Authorization header; some remote MCP servers accept one as a simpler alternative to a full OAuth flow.
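
As a rough illustration, the header has the form "Authorization: Bearer <token>". The sketch below builds such a request with Python's standard library without sending it; the URL and token are placeholders, not a real endpoint or credential.

```python
import urllib.request

# Hypothetical values for illustration only.
SERVER_URL = "https://mcp.example.com/mcp"
TOKEN = "example-token-123"

def build_request(url: str, token: str) -> urllib.request.Request:
    """Return a POST request that carries the token in the Authorization header."""
    return urllib.request.Request(
        url,
        data=b"{}",
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Built but never sent: anyone who can read this header can reuse the token.
request = build_request(SERVER_URL, TOKEN)
```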

Capability negotiation (MCP)

Capability negotiation is the MCP initialization handshake where client and server each declare which features they support, so both sides only use functionality the other side actually implements.
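
A minimal sketch of the handshake as Python dictionaries may help. The field names mirror the MCP initialize exchange; the version string, client and server names, and the declared capabilities are illustrative values, and server_supports is a hypothetical helper.

```python
# Client -> server: the first request of a session.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-06-18",  # example value
        "capabilities": {"roots": {"listChanged": True}, "sampling": {}},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}

# Server -> client: a made-up reply that declares only tools and resources.
initialize_result = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "protocolVersion": "2025-06-18",
        "capabilities": {"tools": {"listChanged": True}, "resources": {}},
        "serverInfo": {"name": "example-server", "version": "0.1.0"},
    },
}

def server_supports(result: dict, feature: str) -> bool:
    """True if the server declared the feature during initialization."""
    return feature in result["result"]["capabilities"]
```

With this reply, a client would go on to list tools but would not ask for prompts, because the server never declared them.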

Chunking

Chunking is splitting a large document into smaller passages before embedding it, so retrieval can return focused, relevant pieces that fit a model's context window instead of whole files.
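
One simple way to do this is fixed-size chunks with a little overlap, sketched below. Splitting by characters is an assumption made for brevity; real pipelines often split on tokens, sentences, or headings instead.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of at most `size` characters.

    Consecutive chunks share `overlap` characters so a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks
```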

Coding agent

A coding agent is an AI agent specialized for software work; it reads a codebase, edits files, runs commands and tests, and iterates toward a goal, usually inside an IDE or terminal.

Context engineering

Context engineering is the practice of deliberately curating what goes into a model's context window (instructions, tools, retrieved data, and memory) so the model has exactly what it needs and nothing that distracts it.

Context window

A context window is the maximum amount of text, measured in tokens, that a language model can consider at once, covering the prompt, conversation history, retrieved data, and the model's own output.
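
Because the window is finite, applications usually trim what they send. The sketch below keeps the most recent messages that fit a token budget; the four-characters-per-token estimate is a crude assumption standing in for a real tokenizer, and both function names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: roughly four characters per token."""
    return max(1, len(text) // 4)

def fit_to_window(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose estimated total fits the budget."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order
```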

Dynamic Client Registration

Dynamic Client Registration (DCR) is the OAuth mechanism that lets an MCP client register itself with a server's authorization server at runtime, so users do not have to manually create client credentials.

Elicitation (MCP)

Elicitation is a Model Context Protocol feature that lets a server pause mid-operation to ask the user for specific structured input, rather than failing or guessing when it needs more information.

Embedding

An embedding is a vector of numbers that captures the meaning of a piece of text or other data, positioning semantically similar items close together so software can compare them by similarity.
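
Closeness is commonly measured with cosine similarity. The sketch below uses invented three-dimensional vectors purely for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors, made up for this example.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]
```

Here cosine_similarity(cat, kitten) comes out much higher than cosine_similarity(cat, invoice), which is the property similarity search relies on.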

Episodic memory

Episodic memory is an agent's record of specific past events (what happened in a particular session, when, and in what order) so it can recall and learn from concrete experiences rather than only general facts.

FastMCP

FastMCP is a Python (and TypeScript) framework that lets you build MCP servers by decorating ordinary functions, handling the protocol's transport, schema generation, and lifecycle so you write tools instead of plumbing.

Function calling

Function calling is the model-API feature that lets a language model return a structured request to invoke a named function with JSON arguments, instead of plain text; it is the foundation tool calling builds on.

Hybrid search

Hybrid search combines semantic (vector) search with keyword (lexical) search and merges the results, capturing both meaning-based matches and exact terms like product names or error codes.
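
The merge step can be done in several ways. The sketch below uses reciprocal rank fusion, one common choice rather than the only one; the document IDs are invented, and k=60 is a conventional default, not a requirement.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

semantic_hits = ["doc_a", "doc_b", "doc_c"]  # from vector search
keyword_hits = ["doc_c", "doc_a", "doc_d"]   # from lexical search
merged = reciprocal_rank_fusion([semantic_hits, keyword_hits])
```

Documents that both retrievers return (doc_a and doc_c here) rise above those found by only one.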

JSON-RPC

JSON-RPC is a lightweight remote-procedure-call protocol that encodes requests and responses as JSON objects; the Model Context Protocol uses JSON-RPC 2.0 as its wire format.

Knowledge graph

A knowledge graph stores information as entities (nodes) and the relationships (edges) between them, letting an agent traverse connections, like which person owns which service, rather than just matching text.
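
The sketch below stores a tiny graph as (subject, relation, object) triples and follows two hops of relationships. The entities, relation names, and helper functions are all invented for illustration; a real system would use a graph database or library.

```python
# Each triple is one edge: subject --relation--> object.
TRIPLES = [
    ("alice", "owns", "billing-service"),
    ("billing-service", "depends_on", "postgres"),
    ("bob", "owns", "postgres"),
]

def neighbors(entity: str, relation: str) -> list[str]:
    """Objects reachable from `entity` along edges labeled `relation`."""
    return [obj for subj, rel, obj in TRIPLES if subj == entity and rel == relation]

def owners_of_dependencies(person: str) -> list[str]:
    """Who owns the things that this person's services depend on?"""
    owners = set()
    for service in neighbors(person, "owns"):
        for dependency in neighbors(service, "depends_on"):
            owners.update(
                subj for subj, rel, obj in TRIPLES
                if rel == "owns" and obj == dependency
            )
    return sorted(owners)
```

A text match on "alice" would never surface "bob"; walking the edges does.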

Local MCP server

A local MCP server runs as a process on your own machine, usually launched by the host over the stdio transport, so it can touch local files, a local Git checkout, or databases on your network.

Long-term memory (agents)

Long-term memory is the durable store an AI agent writes facts and experiences to so they survive across sessions and can be retrieved back into context only when a later task needs them; it is the counterpart to the transient context window.

MCP client

An MCP client is the connector inside an AI application, such as Claude Code, Cursor, or VS Code, that connects to an MCP server, discovers its tools, and lets the model call them on the user's behalf; in everyday usage the application itself is often called the client.

MCP gateway

An MCP gateway is a proxy that sits between agents and many MCP servers, presenting one endpoint while it handles routing, authentication, access control, and observability for the servers behind it.

MCP host

An MCP host is the application a user actually interacts with, like Claude Desktop, Cursor, or an IDE, that embeds one or more MCP clients and lets the model use connected servers.

MCP Inspector

MCP Inspector is the official developer tool for testing MCP servers: it connects to a server, lists its tools, resources, and prompts, and lets you invoke them interactively to debug behavior.

MCP prompt

An MCP prompt is a reusable, parameterized message template an MCP server offers, typically surfaced as a slash command or menu item the user picks to kick off a structured task.

MCP registry

An MCP registry is a catalog of available MCP servers, with metadata like install commands and capabilities, that helps users and hosts discover, vet, and connect to servers without hunting across repos.

MCP resource

An MCP resource is read-only data an MCP server exposes by URI, like a file, a database row, or a document, that the host can load into the model's context without the model taking an action.

MCP roots

Roots are a Model Context Protocol primitive where the client tells the server which filesystem or URI boundaries it is allowed to operate within, scoping a server's access to a defined set of locations.

MCP sampling

Sampling is a Model Context Protocol feature that lets a server request a completion from the client's language model, so the server can use the model's reasoning without holding its own API key.

MCP server

An MCP server is a program that exposes tools, resources, and prompts to AI agents over the Model Context Protocol, giving a model a uniform way to read data or take actions in an external system.

MCP session

An MCP session is a single stateful connection between a client and server, from the initialize handshake to disconnect, over which negotiated capabilities, context, and requests persist.

MCP tool

An MCP tool is a named, schema-described action that an MCP server exposes for a model to call, like creating an issue or running a query; the model invokes it and the server runs the work.
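
As a sketch, a tool definition and a call to it might look like the dictionaries below. The name, description, and inputSchema fields and the tools/call method follow the MCP tool shape; the create_issue tool itself and the missing_arguments helper are hypothetical.

```python
# What a server might advertise for one tool.
create_issue_tool = {
    "name": "create_issue",
    "description": "Create an issue in a tracker.",
    "inputSchema": {  # JSON Schema describing the arguments
        "type": "object",
        "properties": {"title": {"type": "string"}},
        "required": ["title"],
    },
}

# What a client sends when the model decides to invoke it.
call_request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {"name": "create_issue", "arguments": {"title": "Fix login"}},
}

def missing_arguments(tool: dict, arguments: dict) -> list[str]:
    """Required argument names that the call did not supply."""
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]
```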

mcp-remote

mcp-remote is a bridge utility that lets MCP hosts which only speak the local stdio transport connect to remote, OAuth-protected MCP servers, handling the HTTP transport and sign-in flow on their behalf.

mcpServers config

The mcpServers config is the JSON block, used by Claude Desktop, Cursor, Cline, and others, that registers MCP servers with a client by naming each one and giving its command, args, env, or URL.
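
The sketch below builds such a block in Python and serializes it to JSON. The server names, package name, environment variable, and URL are placeholders, and whether a given client accepts a url entry varies, so treat the exact shape as an assumption to check against that client's documentation.

```python
import json

config = {
    "mcpServers": {
        # A local server launched as a subprocess.
        "local-example": {
            "command": "npx",
            "args": ["-y", "some-mcp-server"],
            "env": {"EXAMPLE_API_KEY": "replace-me"},
        },
        # A remote server reached by URL (support depends on the client).
        "remote-example": {"url": "https://mcp.example.com/mcp"},
    }
}

config_json = json.dumps(config, indent=2)  # the text that goes in the config file
```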

Memory store

A memory store is the durable backend where an AI agent's long-term memory actually lives: the database or service that persists facts and observations and serves the relevant ones back into context on demand.

Model Context Protocol (MCP)

The Model Context Protocol (MCP) is an open standard that lets AI applications connect to external tools, data, and services through a uniform interface, so any compliant client can use any compliant server.

Multi-agent system

A multi-agent system is an AI setup where several agents, often specialized, work together on a task, dividing the work, passing results to one another, and ideally sharing memory so their understanding stays consistent.

npx

npx is the Node.js package runner that downloads and executes an npm package in one step, which is why many local MCP servers are launched with a command like npx -y some-mcp-server.

OAuth for MCP

OAuth for MCP is how remote MCP servers authorize users: the spec adopts OAuth 2.1 so each user signs in and grants scoped access, instead of pasting a long-lived secret into a config file.

Persistent memory

Persistent memory is information an AI agent stores durably so it survives across sessions, letting the agent recall earlier facts and decisions instead of losing everything when the conversation ends.

PKCE

PKCE (Proof Key for Code Exchange) is an OAuth extension, defined in RFC 7636 and mandatory in OAuth 2.1, that stops stolen authorization codes from being redeemed, by binding the code to a secret the client proves it knows at token exchange.
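
The binding works through a verifier and challenge pair, sketched below with the S256 method from RFC 7636. The function names are hypothetical, and server_accepts only imitates the check an authorization server would perform.

```python
import base64
import hashlib
import secrets

def make_code_verifier() -> str:
    """Random secret the client keeps; 32 bytes encode to 43 URL-safe characters."""
    return secrets.token_urlsafe(32)

def make_code_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) with the padding removed."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

def server_accepts(verifier: str, stored_challenge: str) -> bool:
    """Stand-in for the server-side check made at token exchange."""
    return secrets.compare_digest(make_code_challenge(verifier), stored_challenge)

verifier = make_code_verifier()             # stays on the client
challenge = make_code_challenge(verifier)   # sent with the authorization request
```

An attacker who intercepts the authorization code sees only the challenge, and cannot produce the verifier needed to redeem it.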

Progress notification (MCP)

A progress notification is an MCP message a server sends during a long-running operation to report incremental progress, so the client can show status instead of waiting blindly for the final result.

RAG vs MCP

RAG and MCP solve different layers: RAG is a technique for retrieving relevant text and injecting it into a prompt, while MCP is a protocol for connecting models to tools and data sources, including RAG retrievers.

Remote MCP server

A remote MCP server runs as a hosted service at a URL and connects over Streamable HTTP, usually with OAuth, so multiple users and machines can share one always-on integration.

Reranking

Reranking is a second retrieval pass that reorders an initial set of candidate results by relevance using a more accurate model, so the best passages rise to the top before they reach the agent.

Retrieval-augmented generation (RAG)

RAG is a technique that retrieves relevant passages from an external knowledge source and inserts them into the model's prompt, so the answer is grounded in your data rather than only the model's training.
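
The two steps, retrieve then insert, can be sketched as below. Word overlap is used as a deliberately simple stand-in for embedding-based retrieval, and the passages, prompt wording, and function names are invented for the example.

```python
def retrieve(query: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Rank passages by how many query words they share (toy retriever)."""
    query_terms = set(query.lower().split())

    def overlap(passage: str) -> int:
        return len(query_terms & set(passage.lower().split()))

    return sorted(passages, key=overlap, reverse=True)[:top_k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Insert the retrieved passages into the prompt ahead of the question."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

PASSAGES = [
    "Refunds are processed within 5 days.",
    "The office is closed on Sundays.",
    "Refunds require a receipt.",
]
question = "how are refunds processed"
prompt = build_prompt(question, retrieve(question, PASSAGES))
```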

Semantic memory

Semantic memory is an agent's store of general, timeless knowledge (facts, conventions, preferences, and how things work) abstracted away from when or how it was learned, so the agent knows things rather than just recalling events.

Semantic search

Semantic search finds results by meaning rather than exact keywords, comparing vector embeddings of the query and documents so it surfaces relevant matches even when the wording differs.

Shared agent memory

Shared agent memory is a memory store that multiple agents or teammates read from and write to in common, so knowledge one agent learns becomes available to every other agent on the team.

Short-term memory (agents)

Short-term memory is the recent context an AI agent holds during the current session (the conversation so far and the latest tool results); it lives in the context window and is lost when the session ends.

SSE transport

SSE transport is the older MCP remote transport that paired Server-Sent Events for server-to-client streaming with HTTP POST for client requests; it has been superseded by Streamable HTTP.

stdio transport

The stdio transport runs an MCP server as a local subprocess and exchanges protocol messages over its standard input and output streams; it is the default way to run a local MCP server.

Streamable HTTP

Streamable HTTP is the MCP transport for remote servers, carrying protocol messages over HTTP with streaming responses; it superseded the older HTTP+SSE transport.

Structured output

Structured output is machine-readable data returned in a defined shape, such as JSON validated against a schema, so a program or agent can parse it reliably instead of scraping free-form text.
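
A minimal sketch of the parse-and-check step follows. The two required fields are invented, and the hand-written type check stands in for a real JSON Schema validator.

```python
import json

# Hypothetical shape the caller expects: field name -> required Python type.
REQUIRED_FIELDS = {"title": str, "priority": int}

def parse_structured(raw: str) -> dict:
    """Parse JSON text and reject anything that does not match the expected shape."""
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"field {field!r} must be {expected_type.__name__}")
    return data
```

A caller either gets a dictionary it can trust or a ValueError it can act on, instead of guessing at free-form text.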

Tool calling

Tool calling is the pattern where a language model, given a set of described tools, decides to invoke one with structured arguments; the system runs it and feeds the result back into the conversation.
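
The run-and-feed-back half of that loop can be sketched as below. The get_weather tool, the JSON shape of the model's request, and the shape of the returned message are all assumptions for illustration; each model API defines its own format.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical tool that returns canned data."""
    return f"Sunny in {city}"

# Registry mapping tool names to the functions that implement them.
TOOLS = {"get_weather": get_weather}

def run_tool_call(raw_call: str) -> dict:
    """Execute the tool the model asked for and wrap the result as a message."""
    call = json.loads(raw_call)
    function = TOOLS[call["name"]]  # KeyError if the model names an unknown tool
    result = function(**call["arguments"])
    return {"role": "tool", "name": call["name"], "content": result}

# Pretend the model returned this structured request instead of plain text.
model_output = '{"name": "get_weather", "arguments": {"city": "Oslo"}}'
tool_message = run_tool_call(model_output)
```

The tool_message would then be appended to the conversation so the model can use the result in its next turn.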

uvx

uvx is the package runner from the uv Python toolchain that fetches and runs a Python tool in an ephemeral environment, making it the common way to launch Python-based local MCP servers.

Vector database

A vector database stores data as high-dimensional embeddings and finds items by similarity rather than exact match, making it the storage layer behind semantic search and retrieval-augmented generation.

Well-known URI

A well-known URI is a standardized path under /.well-known/ where a server publishes metadata for automatic discovery; MCP clients fetch these to learn a remote server's OAuth configuration.
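
As a rough sketch, a client can derive the metadata URL from the server's address. The oauth-authorization-server and oauth-protected-resource suffixes are registered well-known names; the host is a placeholder, and this helper ignores any path on the server URL, which is a simplification of the full discovery rules.

```python
from urllib.parse import urlsplit, urlunsplit

def well_known_url(server_url: str, suffix: str) -> str:
    """Return the /.well-known/ URL for `suffix` on the server's origin."""
    parts = urlsplit(server_url)
    return urlunsplit((parts.scheme, parts.netloc, f"/.well-known/{suffix}", "", ""))

# Placeholder server address.
metadata_url = well_known_url("https://mcp.example.com/mcp", "oauth-authorization-server")
resource_url = well_known_url("https://mcp.example.com/mcp", "oauth-protected-resource")
```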

Working memory

Working memory is the information an AI agent actively holds for the task in front of it (the current goal, recent steps, and tool results), kept live in the context window and discarded once the task is done.