Shared memory for Llama Stack

Llama Stack standardizes the building blocks of an agent application (inference, safety, tool calling, agents, and a memory API) behind a single set of provider-agnostic interfaces, so you can move between local and hosted backends without rewriting your app. The memory API gives you a clean way to attach vector stores and retrieval to an agent, but it leaves the hard part to you: what gets remembered, when, and whether that knowledge is shared with the rest of your fleet rather than trapped in one agent's index. Glen, shared memory for AI agents, gives Llama Stack agents a durable, organization-wide memory as a single MCP tool, so retrieval and writing both happen in one round trip against a store every agent shares.

Expose Glen to your Llama Stack agents over MCP and a turn can call one tool that retrieves relevant long-term context and records new observations together. Before the agent reasons, it pulls what the organization already knows; after it acts, it writes back what it learned. Instead of provisioning and tuning a vector-store provider, deciding on chunking and embeddings, and hand-building an ingestion path just to give the agent persistent memory, you point it at one MCP server and let Glen handle relevance and storage. The memory API still has its place for document retrieval, but Glen is the durable, evolving fact base your agents accumulate over time.
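To make the single round trip concrete, here is a minimal sketch of the request an MCP client would send. Only the JSON-RPC 2.0 envelope and the `tools/call` method come from the MCP specification; the tool name `glen_memory` and the argument names `query` and `observations` are assumptions made for illustration and may not match Glen's actual tool schema.

```python
import json
from itertools import count

# JSON-RPC requires an id per request; a simple counter is enough for a sketch.
_request_ids = count(1)


def build_memory_call(query, observations, tool_name="glen_memory"):
    """Build an MCP `tools/call` request that retrieves and records in one call.

    `tool_name` and the argument keys are hypothetical placeholders,
    not Glen's documented schema.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",  # standard MCP method for invoking a tool
        "params": {
            "name": tool_name,
            "arguments": {
                "query": query,                      # what to retrieve (assumed key)
                "observations": list(observations),  # what to record (assumed key)
            },
        },
    }


# Illustrative values only.
request = build_memory_call(
    query="deployment policy for the billing service",
    observations=["Billing deploys are frozen on Fridays."],
)
print(json.dumps(request, indent=2))
```

In practice the agent runtime builds and sends this message for you once the tool is registered; the sketch only shows that retrieval input and new observations can travel in the same call.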

The decisive difference is scope. A Llama Stack memory bank is attached to an agent or a session; Glen is org-scoped, so the same memory spans every agent, every Llama Stack deployment, and every other framework your organization runs. One agent learns a durable fact and a completely different agent reads it next time, with no copying of indexes between processes. Because Glen speaks standard MCP, the knowledge your Llama Stack agents write is equally readable from Claude Code, Cursor, or any MCP client, so human and agent work share one source of truth. Connect once over OAuth or an API key and the organizational memory compounds across every run.

FAQ

How is Glen different from Llama Stack's built-in memory API?
Llama Stack's memory API gives you provider-agnostic document retrieval scoped to an agent or session. Glen is durable, org-shared long-term memory that every agent in your organization reads and writes, and that combines retrieval and writing in one MCP call.

How do Llama Stack agents reach Glen?
Glen is a standard MCP server. Register it as an MCP tool for your agent and call it during a turn to retrieve context and record observations in a single round trip.
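
As a rough sketch of the registration step, the snippet below assumes the `llama-stack-client` Python package exposes `client.toolgroups.register` with `toolgroup_id`, `provider_id`, and `mcp_endpoint` parameters, as its documentation has shown for MCP servers; check this against the version you run. The toolgroup id `mcp::glen` and the endpoint URL are hypothetical placeholders, not Glen's real values.

```python
def glen_toolgroup_kwargs(endpoint_url, toolgroup_id="mcp::glen"):
    """Arguments for registering an MCP server as a Llama Stack toolgroup.

    `toolgroup_id` is a hypothetical name; `endpoint_url` must be the real
    address of your Glen MCP server.
    """
    return {
        "toolgroup_id": toolgroup_id,
        "provider_id": "model-context-protocol",  # assumed MCP provider id
        "mcp_endpoint": {"uri": endpoint_url},
    }


def register_glen(client, endpoint_url):
    """Register Glen on an existing client and return the toolgroup id.

    Assumes `client.toolgroups.register(...)` exists with these keyword
    arguments; verify against your llama-stack-client version.
    """
    kwargs = glen_toolgroup_kwargs(endpoint_url)
    client.toolgroups.register(**kwargs)
    return kwargs["toolgroup_id"]
```

With a client such as `LlamaStackClient(base_url=...)` from `llama_stack_client`, you would call `register_glen(client, "https://glen.example/mcp")` once, then list the returned toolgroup id among your agent's tools. Authentication over OAuth or an API key is left out of the sketch.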