What is working memory?

Working memory is the information an AI agent actively holds for the task in front of it: the current goal, recent steps, and tool results. It is kept live in the context window and discarded once the task is done.

Working memory is the agent's scratchpad: the small, task-relevant set of information it keeps in mind while solving the problem at hand. The term is borrowed from cognitive science, where working memory is the limited store used to hold and manipulate information during a task, as opposed to the vast long-term store that facts are drawn from. For an AI agent, working memory lives in the context window and consists of the current objective, the last few tool calls and their results, a partial plan, and intermediate reasoning. It is inherently transient: when the task or session ends, it is gone. Working memory overlaps with short-term memory but emphasizes active use rather than mere recency; it is what the agent is operating on, not just what happened lately. Because the context window is finite, managing working memory is a core part of context engineering: keep in view what the current step needs, and summarize or offload what it does not. The durable counterpart is long-term memory, where facts worth keeping are written to a persistent store and retrieved back into working memory only when a later task makes them relevant.
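
The keep, summarize, and offload pattern described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the names `WorkingMemory`, `summarize`, `remember`, and `recall` are hypothetical, a step count stands in for a real token budget, truncation stands in for a model-written summary, and a plain dictionary stands in for a persistent long-term store.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def summarize(summary: str, step: str, width: int = 40) -> str:
    """Fold one evicted step into the running summary.

    Assumption: a real agent would likely ask a model to compress the step;
    truncating to `width` characters is only a stand-in.
    """
    short = step[:width]
    return f"{summary}; {short}" if summary else short


@dataclass
class WorkingMemory:
    """Hypothetical scratchpad for a single task (illustrative names only)."""
    goal: str
    max_steps: int = 3  # assumed budget; stands in for a context-window limit
    steps: List[str] = field(default_factory=list)
    summary: str = ""   # compressed record of steps no longer kept verbatim

    def add_step(self, step: str) -> None:
        """Record a step, summarizing the oldest ones once over budget."""
        self.steps.append(step)
        while len(self.steps) > self.max_steps:
            oldest = self.steps.pop(0)
            self.summary = summarize(self.summary, oldest)

    def render(self) -> str:
        """Build the text that would be placed in the context window."""
        parts = [f"Goal: {self.goal}"]
        if self.summary:
            parts.append(f"Earlier: {self.summary}")
        parts.extend(f"Step: {s}" for s in self.steps)
        return "\n".join(parts)


def remember(store: Dict[str, str], key: str, fact: str) -> None:
    """Write a fact worth keeping to the (assumed) long-term store."""
    store[key] = fact


def recall(store: Dict[str, str], key: str) -> Optional[str]:
    """Fetch a fact back for a later task, or None if it was never stored."""
    return store.get(key)


if __name__ == "__main__":
    long_term: Dict[str, str] = {}
    wm = WorkingMemory(goal="Find the cheapest flight", max_steps=2)
    for step in ["searched airline A", "searched airline B", "compared prices"]:
        wm.add_step(step)
    print(wm.render())
    # Before the task ends, persist the one fact a later task may need.
    remember(long_term, "preferred_airline", "airline B")
    del wm  # working memory is transient; only the long-term store survives
    print(recall(long_term, "preferred_airline"))
```

The design choice worth noting is that eviction compresses rather than deletes: the oldest steps leave the verbatim list but remain represented in the summary, so the rendered context stays within budget without losing the thread of the task.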