What is prompt chaining?

Prompt chaining decomposes a task into a fixed sequence of LLM calls, where each step's output feeds the next, trading a single complex prompt for several focused ones that are easier to control and debug.

Prompt chaining is a workflow pattern in which a task is split into an ordered series of model calls instead of asking one large prompt to do everything. A typical chain might first extract structured fields from a document, then validate them, then draft a summary from the validated fields, then translate it. Because each step has one job, prompts stay short and specific, outputs are easier to constrain with schemas, and you can insert programmatic checks or gates between steps. For multi-stage work this tends to make chains more reliable than a monolithic prompt, at the cost of higher latency and more tokens.

Prompt chaining differs from a fully agentic loop in that the control flow is decided in advance by the developer, not chosen at runtime by the model; patterns like ReAct, by contrast, let the model itself decide the next action. Chaining is often the right starting point because its control flow is fixed and every intermediate output is observable, even though the individual model outputs can still vary from run to run. You reach for agent loops only when the path through the task genuinely cannot be predicted ahead of time. Many production systems blend the two, using a chain as scaffolding with one agentic step embedded inside.
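
The extract, validate, summarize, translate chain described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `LLM` callable type, the step functions, the field names `title` and `amount`, and the canned `fake_llm` are all hypothetical stand-ins, and validation is assumed here to be a plain programmatic gate instead of a model call. In practice `fake_llm` would be replaced by a call to whichever model client you use.

```python
import json
from typing import Callable

# Hypothetical interface: any function that maps a prompt string to model text.
LLM = Callable[[str], str]


def extract_fields(llm: LLM, document: str) -> dict:
    """Step 1: ask the model for structured fields and parse them as JSON."""
    raw = llm(
        "Extract the fields 'title' and 'amount' from this document "
        "and reply with JSON only.\n\n" + document
    )
    return json.loads(raw)


def validate_fields(fields: dict) -> dict:
    """Step 2: programmatic gate between model calls (no LLM involved)."""
    missing = [key for key in ("title", "amount") if key not in fields]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if not isinstance(fields["amount"], (int, float)) or fields["amount"] < 0:
        raise ValueError("amount must be a non-negative number")
    return fields


def draft_summary(llm: LLM, fields: dict) -> str:
    """Step 3: summarize using only the validated fields."""
    prompt = "Write a one-sentence summary of these fields:\n"
    return llm(prompt + json.dumps(fields, sort_keys=True)).strip()


def translate(llm: LLM, text: str, language: str) -> str:
    """Step 4: translate the summary."""
    return llm(f"Translate into {language}:\n{text}").strip()


def run_chain(llm: LLM, document: str, language: str = "French") -> str:
    """The developer fixes the order of steps; the model never picks the next one."""
    fields = validate_fields(extract_fields(llm, document))
    summary = draft_summary(llm, fields)
    return translate(llm, summary, language)


def fake_llm(prompt: str) -> str:
    """Canned replies so the sketch runs offline; not a real model API."""
    if prompt.startswith("Extract"):
        return '{"title": "Invoice 17", "amount": 120.5}'
    if prompt.startswith("Write"):
        return "Invoice 17 is for 120.5."
    if prompt.startswith("Translate"):
        return "La facture 17 est de 120,5."
    raise ValueError("unexpected prompt")


if __name__ == "__main__":
    print(run_chain(fake_llm, "Invoice 17, total due: 120.5"))
```

Because each step is an ordinary function, intermediate values such as the parsed fields can be logged or tested in isolation, and a failed gate stops the chain before any further tokens are spent.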