What is few-shot prompting?

Few-shot prompting steers an LLM by including a handful of input-output examples in the prompt, letting the model infer the desired format and behavior at inference time without any fine-tuning.

Few-shot prompting (a form of in-context learning, and often used as a synonym for it) gives the model a small number of demonstrations of the task it should perform, typically two to a dozen, then asks it to continue the pattern on a new input. Because large models generalize from examples in their context window, a few well-chosen demonstrations can lock in a tricky output format, a tone, a labeling scheme, or an edge-case rule, often more reliably than prose instructions alone. It sits on a spectrum: zero-shot uses instructions only, one-shot uses a single example, and few-shot uses several. Example selection matters: demonstrations should be diverse, correct, and representative of hard cases, and their ordering can subtly bias results. The cost is tokens: every example is sent on every call, so heavy few-shot prompts are good candidates for prompt caching, which amortizes the fixed prefix. When examples grow numerous or the task is stable and high-volume, fine-tuning can be cheaper at runtime, but few-shot remains the fastest way to iterate because it requires no training run. It also composes with other techniques: few-shot chain-of-thought examples teach the model both the format and the reasoning style to use.
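
As a rough illustration of the paragraph above, here is a minimal Python sketch that assembles a few-shot prompt as plain text. The function name `build_few_shot_prompt`, the `Input:`/`Output:` labels, and the sentiment examples are all hypothetical choices made for this sketch, not a required format, and no particular model API is assumed: the resulting string would be sent to whatever LLM client you use.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    query: str,
) -> str:
    """Join an instruction, demonstrations, and a new input into one prompt.

    The "Input:"/"Output:" labels are an illustrative convention only.
    """
    parts = [instruction.strip(), ""]
    # Each demonstration is an (input, output) pair shown in the same layout
    # the model is expected to continue.
    for text, label in examples:
        parts.append(f"Input: {text}")
        parts.append(f"Output: {label}")
        parts.append("")
    # The new input ends with an open "Output:" so the model completes it.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)


# Hypothetical demonstrations covering each label once.
EXAMPLES = [
    ("The battery died after two days.", "negative"),
    ("Setup took five minutes and it just works.", "positive"),
    ("It arrived on Tuesday.", "neutral"),
]

INSTRUCTION = "Classify the sentiment of each input as positive, negative, or neutral."
QUERY = "The screen is gorgeous but the speakers are tinny."

few_shot = build_few_shot_prompt(INSTRUCTION, EXAMPLES, QUERY)
one_shot = build_few_shot_prompt(INSTRUCTION, EXAMPLES[:1], QUERY)
zero_shot = build_few_shot_prompt(INSTRUCTION, [], QUERY)

print(few_shot)
```

Passing an empty list, one example, or several examples gives the zero-shot, one-shot, and few-shot variants of the same prompt. Keeping the instruction and demonstrations at the front means that part of the prompt is identical across calls, which is the fixed prefix that prompt caching can reuse.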