Induction Heads
How it works
An induction head relies on two attention heads acting in sequence. The first, a previous-token head, copies information about the preceding token into each position's representation. The second, the induction head proper, uses that information to attend to earlier positions whose preceding token matches the current token, and copies the token it finds there, predicting that the sequence will repeat ("match and copy"). The mechanism emerges abruptly during training, in a phase change that coincides with the model's acquisition of in-context learning.
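The behavior this circuit implements can be illustrated outside of a transformer as a simple match-and-copy procedure. The sketch below is plain Python and purely illustrative, not the attention-level circuit itself; the choice to copy the most recent match is an assumption made for simplicity.

```python
# Algorithmic sketch of the induction-head pattern: for each position, find
# earlier occurrences of the current token and predict the token that
# followed them ("match and copy"). Not a transformer implementation.

from collections import defaultdict


def induction_predict(tokens: list[str]) -> list[str | None]:
    """For each position, predict the next token by prefix matching:
    look up earlier positions preceded by the same token as the current one
    and copy the token that appeared there."""
    predictions: list[str | None] = []
    # What the "previous-token head" exposes: for each token, the tokens
    # that have followed it so far in the sequence.
    successors: dict[str, list[str]] = defaultdict(list)

    for i, tok in enumerate(tokens):
        # Induction step: if the current token has occurred before, copy the
        # most recent token that followed it; otherwise make no prediction.
        predictions.append(successors[tok][-1] if successors[tok] else None)
        # Previous-token step: record that tok was followed by tokens[i + 1].
        if i + 1 < len(tokens):
            successors[tok].append(tokens[i + 1])

    return predictions


if __name__ == "__main__":
    seq = "the cat sat on the mat and the cat".split()
    # After the second "the" the prediction is "cat"; after the second "cat"
    # it is "sat", copying what followed the first occurrences.
    for tok, pred in zip(seq, induction_predict(seq)):
        print(f"{tok:>4} -> predicted next: {pred}")
```

A real induction head attends softly over all matching positions rather than picking one, but the copy-what-followed-last-time behavior is the same.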
Problem solved
Lack of a mechanistic explanation for Transformer models' ability to learn in context, a key emergent capability first highlighted by GPT-3.
Evolution
Elhage et al. (2021) lay the groundwork for mechanistic interpretability, identifying attention-head composition and induction heads in small attention-only models.
Olsson et al. (2022) identify induction heads as a primary mechanistic source of in-context learning.
Follow-up work extends these mechanistic interpretability findings to larger language models.