Induction Heads

2022 · Research · Updated: 6 May 2026 · Published
Key innovation
Identified a specific type of attention head as the mechanistic substrate of in-context learning in Transformer models, linking mechanistic interpretability to emergent behavior.
Category
Architecture
Abstraction level
Building block
Operation level
Layer · Inference
Use cases
Mechanistic interpretability of Transformers · In-context learning research · Analysis of emergent model capabilities · Neural network debugging and understanding

How it works

An induction head circuit consists of two attention heads composing across layers. A previous-token head in an earlier layer copies information about the preceding token into each position's representation. The induction head in a later layer then uses that information for prefix matching: from the current token it attends to the tokens that followed earlier occurrences of the same token, and copies the attended token forward as the prediction, completing the pattern [A][B] ... [A] -> [B]. The circuit emerges abruptly during training, in a phase change that coincides with the model acquiring in-context learning.
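
Below is a minimal, hand-wired sketch of this two-head circuit in NumPy, using one-hot embeddings and a toy four-token sequence. It is an illustrative simplification, not the paper's implementation: real induction heads arise from learned Q/K/V weights, whereas here the previous-token attention and the match-then-copy step are hard-coded.

import numpy as np

# Toy sequence over a small vocabulary; the circuit should complete the
# repeated pattern [A][B] ... [A] -> [B].
vocab = ["A", "B", "C"]
seq = ["A", "B", "C", "A"]
T, V = len(seq), len(vocab)
x = np.eye(V)[[vocab.index(t) for t in seq]]  # (T, V) one-hot embeddings

# Head 1 (previous-token head): position t attends to position t-1 and
# writes that token's identity into its representation.
prev_attn = np.zeros((T, T))
for t in range(1, T):
    prev_attn[t, t - 1] = 1.0
prev_token_info = prev_attn @ x  # row t = one-hot of the token at t-1

# Head 2 (induction head): the query is the current token; the key at each
# position is the previous-token information written by Head 1, so attention
# lands on positions whose *preceding* token matches the current one, i.e.
# on whatever followed an earlier occurrence of the current token.
t = T - 1                            # final position, token "A"
scores = prev_token_info @ x[t]      # match each key against the query
scores[t:] = -np.inf                 # causal mask: attend strictly to the past
attn = np.exp(scores - scores.max())
attn /= attn.sum()

# Copy step (OV circuit): the attended token's identity becomes the logits.
logits = attn @ x
print(vocab[int(np.argmax(logits))])  # -> "B"

Running the sketch prints "B": the final "A" attends to the position just after the earlier "A" and copies what it finds there, which is exactly the [A][B] ... [A] -> [B] completion rule described above.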

Problem solved

The absence of a mechanistic explanation for in-context learning in Transformer models, a key emergent capability first demonstrated at scale by GPT-3.

Evolution

Original paper · 2022 · arXiv · Catherine Olsson
In-context Learning and Induction Heads
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova DasSarma, Tom Henighan, Ben Mann, Dario Amodei, Chris Olah
2021
A Mathematical Framework for Transformer Circuits (Elhage et al.)
Inflection point

Elhage et al. lay the groundwork for mechanistic interpretability of Transformers, formalizing how attention heads compose across layers and first describing induction heads in two-layer attention-only models.

2022
Discovery of Induction Heads (Olsson et al.)
Inflection point

Olsson et al. identify induction heads as the mechanistic source of in-context learning.

2023
Extension to larger models (Nanda et al., Anthropic)

Follow-up work extends the analysis of induction heads and related circuits to larger language models.