Induction Heads
How it works
An induction head relies on two attention heads acting in sequence. The first, a previous-token head, copies information about the preceding token into each position's representation. The second, the induction head proper, uses that information to attend to earlier positions whose preceding token matches the current token, and copies the token it finds there, predicting that the sequence will repeat ("match and copy"). The mechanism emerges abruptly during training, in a phase change that coincides with the model's acquisition of in-context learning.
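The behavior this circuit implements can be illustrated outside of a transformer as a simple match-and-copy procedure. The sketch below is plain Python and purely illustrative, not the attention-level circuit itself; the choice to copy the most recent match is an assumption made for simplicity.

```python
# Algorithmic sketch of the induction-head pattern: for each position, find
# earlier occurrences of the current token and predict the token that
# followed them ("match and copy"). Not a transformer implementation.

from collections import defaultdict


def induction_predict(tokens: list[str]) -> list[str | None]:
    """For each position, predict the next token by prefix matching:
    look up earlier positions preceded by the same token as the current one
    and copy the token that appeared there."""
    predictions: list[str | None] = []
    # What the "previous-token head" exposes: for each token, the tokens
    # that have followed it so far in the sequence.
    successors: dict[str, list[str]] = defaultdict(list)

    for i, tok in enumerate(tokens):
        # Induction step: if the current token has occurred before, copy the
        # most recent token that followed it; otherwise make no prediction.
        predictions.append(successors[tok][-1] if successors[tok] else None)
        # Previous-token step: record that tok was followed by tokens[i + 1].
        if i + 1 < len(tokens):
            successors[tok].append(tokens[i + 1])

    return predictions


if __name__ == "__main__":
    seq = "the cat sat on the mat and the cat".split()
    # After the second "the" the prediction is "cat"; after the second "cat"
    # it is "sat", copying what followed the first occurrences.
    for tok, pred in zip(seq, induction_predict(seq)):
        print(f"{tok:>4} -> predicted next: {pred}")
```

A real induction head attends softly over all matching positions rather than picking one, but the copy-what-followed-last-time behavior is the same.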
Problem solved
Lack of a mechanistic explanation for Transformer models' ability to learn in context, a key emergent capability first highlighted by GPT-3.
Evolution
Elhage et al. (2021) lay the groundwork for mechanistic interpretability, identifying attention-head composition and induction heads in small attention-only models.
Olsson et al. (2022) identify induction heads as a primary mechanistic source of in-context learning.
Follow-up work extends these mechanistic interpretability findings to larger language models.