Abstraction and Reasoning Corpus for AGI
The only benchmark measuring "fluid intelligence" in AI: the ability to abstract and reason on entirely novel tasks based solely on core knowledge priors (shared by humans), without the ability to "buy" scores through massive training data.
Each task consists of 2-5 demonstration pairs (colored pixel grids: input → output) and one or more test cases. The system must discover the rule governing the transformation and apply it. Answers are digital grids (up to 30x30 pixels, 10 colors). Scoring: binary success/fail per task; score is % of solved tasks.
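The task format and scoring rule described above can be sketched in a few lines of Python. The "train"/"test" layout with "input"/"output" grids follows the JSON layout used in the public ARC repository. The helper names (`is_valid_grid`, `task_solved`, `benchmark_score`, `flip_horizontal`) and the toy task are illustrative assumptions, not part of any official harness. The sketch also assumes a single attempt per test input and requires every test output to match exactly; the official evaluation rules may differ on attempts.

```python
from typing import Dict, List

Grid = List[List[int]]  # each cell is a color index in 0..9


def is_valid_grid(grid: Grid) -> bool:
    """Check the stated constraints: rectangular, at most 30x30, 10 colors."""
    if not grid or not grid[0]:
        return False
    height, width = len(grid), len(grid[0])
    if height > 30 or width > 30:
        return False
    return all(
        len(row) == width and all(0 <= cell <= 9 for cell in row)
        for row in grid
    )


def task_solved(predictions: List[Grid], task: Dict) -> bool:
    """Binary scoring: the task counts only if every test grid matches exactly."""
    expected = [pair["output"] for pair in task["test"]]
    return len(predictions) == len(expected) and all(
        predicted == target for predicted, target in zip(predictions, expected)
    )


def benchmark_score(results: List[bool]) -> float:
    """Score is the percentage of solved tasks."""
    return 100.0 * sum(results) / len(results) if results else 0.0


def flip_horizontal(grid: Grid) -> Grid:
    """Hypothetical solver for the toy task below: mirror each row."""
    return [list(reversed(row)) for row in grid]


# Toy task (invented for illustration): the hidden rule is a horizontal flip.
toy_task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 3, 0]], "output": [[0, 3, 3]]},
    ],
    "test": [{"input": [[0, 5], [6, 0]], "output": [[5, 0], [0, 6]]}],
}

predictions = [flip_horizontal(pair["input"]) for pair in toy_task["test"]]
solved = task_solved(predictions, toy_task)
```

Exact-match scoring gives no partial credit: a prediction that differs from the target in a single cell counts as a failure, so any approximate solver scores the same as one that produces nothing.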
Problem addressed: no benchmark was resistant to "buying scores" through massive training data. Existing benchmarks measured stored knowledge (crystallized intelligence) rather than general reasoning ability (fluid intelligence), which prevented assessment of progress toward AGI.
Common pitfalls
Gap between training set and private test set (HIGH)
Good scores on the public test set do not guarantee good performance on the private test set (ARC Prize evaluation).
Evaluate exclusively on the private set through the official ARC Prize competition.
Overfitting to known tasks (CRITICAL)
Systems trained on known ARC tasks may overfit to their specific patterns without demonstrating genuine reasoning.
Use new tasks (ARC-AGI-2/3) and evaluate on the private test set.
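A partial safeguard against the pitfall above is to verify that no evaluation task also appears verbatim in the training data. The following is a minimal sketch, assuming tasks are held as dictionaries keyed by a task ID; `task_fingerprint` and `find_leaked_tasks` are hypothetical helper names. It detects only exact duplicates, not near-copies or overfitting to recurring patterns, so it does not replace evaluation on the private test set.

```python
import hashlib
import json
from typing import Dict, List


def task_fingerprint(task: Dict) -> str:
    """Hash a canonical JSON encoding of the task so equal tasks collide."""
    canonical = json.dumps(task, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_leaked_tasks(
    training_tasks: Dict[str, Dict], evaluation_tasks: Dict[str, Dict]
) -> List[str]:
    """Return IDs of evaluation tasks that are byte-identical to a training task."""
    seen = {task_fingerprint(task) for task in training_tasks.values()}
    return sorted(
        task_id
        for task_id, task in evaluation_tasks.items()
        if task_fingerprint(task) in seen
    )


# Invented example data: "eval_a" duplicates the training task, "eval_b" does not.
shared = {
    "train": [{"input": [[1]], "output": [[2]]}],
    "test": [{"input": [[3]], "output": [[4]]}],
}
fresh = {
    "train": [{"input": [[0, 1]], "output": [[1, 0]]}],
    "test": [{"input": [[2, 3]], "output": [[3, 2]]}],
}
leaked = find_leaked_tasks({"train_1": shared}, {"eval_a": dict(shared), "eval_b": fresh})
```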
Reference implementations
ARC and "On the Measure of Intelligence" paper published
Breakthrough: François Chollet defines intelligence as skill-acquisition efficiency and introduces the ARC benchmark.
ARC Prize 2024: first systems exceed 55% on private test set
Breakthrough: Public Kaggle competition with $1M prize pool attracts hundreds of teams; LLM + program synthesis hybrids exceed 55%.
ARC-AGI-2 and ARC-AGI-3: new, harder versions
ARC Prize Foundation releases new benchmark versions with harder tasks as models begin saturating ARC-AGI-1.
Pixel-grid benchmark; evaluation is hardware-agnostic, although solver programs may use GPUs.
| Title | Publisher | Type |
|---|---|---|
| On the Measure of Intelligence | arXiv | scientific article |
| ARC Prize (official website) | ARC Prize Foundation | official website |
| ARC-AGI GitHub Repository | GitHub | repository |