Open-weight reasoning model from DeepSeek (January 2025). A 671B-parameter Mixture-of-Experts (MoE) model with 37B parameters active per token, trained primarily via reinforcement learning with verifiable rewards (GRPO) on top of DeepSeek-V3-Base.
Context window: 128K tokens
Parameters: 671B (37B active)
Max output: 32,768 tokens
Release date: 20 January 2025
Access: API, Download, Hosted
Deployment: Cloud, Local
Access & deployment
Access: API, Download, Hosted
Deployment: Cloud, Local
Weights: Open weights
Key parameters
Context: 128K
Parameters: 671B (37B active)
Tools · Fine-tuning
Input: text
Technical specification
Context window: 128K tokens
Parameters: 671B (37B active)
Max output tokens: 32,768 per response
Knowledge cutoff: 1 Jul 2024
License: MIT
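The context window and the output limit interact when sizing a prompt. The sketch below is a minimal illustration, assuming that prompt and response tokens share the same context window and that "128K" means 128,000 tokens (it could equally mean 131,072). The function names are hypothetical and not part of any DeepSeek API.

```python
# Token-budget sketch. Assumptions: prompt and output share one context
# window, and "128K" is read as 128,000 tokens.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT = 32_768  # max output tokens per response, from the specification


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT,
                      context_window: int = CONTEXT_WINDOW) -> int:
    """Largest prompt that still leaves room for `reserved_output` tokens."""
    if not 0 < reserved_output <= MAX_OUTPUT:
        raise ValueError("reserved_output must be between 1 and MAX_OUTPUT")
    return context_window - reserved_output


def fits(prompt_tokens: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """True if the prompt fits alongside the reserved output budget."""
    return 0 <= prompt_tokens <= max_prompt_tokens(reserved_output)


if __name__ == "__main__":
    print(max_prompt_tokens())   # budget when reserving the full output limit
    print(fits(90_000))
    print(fits(100_000))
```

Under these assumptions, reserving the full 32,768-token response leaves 95,232 tokens for the prompt; reserving less output frees more room for input.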
Hardware requirements
The full model requires a multi-GPU cluster (typically 8×H100 80 GB or larger). Distilled variants (1.5B–70B) run on a single consumer or data-center GPU.
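A back-of-the-envelope estimate shows why a single GPU is not enough. This is a rough sketch covering the weights only: the bytes-per-parameter values are assumed storage precisions, and KV cache, activations, and runtime overhead are ignored.

```python
# Rough weight-memory estimate. The precisions below are assumptions;
# parameter counts and the 8 x 80 GB cluster come from the page itself.
TOTAL_PARAMS = 671e9    # total parameters
ACTIVE_PARAMS = 37e9    # parameters active per token
GPU_MEMORY_GB = 80      # one H100 80 GB
NUM_GPUS = 8

BYTES_PER_PARAM = {"fp8": 1, "bf16": 2}  # assumed storage precisions


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def active_fraction(active: float, total: float) -> float:
    """Share of parameters used for each token in the MoE model."""
    return active / total


if __name__ == "__main__":
    cluster_gb = GPU_MEMORY_GB * NUM_GPUS
    for precision, width in BYTES_PER_PARAM.items():
        need = weight_memory_gb(TOTAL_PARAMS, width)
        print(f"{precision}: {need:.0f} GB of weights vs {cluster_gb} GB cluster")
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active share per token: {share:.1%}")
```

Under these assumptions, even one byte per parameter (671 GB) exceeds 8 × 80 GB = 640 GB, which is consistent with the "or larger" qualifier. The 37B active parameters reduce compute per token, but all experts typically still need to be resident in memory.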
Features: Tool use, Fine-tuning
Modalities
Input: text
Output: text, code
Capabilities and applications
Native model capabilities
Reasoning
Category: reasoning
Multi-step reasoning
Category: reasoning
Coding
Category: coding
Planning
The model's ability to determine a sequence of actions leading to a goal: predicting the consequences of actions and selecting an optimal path in a given environment.
Category: planning
Long context
The model's ability to handle long context and maintain coherence over a large amount of input data.
Category: reasoning
Language modeling
Ability to predict subsequent tokens and generate coherent natural-language text based on the preceding context.
Category: language
Agentic capability
The model's ability to autonomously plan and execute multi-step tasks by sequentially using tools, maintaining context, and adapting to intermediate results.
Category: planning
Structured output
Category: structured_generation
Diagram reasoning
Category: reasoning
Function calling
Category: planning
Benchmark results
8 benchmarks
AIME 2024
pass@1 · cons@64 (majority voting)
79.8%
Source: DeepSeek-R1 paper (arXiv:2501.12948)
MATH
pass@1 · MATH-500 subset
97.3%
Source: DeepSeek-R1 paper (MATH-500)
Codeforces
percentile · 2,029 Elo equivalent
96.3 percentile
Source: DeepSeek-R1 paper
MMLU
accuracy · pass@1
90.8%
Source: DeepSeek-R1 paper
GPQA
pass@1 · GPQA Diamond
71.5%
Source: DeepSeek-R1 paper
LiveCodeBench
pass@1 · COT@8
65.9%
Source: DeepSeek-R1 paper
SWE-bench
resolved · SWE-bench Verified
49.2%
Source: DeepSeek-R1 paper
MMLU-Pro
EM · Exact Match
84.0%
Source: DeepSeek-R1 paper
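Several entries above report pass@1, and the AIME 2024 entry also mentions cons@64 (majority voting). The sketch below illustrates one common way to compute both from repeated samples per problem. It assumes answers are compared by exact string match; the function names and toy data are hypothetical, not taken from the paper's evaluation code.

```python
from collections import Counter
from typing import Sequence


def pass_at_1(samples: Sequence[Sequence[str]], answers: Sequence[str]) -> float:
    """Average per-sample correctness, then average over problems."""
    per_problem = [
        sum(s == gold for s in problem_samples) / len(problem_samples)
        for problem_samples, gold in zip(samples, answers)
    ]
    return sum(per_problem) / len(per_problem)


def cons_at_k(samples: Sequence[Sequence[str]], answers: Sequence[str]) -> float:
    """Fraction of problems whose most frequent sampled answer is correct."""
    hits = 0
    for problem_samples, gold in zip(samples, answers):
        majority, _count = Counter(problem_samples).most_common(1)[0]
        hits += majority == gold
    return hits / len(answers)


if __name__ == "__main__":
    # Toy data: 3 problems, 4 sampled answers each (k = 4 here, not 64).
    samples = [
        ["42", "42", "7", "42"],   # mostly right, majority right
        ["5", "3", "5", "5"],      # mostly wrong, majority wrong
        ["1", "1", "2", "3"],      # half right, majority still right
    ]
    answers = ["42", "3", "1"]
    print(pass_at_1(samples, answers))
    print(cons_at_k(samples, answers))
```

On this toy data the two metrics differ: majority voting can recover a problem where only half of the individual samples are correct, which is why cons@k is usually reported separately from pass@1.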
Technical architecture
Core Architecture
Model Form
Training Techniques
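The summary at the top of this page names GRPO as the reinforcement-learning method. As a hedged illustration of its core idea, the sketch below computes group-relative advantages: each sampled response's reward is normalized against the other responses to the same prompt. It covers only this normalization step, not the full objective (clipped policy ratio, KL penalty). The use of the population standard deviation and the `eps` term are assumptions, and the function name is hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float],
                              eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one group of responses to the same prompt.

    advantage_i = (reward_i - group mean) / (group std + eps)
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Verifiable reward: 1.0 if the final answer checks out, else 0.0.
    rewards = [1.0, 0.0, 0.0, 1.0]
    print(group_relative_advantages(rewards))
    # A group where every response gets the same reward carries no signal.
    print(group_relative_advantages([1.0, 1.0, 1.0]))
```

Because the baseline is the group mean rather than a learned value function, no separate critic model is needed; responses that beat their group get positive advantages and the rest get negative ones.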
Deployment and security
Available on platforms
