DeepSeek-R1

R1 · Family: DeepSeek
Open reasoning model from DeepSeek (January 2025). A 671B-parameter MoE with 37B active parameters per token, trained on top of DeepSeek-V3 with reinforcement learning from verifiable rewards (GRPO).
Active · Public access · Open weights · Featured · Reasoning model · LLM · DeepSeek
Context window: 128K tokens
Parameters: 671B (37B active)
Max output: 32,768 tokens
Release date: 20 January 2025
Access: API, Download, Hosted
Deployment: Cloud, Local

Overview

DeepSeek-R1 is an open reasoning model released by DeepSeek-AI on January 20, 2025. Architecturally it is a Mixture-of-Experts model with 671 billion total parameters, of which 37 billion are active per token, built on top of DeepSeek-V3-Base. R1 was produced via a reasoning RL pipeline with verifiable rewards based on GRPO (Group Relative Policy Optimization), where rewards come from rules (math correctness, code execution, format) rather than from a learned reward model. The context window is 128,000 tokens. The weights are released under the MIT license, and the model is publicly available on Hugging Face.
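The core idea behind GRPO's rule-based rewards can be illustrated with a small sketch. It is not DeepSeek's training code: the `rule_reward` function, the `\boxed{...}` answer convention, and the sample completions are hypothetical, and the use of the population standard deviation is an assumption. The sketch only shows how each sampled answer is scored by a rule and then normalized against the other answers in its own group, with no learned reward model involved.

```python
import re
import statistics


def rule_reward(completion: str, reference_answer: str) -> float:
    """Hypothetical rule-based reward: 1.0 if the boxed answer matches, else 0.0."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    return 1.0 if match and match.group(1).strip() == reference_answer else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward by the mean and standard deviation of its group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # assumption: population standard deviation
    return [(r - mean) / (std + eps) for r in rewards]


# One prompt, a group of four sampled completions (made-up examples).
completions = [
    "... so the result is \\boxed{42}",
    "... so the result is \\boxed{41}",
    "... therefore \\boxed{42}",
    "I am not sure.",
]
rewards = [rule_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.2f}")
```

Completions that beat the group average get a positive advantage and the others a negative one, so the policy update needs no separate value network to serve as a baseline.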

Released alongside R1 was R1-Zero, a variant trained with pure RL and no SFT cold start, which showed that long chain-of-thought and self-correction can emerge directly from RL. The production R1 added an SFT cold start on thousands of CoT examples for more readable outputs. A series of distilled variants based on Qwen 2.5 (1.5B, 7B, 14B, 32B), Llama 3.1 (8B) and Llama 3.3 (70B) was also published, porting much of R1's capability to single-GPU models.
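When working with R1-style outputs it is often useful to separate the chain-of-thought from the final answer. The sketch below assumes the reasoning is wrapped in `<think>...</think>` tags ahead of the answer, a convention not stated on this page; the `split_reasoning` helper and the sample string are hypothetical.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think>, followed by the answer.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>\s*(.*)", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no think block is found."""
    match = THINK_PATTERN.search(text)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), match.group(2).strip()


sample = "<think>\n2 + 2 = 4, double-check: yes.\n</think>\nThe answer is 4."
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

A similar check could serve as a format rule during RL, though the exact rule used in training is not described here.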

Results: AIME 2024 79.8% pass@1, MATH-500 97.3%, Codeforces 96.3rd percentile, MMLU 90.8%, GPQA Diamond 71.5%, LiveCodeBench 65.9%, SWE-bench Verified 49.2%. These scores are on par with or above OpenAI o1 at a fraction of the inference cost. The model is available via the DeepSeek API, Hugging Face, Together AI, Fireworks, OpenRouter, Amazon Bedrock Marketplace and Vertex AI Model Garden. Together with the publication of the GRPO algorithm, DeepSeek-R1 established the de facto standard for open reasoning RL and triggered a wave of reproductions (TinyZero, Open-R1, SimpleRL).

Classification
Reasoning model, LLM
Family: DeepSeek
Access & deployment
API, Download, Hosted
Cloud, Local
Weights: Open weights
Key parameters
Context: 128K
Parameters: 671B (37B active)
Tools · Fine-tuning
Input: text

Technical specification

Context window: 128K tokens
Parameters: 671B (37B active)
Max output tokens: 32,768 tokens per response
Knowledge cutoff: 1 Jul 2024
License: MIT
Hardware requirements: The full model requires a multi-GPU cluster (typically 8×H100 80 GB or larger). Distilled variants (1.5B–70B) run on a single consumer or data-center GPU.
Features: Tool use, Fine-tuning
Modalities: Input: text; Output: text, code
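The hardware requirements above follow from simple arithmetic on the parameter counts. The following is a back-of-envelope sketch, not a sizing guide: it counts weights only (no KV cache, activations, or framework overhead), and the listed precisions are illustrative assumptions rather than a statement of how the model is shipped or served.

```python
TOTAL_PARAMS = 671e9   # total parameters, from the specification above
ACTIVE_PARAMS = 37e9   # parameters active per token


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Illustrative precisions only (assumption): 2, 1 and 0.5 bytes per parameter.
for name, nbytes in [("BF16", 2.0), ("FP8", 1.0), ("4-bit", 0.5)]:
    print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, nbytes):,.0f} GB of weights")

print(f"Active per token: {active_fraction:.1%} of all parameters")
```

Because a Mixture-of-Experts model can route any token to any expert, all 671B parameters generally have to be resident in memory, even though only a small fraction is used for each token. That is why the full model needs a multi-GPU cluster while compute per token stays closer to that of a 37B dense model.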

Capabilities and applications

Native model capabilities
Reasoning
Category: reasoning
Multi-step reasoning
Category: reasoning
Coding
Category: coding
Planning
The model's ability to determine a sequence of actions leading to a goal: predicting the consequences of actions and selecting an optimal path in a given environment.
Category: planning
Long context
The model's ability to handle long context and maintain coherence over a large amount of input data.
Category: reasoning
Language modeling
Ability to predict subsequent tokens and generate coherent natural-language text based on the preceding context.
Category: language
Agentic capability
The model's ability to autonomously plan and execute multi-step tasks by sequentially using tools, maintaining context, and adapting to intermediate results.
Category: planning
Structured output
Category: structured_generation
Diagram reasoning
Category: reasoning
Function calling
Category: planning

Benchmark results

8 benchmarks
Benchmark | Metric | Score | Source
AIME 2024 | pass@1 · cons@64 (majority voting) | 79.8% | DeepSeek-R1 paper (arXiv:2501.12948)
MATH | pass@1 · MATH-500 subset | 97.3% | DeepSeek-R1 paper (MATH-500)
Codeforces | percentile · 2,029 Elo equivalent | 96.3 percentile | DeepSeek-R1 paper
MMLU | accuracy · pass@1 | 90.8% | DeepSeek-R1 paper
GPQA | pass@1 · GPQA Diamond | 71.5% | DeepSeek-R1 paper
LiveCodeBench | pass@1 · COT@8 | 65.9% | DeepSeek-R1 paper
SWE-bench | resolved · SWE-bench Verified | 49.2% | DeepSeek-R1 paper
MMLU-Pro | EM · Exact Match | 84.0% | DeepSeek-R1 paper
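Several of the metrics above are sampling-based, so it may help to see how pass@1 and cons@k (majority voting) are commonly computed for a single problem. This is a minimal sketch under stated assumptions: the sampled answers are made up, answers are compared by exact string match, and the benchmark score would be the average of these per-problem values over the whole test set.

```python
from collections import Counter


def pass_at_1(samples: list[str], reference: str) -> float:
    """Estimate pass@1 as the fraction of sampled answers that are correct."""
    return sum(s == reference for s in samples) / len(samples)


def cons_at_k(samples: list[str], reference: str) -> bool:
    """Majority voting: is the most frequent sampled answer the correct one?"""
    majority_answer, _ = Counter(samples).most_common(1)[0]
    return majority_answer == reference


# Hypothetical final answers from 8 samples on one problem.
samples = ["42", "42", "41", "42", "7", "42", "41", "42"]

print("pass@1 estimate:", pass_at_1(samples, "42"))
print("cons@8 correct:", cons_at_k(samples, "42"))
```

The two metrics can differ noticeably: a model that is right in just over half of its samples has a modest pass@1 but still gets the problem right under majority voting.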

Deployment and security