GPT-5.5

gpt-5.5 · Family: GPT
GPT-5.5 is OpenAI's newest frontier model. It has a 1M-token context window and focuses on agentic coding, computer use, knowledge work, and scientific research.
✓ Active · ✓ Public access · LLM · Multimodal · Reasoning model · Tool-using model · 📁 GPT
Context window: 1M tokens
Max output: 128,000 tokens
Release date: 23 April 2026
Access: API, Hosted · Deployment: ☁ Cloud

Overview

Classification
LLM · Multimodal · Reasoning model · Tool-using model
Family: GPT
Access & deployment
Access: API, Hosted
Deployment: Cloud
Weights: Closed
Key parameters
📏 Context: 1M
Tools
📥 Input: text, image

Technical specification

Context window: 1M tokens
Max output tokens: 128,000 tokens per response
Knowledge cutoff (knowledge boundary): 1 Dec 2025
Features: Tool use
Modalities
⬇ Input: text, image
⬆ Output: text, code, structured_data
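
The limits above (1M-token context window, 128,000 output tokens per response) can be turned into a quick budget check. The sketch below is illustrative only: the function name `output_budget` and the `shared_window` flag are hypothetical, and whether the response counts against the same 1M-token window as the prompt is an assumption this page does not confirm.

```python
CONTEXT_WINDOW = 1_000_000  # tokens, as listed in the specification above
MAX_OUTPUT = 128_000        # tokens per response, as listed above


def output_budget(prompt_tokens: int, shared_window: bool = True) -> int:
    """Return how many output tokens are left for a prompt of the given size.

    shared_window=True is an ASSUMPTION: prompt and response are counted
    against the same context window. Set it to False to model a separate
    output allowance instead.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if prompt_tokens > CONTEXT_WINDOW:
        return 0  # the prompt alone does not fit
    if not shared_window:
        return MAX_OUTPUT
    # Output is capped both by the per-response limit and by the room left.
    return min(MAX_OUTPUT, CONTEXT_WINDOW - prompt_tokens)


if __name__ == "__main__":
    for size in (100_000, 900_000, 1_000_001):
        print(size, output_budget(size))
```

Under the shared-window assumption, the per-response cap is the binding limit until the prompt exceeds 872,000 tokens; beyond that, the remaining window is.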

Capabilities and applications

Native model capabilities
Reasoning
The model's ability to reason logically and solve complex problems.
Category: reasoning
Multi-step reasoning
Carrying out multi-step chains of reasoning across long, complex tasks.
Category: reasoning
Long context
Maintaining coherence and focus across very long input context.
Category: language
Coding
Generating, analysing and modifying source code.
Category: coding
Function calling
Category: planning
Structured output
Producing data in structured formats such as JSON.
Category: structured_generation
Audio understanding
Category: audio
Image understanding
Analysing and interpreting the content of images.
Category: vision
Video understanding
Category: video
Chart understanding
Reading and interpreting charts, tables and diagrams.
Category: vision
Diagram reasoning
Category: reasoning
OCR
Recognising text within images and documents.
Category: vision
Multilingual
Understanding and generating text in many languages.
Category: language
Planning
Forming and executing action plans for complex tasks.
Category: planning
Streaming output
Category: reasoning
Interleaved multimodal input
Category: reasoning
Multimodal understanding
Category: multimodal
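
Function calling and structured output, both listed above, usually meet on the application side: the model emits a JSON object naming a tool and its arguments, and the caller validates and dispatches it. The sketch below shows only that caller-side step, using the standard library. The tool name `lookup_benchmark`, the schema shape, and the `{"name": ..., "arguments": ...}` call format are hypothetical illustrations, not the actual wire format of any OpenAI API.

```python
import json

# Hypothetical tool description in a JSON-Schema-like shape (assumption).
TOOL = {
    "name": "lookup_benchmark",
    "parameters": {
        "type": "object",
        "properties": {"benchmark": {"type": "string"}},
        "required": ["benchmark"],
    },
}

# Two scores copied from the benchmark results on this page.
SCORES = {"OSWorld": 78.7, "BrowseComp": 84.4}


def lookup_benchmark(benchmark: str) -> dict:
    """Local function the tool call is routed to."""
    return {"benchmark": benchmark, "score": SCORES.get(benchmark)}


HANDLERS = {"lookup_benchmark": lookup_benchmark}


def dispatch(call_json: str) -> str:
    """Parse a model-produced tool call, validate it, and run the handler."""
    call = json.loads(call_json)
    name, args = call["name"], call["arguments"]
    if name not in HANDLERS:
        raise KeyError(f"unknown tool: {name}")
    missing = [k for k in TOOL["parameters"]["required"] if k not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    if not isinstance(args["benchmark"], str):
        raise TypeError("benchmark must be a string")
    # The result is serialised back to JSON so it can be returned to the model.
    return json.dumps(HANDLERS[name](**args))


if __name__ == "__main__":
    print(dispatch('{"name": "lookup_benchmark", "arguments": {"benchmark": "OSWorld"}}'))
```

Validating the arguments before calling the handler keeps malformed model output from reaching application code.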

Benchmark results

20 benchmarks
SWE-Bench Pro (Public)
accuracy · Evaluation conducted with xhigh reasoning effort in a research environment.
58.6%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Results may differ slightly from production ChatGPT.
Terminal-Bench 2.0
accuracy · Tests of complex command-line workflows requiring planning, iteration, and tool coordination.
82.7%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
Expert-SWE (Internal)
accuracy · Internal evaluation of long-horizon coding tasks (estimated human completion time: 20 hours).
73.1%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Internal OpenAI benchmark; no public methodology available.
OSWorld
accuracy · Measures the model's ability to operate real computer environments on its own.
78.7%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
GDPval
wins or ties vs industry professional · Tests the model's ability to produce specialized professional work across 44 occupations.
84.9%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
BrowseComp
accuracy · Evaluation of browser tool usage capabilities.
84.4%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
GPQA
accuracy · Evaluation with xhigh reasoning effort in a research environment.
93.6%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with reasoning effort xhigh in a research environment.
Humanity's Last Exam (HLE)
accuracy · Evaluation with reasoning effort set to xhigh in a research environment, without tools.
41.4%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Variant without tools.
Humanity's Last Exam (HLE)
accuracy · Evaluation with xhigh reasoning effort in a research environment, with tools.
52.2%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Variant with tools.
EpochAI Frontier Math
accuracy · Evaluation with xhigh reasoning effort in a research environment.
51.7%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
EpochAI Frontier Math (hardest tier)
accuracy · Hardest FrontierMath tier; evaluation with reasoning effort xhigh.
35.4%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
Toolathlon
accuracy · Tool use evaluation; reasoning effort xhigh.
55.6%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with reasoning effort xhigh in a research environment.
CyberGym
accuracy · Cybersecurity benchmark; reasoning effort xhigh.
81.8%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation conducted with reasoning effort set to xhigh in a research environment.
Tau2-bench Telecom
accuracy · Tests complex customer service workflows in telecommunications; results obtained without prompt tuning.
98.0%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Tau2-bench Telecom run without prompt tuning (GPT-4.1 as user model).
MMMU
accuracy · Multimodal evaluation without tools; reasoning effort xhigh.
81.2%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation conducted with reasoning effort set to xhigh in a research environment.
MMMU
accuracy · Multimodal evaluation with tools; reasoning effort xhigh.
83.2%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
BixBench
accuracy · Bioinformatics and data analysis benchmark; reasoning effort xhigh.
80.5%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
GeneBench
accuracy · Multi-step analysis of scientific data in genetics and quantitative biology; reasoning effort xhigh.
25.0%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
ARC-AGI-1 (Verified)
accuracy · Abstract reasoning; reasoning effort xhigh.
95.0%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with reasoning effort xhigh in a research environment.
ARC-AGI-2 (Verified)
accuracy · Abstract reasoning (harder difficulty level); reasoning effort xhigh.
85.0%
📅 23 Apr 2026 · 📄 OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with reasoning effort xhigh in a research environment.
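
Two benchmarks in the list above are reported both with and without tools, so the effect of tool access can be read off directly. The short sketch below computes that difference in percentage points from the figures on this page; the dictionary layout and the function name `tool_gain` are illustrative choices, not part of any published methodology.

```python
# Scores (percent) copied from the benchmark entries above.
RESULTS = {
    "Humanity's Last Exam (HLE)": {"without_tools": 41.4, "with_tools": 52.2},
    "MMMU": {"without_tools": 81.2, "with_tools": 83.2},
}


def tool_gain(scores: dict) -> float:
    """Percentage-point gain from tool access, rounded to one decimal."""
    return round(scores["with_tools"] - scores["without_tools"], 1)


GAINS = {name: tool_gain(scores) for name, scores in RESULTS.items()}

if __name__ == "__main__":
    for name, gain in GAINS.items():
        print(f"{name}: +{gain} points with tools")
    # Humanity's Last Exam (HLE): +10.8 points with tools
    # MMMU: +2.0 points with tools
```

These are differences between single reported scores, so they indicate direction and rough size only.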

Pricing

Deployment and security

🔒 Security / Enterprise
✓ Verified enterprise information

OpenAI rates GPT-5.5's cyber and biological capabilities as High under the Preparedness Framework. The model underwent a full safety and governance process, including targeted evaluations for advanced cyber and biological capabilities and testing with external experts.

Updated: 25 Apr 2026 · ↗ Security documentation