OpenAI streaming speech-to-text model for low-latency realtime transcription, served via the Realtime transcription API.
Context window
16K tokens
Max output
2,000 tokens
Access: API
Deployment: ☁ Cloud
Overview
Access & deployment
Access: API
Deployment: Cloud
Weights: Closed
Key parameters
📏 Context: 16K tokens
📥 Input: audio, text
Technical specification
Context window
16K tokens
Max output tokens
2,000 tokens per response
Knowledge cutoff
30 Sept 2024
Modalities
⬇ Input
audio, text
⬆ Output
text
Capabilities and applications
Native model capabilities
Streaming Speech-to-Text
Converts speech to text in real time, emitting output while the speaker is still talking.
Category: speech
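A minimal sketch of how a client might assemble streamed output into a running transcript. It runs entirely locally on dictionaries shaped like streaming events, with no network calls. The event type strings and the field names (`item_id`, `delta`, `transcript`) are assumptions modeled on Realtime-style transcription events, and `TranscriptAssembler` is a hypothetical helper; check the current API reference before relying on any of them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumed event type names (illustrative, not confirmed by this page).
DELTA = "conversation.item.input_audio_transcription.delta"
COMPLETED = "conversation.item.input_audio_transcription.completed"


@dataclass
class TranscriptAssembler:
    """Hypothetical helper that folds streamed events into text."""

    # Text received so far for each in-progress utterance, keyed by item id.
    partial: Dict[str, str] = field(default_factory=dict)
    # Finalized utterances, in the order they completed.
    finished: List[str] = field(default_factory=list)

    def handle(self, event: dict) -> Optional[str]:
        """Apply one event; return the text to display, or None if ignored."""
        kind = event.get("type")
        item_id = event.get("item_id", "")

        if kind == DELTA:
            # Append the new fragment to whatever has arrived for this item.
            text = self.partial.get(item_id, "") + event.get("delta", "")
            self.partial[item_id] = text
            return text

        if kind == COMPLETED:
            # Prefer the final transcript; fall back to the accumulated deltas.
            text = event.get("transcript") or self.partial.get(item_id, "")
            self.partial.pop(item_id, None)
            self.finished.append(text)
            return text

        # Any other event type is outside this sketch's scope.
        return None
```

In use, each message read from the streaming connection would be parsed as JSON and passed to `handle`, so the display can show the partial text while the speaker is talking and replace it with the final transcript once the utterance completes.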
Technical architecture
Core Architecture
