OpenAI releases GPT-Realtime-2 reasoning voice model with two specialized variants for translation and transcription
OpenAI has released three new realtime voice models through its Realtime API: GPT-Realtime-2 with what the company calls "GPT-5-class reasoning," GPT-Realtime-Translate supporting 70+ input languages and 13 output languages, and GPT-Realtime-Whisper for streaming transcription. GPT-Realtime-2 is priced at $32 per 1M audio input tokens and $64 per 1M audio output tokens; the specialized variants cost $0.034 per minute (Translate) and $0.017 per minute (Whisper).
OpenAI has released three new realtime voice models through its Realtime API, with the flagship GPT-Realtime-2 incorporating what the company describes as "GPT-5-class reasoning" capabilities.
GPT-Realtime-2: Voice model with reasoning
GPT-Realtime-2 is designed for live voice interactions in which the model keeps the conversation flowing while it processes complex requests. According to OpenAI, the model can "reason through a request, call tools, handle corrections or interruptions, and respond in a way that fits the moment." The model is priced at $32 per 1M audio input tokens ($0.40 per 1M cached input tokens) and $64 per 1M audio output tokens.
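To make the token pricing concrete, here is a minimal Python sketch that turns the three quoted rates into a session cost estimate. The function name and its parameters are hypothetical, and it assumes cached input tokens are billed separately from (not on top of) regular input tokens; OpenAI's actual billing rules may differ. Token counts must come from your own usage data, since the article gives no tokens-per-minute figure.

```python
from decimal import Decimal

# USD per 1M audio tokens, as quoted in the article.
PRICE_INPUT = Decimal("32")
PRICE_CACHED_INPUT = Decimal("0.40")
PRICE_OUTPUT = Decimal("64")
ONE_MILLION = Decimal(1_000_000)


def estimate_realtime2_cost(
    input_tokens: int, cached_input_tokens: int, output_tokens: int
) -> Decimal:
    """Estimate the USD cost of a GPT-Realtime-2 session.

    Assumption (not confirmed by the article): `input_tokens` counts only
    uncached audio input, and cached tokens are billed at the cached rate.
    """
    if min(input_tokens, cached_input_tokens, output_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * PRICE_INPUT
        + Decimal(cached_input_tokens) * PRICE_CACHED_INPUT
        + Decimal(output_tokens) * PRICE_OUTPUT
    )
    return total / ONE_MILLION


if __name__ == "__main__":
    # 500k uncached input, 250k cached input, 100k output tokens.
    print(estimate_realtime2_cost(500_000, 250_000, 100_000))
```

With these rates, output audio costs twice as much per token as uncached input, and cached input is 80 times cheaper than uncached input, so the share of a session that can be served from cache has a large effect on the estimate.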
GPT-Realtime-Translate: Live speech translation
GPT-Realtime-Translate provides real-time speech translation from 70+ input languages into 13 output languages. The model is designed to maintain pace with the speaker during live translation. Pricing is set at $0.034 per minute.
GPT-Realtime-Whisper: Streaming transcription
GPT-Realtime-Whisper offers low-latency streaming transcription that processes speech as users speak. OpenAI positions this model for applications requiring live captions and real-time meeting notes. The model costs $0.017 per minute.
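The per-minute rates for the two specialized models reduce to simple multiplication, sketched below in Python. The dictionary keys are illustrative labels rather than confirmed API model identifiers, and the function assumes billing is linear in audio duration with no rounding to whole minutes, which the article does not specify.

```python
from decimal import Decimal

# USD per minute of audio, as quoted in the article.
# Keys are illustrative labels, not confirmed API model IDs.
PER_MINUTE_USD = {
    "translate": Decimal("0.034"),  # GPT-Realtime-Translate
    "whisper": Decimal("0.017"),    # GPT-Realtime-Whisper
}


def estimate_stream_cost(model: str, minutes: float) -> Decimal:
    """Estimate the USD cost of streaming `minutes` of audio.

    Assumption: cost scales linearly with duration (no minimum charge
    or per-minute rounding), which the article does not confirm.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    # Convert via str so float inputs like 1.5 keep their decimal value.
    return PER_MINUTE_USD[model] * Decimal(str(minutes))


if __name__ == "__main__":
    for name in PER_MINUTE_USD:
        print(name, estimate_stream_cost(name, 60))
```

Under these assumptions, a one-hour meeting costs $1.02 to transcribe and $2.04 to translate.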
Availability and implementation
All three models are now available through OpenAI's Realtime API. Developers can test the models in OpenAI's Playground. The company has not disclosed specific benchmark scores, context window sizes, or training data cutoff dates for these models.
What this means
The release of GPT-Realtime-2, with its claimed "GPT-5-class reasoning," represents OpenAI's first voice model with advanced reasoning capabilities and could enable more sophisticated voice applications than simple command-and-response patterns. The specialized translation and transcription models target narrower use cases, and their per-minute pricing may be more predictable for developers building streaming applications. However, without published benchmarks or technical specifications, the actual performance improvement over existing voice models remains unclear.