OpenAI launches GPT-Realtime-2 with GPT-5-class reasoning, adds real-time translation across 70 languages

TL;DR

OpenAI has added three voice intelligence features to its Realtime API: GPT-Realtime-2 with GPT-5-class reasoning for complex conversational requests, GPT-Realtime-Translate supporting 70 input languages and 13 output languages, and GPT-Realtime-Whisper for live speech-to-text transcription. Translation and transcription are billed by the minute, while GPT-Realtime-2 uses token-based pricing.

OpenAI has released three new voice intelligence features in its Realtime API: GPT-Realtime-2, a voice model with GPT-5-class reasoning; GPT-Realtime-Translate for real-time conversational translation; and GPT-Realtime-Whisper for live transcription.

GPT-Realtime-2: Voice with advanced reasoning

GPT-Realtime-2 succeeds GPT-Realtime-1.5 and includes what OpenAI describes as "GPT-5-class reasoning," designed to handle complex user requests during voice conversations. The model generates realistic-sounding speech and can reason through a request while carrying on the conversation. Pricing is token-based, though specific rates were not disclosed.

GPT-Realtime-Translate: 70 input languages, 13 output languages

The translation feature supports 70 input languages (languages it can understand) and 13 output languages (languages it can speak). According to OpenAI, the system provides real-time translation that "keeps pace" with conversational flow. The feature is billed by the minute, with pricing not yet disclosed.

GPT-Realtime-Whisper: Live transcription

GPT-Realtime-Whisper adds live speech-to-text capabilities, capturing transcriptions as conversations occur. Like the translation feature, it is billed by the minute.

Target use cases and safeguards

OpenAI positions these features for customer service, education, media, events, and creator platforms. The company stated it has implemented guardrails to prevent misuse for spam, fraud, or abuse. Conversations can be automatically halted if they violate OpenAI's harmful content guidelines, though the company did not specify how these triggers operate.

According to OpenAI, the new models move real-time audio "from simple call-and-response toward voice interfaces that can actually do work: listen, reason, translate, transcribe, and take action as a conversation unfolds."

What this means

The addition of GPT-5-class reasoning to voice models marks a capability upgrade over the previous generation, though OpenAI has not clarified what "GPT-5-class" specifically means in terms of benchmark performance. The 70-language input support is substantial for multilingual applications, but the 13-language output limit means many users can be understood by the system without being able to receive responses in their native language. Translation and transcription are billed per minute, while GPT-Realtime-2 is billed per token, so developers who combine the features in one conversational application will have to forecast costs under two different models.
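
To make the cost-predictability point concrete, here is a minimal sketch of the two billing models side by side. OpenAI has not disclosed any rates, so every number below is a made-up placeholder, and the class names (`PerMinuteRate`, `TokenRate`) are hypothetical helpers for illustration, not part of any OpenAI SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerMinuteRate:
    """Flat per-minute billing, as described for translation and transcription."""
    usd_per_minute: float  # hypothetical; the real rate is undisclosed

    def cost(self, minutes: float) -> float:
        return self.usd_per_minute * minutes


@dataclass(frozen=True)
class TokenRate:
    """Token-based billing, as described for GPT-Realtime-2."""
    usd_per_1k_input: float   # hypothetical; the real rate is undisclosed
    usd_per_1k_output: float  # hypothetical; the real rate is undisclosed

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        return (
            (input_tokens / 1000) * self.usd_per_1k_input
            + (output_tokens / 1000) * self.usd_per_1k_output
        )


# Placeholder rates, chosen only to make the arithmetic easy to follow.
per_minute = PerMinuteRate(usd_per_minute=0.10)
per_token = TokenRate(usd_per_1k_input=0.01, usd_per_1k_output=0.02)

# Under per-minute billing, a 10-minute session costs the same no matter
# how much is said during it.
flat = per_minute.cost(minutes=10)

# Under token billing, two sessions of the same length can differ widely,
# because cost tracks how much audio and text is actually processed.
quiet = per_token.cost(input_tokens=5_000, output_tokens=2_000)
chatty = per_token.cost(input_tokens=20_000, output_tokens=15_000)

print(f"per-minute, 10 min session:  ${flat:.2f}")
print(f"token-based, quiet session:  ${quiet:.2f}")
print(f"token-based, chatty session: ${chatty:.2f}")
```

With these placeholder rates the per-minute cost depends only on session length, while the token-based cost varies with how much is exchanged in the same amount of time. That variance is what makes token-based budgets harder to forecast for open-ended conversations.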
