audio-ai

5 articles tagged with audio-ai

April 7, 2026

Amazon Nova 2 Sonic enables real-time AI podcast generation with 1M token context

Amazon has published a technical guide for building real-time conversational podcasts using Amazon Nova 2 Sonic, its speech understanding and generation model. The solution demonstrates streaming audio generation, multi-turn dialogue between AI hosts, and stage-aware content filtering through a web interface.

April 7, 2026 · 4:35 PM

March 30, 2026

model release

Google releases Lyria 3 Clip Preview for music generation via API

Google has released Lyria 3 Clip Preview, a music generation model available through the Gemini API as of March 30, 2026. The model generates 30-second audio clips from text prompts or images at $0.04 per clip and has a 1,048,576-token context window.

March 30, 2026 · 11:35 PM

March 26, 2026

model release

Google releases Gemini 3.1 Flash Live, its highest-quality audio model for real-time voice AI

Google has released Gemini 3.1 Flash Live, its highest-quality audio model designed for natural and reliable real-time voice interactions. The model scores 90.8% on ComplexFuncBench Audio and 36.1% on Scale AI's Audio MultiChallenge with thinking enabled. It's now available to developers via the Gemini Live API, enterprises through Gemini Enterprise for Customer Experience, and consumers in Search Live and Gemini Live across 200+ countries.

March 26, 2026 · 3:36 PM

model release

Google releases Gemini 3.1 Flash Live, its highest-quality audio model for real-time voice AI

Google has released Gemini 3.1 Flash Live, its highest-quality audio and voice model designed for real-time dialogue. The model scores 90.8% on ComplexFuncBench Audio and 36.1% on Scale AI's Audio MultiChallenge with reasoning enabled, with improved tonal understanding and lower latency compared to previous versions.

March 26, 2026 · 3:35 PM

March 1, 2026

benchmark

ElevenLabs and Google lead Artificial Analysis speech-to-text benchmark

Artificial Analysis has released an updated speech-to-text benchmark showing ElevenLabs and Google as top performers. The benchmark compares current speech recognition models.

March 1, 2026 · 3:05 PM
