Breaking

Amazon Polly adds bidirectional streaming API for real-time speech synthesis in conversational AI

Amazon has released a new Bidirectional Streaming API for Amazon Polly that enables simultaneous text input and audio output over a single HTTP/2 connection. The API reduces end-to-end latency by 39% compared to traditional request-response TTS by allowing text to be sent word-by-word as LLMs generate tokens, rather than waiting for complete sentences. The feature is available in Java, JavaScript, .NET, C++, Go, Kotlin, PHP, Ruby, Rust, and Swift SDKs.
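The word-by-word pattern can be sketched as a toy duplex loop. Everything below is invented for illustration; `llm_tokens` and `synthesize` are stand-ins, not the actual Polly SDK API:

```python
import asyncio

# Toy sketch of incremental TTS streaming: each word is forwarded to the
# synthesizer as the LLM emits it, so audio for the first word can start
# playing before the sentence is complete. "synthesize" is a placeholder
# invented for this sketch, not a real Polly call.

async def llm_tokens():
    for word in ["Hello", "there,", "how", "can", "I", "help?"]:
        await asyncio.sleep(0)        # tokens trickle in as the LLM generates
        yield word

async def synthesize(word: str) -> bytes:
    await asyncio.sleep(0)            # stand-in for server-side synthesis
    return word.encode()              # pretend this is an audio chunk

async def stream_speech():
    audio_chunks = []
    async for word in llm_tokens():                  # send text word-by-word...
        audio_chunks.append(await synthesize(word))  # ...collect audio as it returns
    return audio_chunks

chunks = asyncio.run(stream_speech())
```

The latency win comes from never buffering a full sentence: time-to-first-audio is bounded by one word's synthesis, not the whole utterance.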

March 26, 2026

Latest News

product update

Google launches Search Live globally with real-time camera and voice search

Google is expanding Search Live globally to users in more than 200 countries, enabling real-time voice and camera search through the Google app and Lens. The feature, powered by Gemini 3.1 Flash Live—a new multilingual audio and video model—allows users to point their phone camera at objects and ask questions with instant spoken responses.

model release

Google releases Gemini 3.1 Flash Live, its highest-quality audio model for real-time voice AI

Google has released Gemini 3.1 Flash Live, its highest-quality audio model designed for natural and reliable real-time voice interactions. The model scores 90.8% on ComplexFuncBench Audio and 36.1% on Scale AI's Audio MultiChallenge with thinking enabled. It's now available to developers via the Gemini Live API, enterprises through Gemini Enterprise for Customer Experience, and consumers in Search Live and Gemini Live across 200+ countries.

via deepmind.google

product update · ByteDance

ByteDance rolls out Dreamina Seedance 2.0 video generation to CapCut with IP safeguards

ByteDance confirmed Thursday that Dreamina Seedance 2.0, its audio and video generation model, is rolling out in CapCut across seven initial markets. The model generates videos up to 15 seconds with realistic textures and motion, but includes safety restrictions blocking generation from real faces and unauthorized IP use.

via techcrunch.com
model release

Google releases Gemini 3.1 Flash Live, its highest-quality audio model for real-time voice AI

Google has released Gemini 3.1 Flash Live, its highest-quality audio and voice model designed for real-time dialogue. The model scores 90.8% on ComplexFuncBench Audio and 36.1% on Scale AI's Audio MultiChallenge with reasoning enabled, with improved tonal understanding and lower latency compared to previous versions.

via blog.google

product update · GitHub

GitHub will train Copilot models on user interaction data starting April 2026

GitHub will use Copilot interaction data from Free, Pro, and Pro+ plan users to train AI models starting April 24, 2026, unless users actively opt out. The policy does not affect Copilot Business and Enterprise customers. Data shared will include prompts, outputs, code snippets, filenames, and repository structures.

via the-decoder.com
research

Google's TurboQuant compression cuts LLM memory needs by 6x, sparks memory chip stock selloff

Google unveiled TurboQuant, a compression technique that cuts the memory required to run large language models by six times by optimizing key-value cache storage. Memory chipmakers Samsung, SK Hynix, and Micron fell 5-6% on concern the efficiency breakthrough could reduce future chip demand. Analysts believe the decline reflects profit-taking rather than a fundamental shift, since more powerful models will eventually require more advanced hardware.

benchmark · OpenAI

ARC-AGI-3 benchmark: frontier AI models score below 1%, humans solve all 135 tasks

The ARC Prize Foundation released ARC-AGI-3, an interactive benchmark requiring AI agents to explore environments, form hypotheses, and execute plans without instructions. All 135 environments were solved by untrained humans, yet frontier models—including Gemini 3.1 Pro Preview (0.37%), GPT 5.4 (0.26%), Opus 4.6 (0.25%), and Grok-4.20 (0.00%)—scored below 1%.

research · Apple

Apple's RubiCap generates better image captions at 3B-7B parameters than 72B competitors

Apple researchers developed RubiCap, a framework for training dense image captioning models that achieve state-of-the-art results at 2B, 3B, and 7B parameter scales. The 7B model outperforms models up to 72 billion parameters on multiple benchmarks including CapArena and CaptionQA, while the 3B variant matches larger 32B models, suggesting efficient dense captioning doesn't require massive scale.

via 9to5mac.com
research

Google's TurboQuant cuts AI inference memory by 6x using lossless compression

Google Research unveiled TurboQuant, a lossless memory compression algorithm that reduces AI inference working memory (the KV cache) by at least 6x without impacting model performance. The technique combines a vector quantization method called PolarQuant with an optimization technique called QJL. Findings will be presented at ICLR 2026.
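The announcement doesn't detail PolarQuant or QJL, but the memory arithmetic behind KV-cache compression can be shown with a deliberately simplified (and, unlike TurboQuant, lossy) int8 scheme. Everything below is invented for illustration and is not Google's method:

```python
# Simplified illustration of where KV-cache memory savings come from.
# This is a generic, lossy int8 scheme written for this sketch; it is
# NOT TurboQuant, whose PolarQuant/QJL methods are reported as lossless.

def quantize_int8(vec):
    """Symmetric quantization: int8 codes plus one fp32 scale per vector."""
    scale = max(abs(x) for x in vec) / 127 or 1.0
    return [round(x / scale) for x in vec], scale

def dequantize(codes, scale):
    return [c * scale for c in codes]

kv_vector = [0.5, -1.2, 3.4, 0.0]       # one cached key/value vector
codes, scale = quantize_int8(kv_vector)
restored = dequantize(codes, scale)
max_err = max(abs(a - b) for a, b in zip(kv_vector, restored))

# fp32 storage: 4 bytes per entry. int8: 1 byte per entry + a 4-byte scale.
fp32_bytes = 4 * len(kv_vector)         # 16 bytes
int8_bytes = 1 * len(kv_vector) + 4     # 8 bytes here; long vectors approach
                                        # 4x, and 4-bit codes push toward 6x+
```

At real scale the saving applies to every token's key and value vectors in every layer, which is why the KV cache, not the weights, dominates memory in long-context inference.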
