LLM News

Every LLM release, update, and milestone.

product update

Google announces Spark AI agent, Information agents, and Android Halo at I/O 2026—all paywalled behind $100/month Ultra

Google announced multiple AI agent products at I/O 2026, including Spark for managing digital tasks, Information agents for 24/7 topic monitoring, and Android Halo for notifications. All features remain paywalled behind the $100/month Gemini Ultra plan, and Google has given no timeline for free access.

analysis · OpenAI

OpenAI reasoning model solves 80-year math problem as Anthropic hits $10.9B quarterly revenue

In a two-hour span Wednesday, OpenAI announced its reasoning model autonomously solved an 80-year-old geometry problem while Anthropic reported it's on track for $10.9 billion in Q2 revenue with $559 million in operating profit—two years ahead of internal projections. The developments came alongside Nvidia's $81.6 billion quarter, Anthropic's $1.25 billion monthly SpaceX compute deal, and a White House AI executive order signing.

2 min read · via axios.com

model release · Cohere

Cohere Releases Command A+: 218B-Parameter MoE Model With 4-Bit Quantization Runs on Single B200 GPU

Cohere has released Command A+, an open-source sparse mixture-of-experts model with 218 billion total parameters and 25 billion active parameters. The model features W4A4 quantization allowing deployment on a single Nvidia B200 GPU, supports 128K input context, and includes built-in chain-of-thought reasoning with vision capabilities.
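
As a rough back-of-the-envelope check on the single-GPU claim, the sketch below estimates the memory needed for the weights alone at 16-bit and 4-bit precision. The 218 billion parameter count comes from the summary above. The 180 GB figure for B200 memory is an assumption not stated in the source, and the estimate ignores KV cache, activations, and quantization scale overhead, so it is only a lower bound on the real footprint.

```python
def weight_footprint_gb(total_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in decimal gigabytes."""
    bytes_total = total_params * bits_per_weight / 8
    return bytes_total / 1e9


TOTAL_PARAMS = 218e9            # total parameters, from the summary above
ASSUMED_GPU_MEMORY_GB = 180.0   # assumption: B200 capacity is not given in the source

fp16_gb = weight_footprint_gb(TOTAL_PARAMS, 16)  # 16-bit weights
w4_gb = weight_footprint_gb(TOTAL_PARAMS, 4)     # 4-bit weights

fits_fp16 = fp16_gb <= ASSUMED_GPU_MEMORY_GB
fits_w4 = w4_gb <= ASSUMED_GPU_MEMORY_GB

print(f"16-bit weights: {fp16_gb:.0f} GB, fits on one GPU: {fits_fp16}")
print(f"4-bit weights:  {w4_gb:.0f} GB, fits on one GPU: {fits_w4}")
```

Under these assumptions the 4-bit weights take about a quarter of the 16-bit footprint, which is consistent with the single-GPU deployment described, though actual headroom depends on context length and batch size.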

research · OpenAI

OpenAI claims reasoning model disproved 80-year-old Erdős conjecture in geometry

OpenAI claims its new reasoning model has produced an original mathematical proof disproving a geometry conjecture first posed by Paul Erdős in 1946. The company says this is the first time AI has autonomously solved a prominent open problem central to a field of mathematics, with verification from mathematicians including Thomas Bloom and Noga Alon.

product update

AWS releases four multimodal evaluators for image-to-text AI tasks in Strands Evals SDK

AWS has added four multimodal evaluators to its Strands Evals SDK that judge image-to-text AI outputs by directly analyzing source images. The evaluators—Overall Quality, Correctness, Faithfulness, and Instruction Following—use multimodal large language models to detect visual hallucinations, factual errors, and instruction violations that text-only judges miss.

product update · Amazon Web Services

AWS SageMaker AI adds bidirectional streaming for real-time speech transcription with vLLM

Amazon SageMaker AI has launched bidirectional streaming support for real-time inference, enabling WebSocket-based voice applications through vLLM integration. The feature uses HTTP/2 on port 8443 to bridge client connections with vLLM's Realtime API, allowing audio to stream in while transcription streams back simultaneously over a single persistent connection.

2 min read · via aws.amazon.com

product update

Google launches Universal Cart, an AI agent that shops across multiple retailers in one checkout

Google announced Universal Cart at its I/O developer conference, an AI-powered shopping system that consolidates purchases from multiple retailers including Target, Shopify, Wayfair, and Etsy into a single checkout. The feature uses Gemini's agentic AI to verify product compatibility, suggest better deals, and automate routine purchases.

2 min read · via zdnet.com

analysis

Google bets Gemini Spark and 3.5 Flash can catch OpenClaw's agentic AI success

Google announced Gemini Spark, a cloud-based AI agent that runs 24/7 across Gmail, Drive, and 30+ external partners, powered by the upcoming Gemini 3.5 Flash model. The company claims the new model is four times faster than competing frontier models and costs less than half as much, a direct response to OpenClaw's viral success since November 2025.

2 min read · via theverge.com

model release

Google releases Gemini Omni Flash video generation model with conversational editing, withholds speech synthesis

Google DeepMind released Gemini Omni Flash, the first model in its new Omni family that generates and edits video from image, audio, video, and text inputs. The model is rolling out to Gemini app subscribers and YouTube Shorts with a 10-second clip limit, while speech-editing capabilities remain withheld pending safety testing.

model release

NemoStation releases Marlin-2B: 2-billion parameter video VLM achieves dense captioning performance between Tarsier-34B and Gemini-1.5-Pro

NemoStation has released Marlin-2B, a 2-billion parameter video vision-language model that produces structured scene and event captions with timestamps precise to the second. The model tops the CaReBench dense captioning leaderboard and sits between Tarsier-34B and Gemini-1.5-Pro on DREAM-1K, while matching Gemini-2.0-Flash on temporal grounding benchmarks.

2 min read · via huggingface.co

product update · OpenAI

OpenAI adopts C2PA metadata standard and Google's SynthID watermarking for AI image detection

OpenAI is joining the C2PA open standard and embedding Google DeepMind's invisible SynthID watermark in all AI-generated images from its models. The company is launching a public verification tool that checks for both C2PA metadata and SynthID watermarks, though detection only works for images created by OpenAI's own products.

2 min read · via thenextweb.com