Google releases Gemini 3.5 Flash with autonomous coding and agent capabilities, claims 4x speed boost
Google released Gemini 3.5 Flash, positioning it as an agent-first model designed for autonomous coding and multi-hour workflows. The company claims the model outperforms its 3.1 Pro predecessor on coding and agentic benchmarks while running 4x faster than competing frontier models, with an optimized version achieving 12x speed gains.
Google released Gemini 3.5 Flash at its I/O developer conference on Tuesday, positioning the model as purpose-built for autonomous agents rather than conversational AI. According to the company, the model can independently execute coding pipelines and manage research projects, and in internal demonstrations it built an operating system from scratch.
Performance claims
"It outperforms our latest frontier model, 3.1 Pro, on nearly all the benchmarks," said Koray Kavukcuoglu, DeepMind's chief technologist, citing coding, agentic tasks, and multimodal reasoning. Google claims the model runs 4x faster than other frontier models, with an optimized version achieving 12x speed improvements "with the same quality."
Google did not disclose specific benchmark scores or pricing.
Agent-first architecture
The speed improvements target multi-agent workflows where multiple AI instances run simultaneously on long-duration tasks. Google demonstrated the model at I/O spawning separate agents to build components of a full operating system inside Antigravity, the company's agent development platform.
Kavukcuoglu said 3.5 Flash was co-developed with Antigravity to create a "native environment where they can live, work, and execute." Google released Antigravity 2.0 alongside the model, describing it as a standalone desktop IDE designed for agent-first development.
The model can run autonomously for multiple hours. Tulsee Doshi, Google's senior director and head of product, said the model pauses to request user input at decision points that require human judgment.
Multi-model orchestration
Google plans to release Gemini 3.5 Pro, which will work in tandem with Flash. "3.5 Pro becomes your orchestrator, your planner, and then it actually can leverage Flash to be the various sub-agents," Doshi said. The Pro model will handle reasoning-intensive tasks while Flash executes tool use and sub-tasks.
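The planner-and-sub-agent split Doshi describes resembles a common fan-out pattern. The sketch below is only an illustration of that pattern, not Google's implementation: the functions plan, run_subagent, and orchestrate are hypothetical stand-ins, and no real Gemini or Antigravity API is called.

```python
import asyncio


# Hypothetical stand-in for the planner role (the "Pro" side).
# A real planner would call a model; this one just splits the task by rule.
async def plan(task: str) -> list[str]:
    return [f"{task}: part {i}" for i in range(1, 4)]


# Hypothetical stand-in for a sub-agent (the "Flash" side).
async def run_subagent(subtask: str) -> str:
    await asyncio.sleep(0)  # placeholder for a fast model or tool call
    return f"done({subtask})"


async def orchestrate(task: str) -> list[str]:
    subtasks = await plan(task)
    # Fan out: sub-agents run concurrently; gather keeps results in input order.
    return await asyncio.gather(*(run_subagent(s) for s in subtasks))


results = asyncio.run(orchestrate("build kernel"))
print(results)
```

In a design like this, the planner is called once per task while the sub-agent is called many times, which is one plausible reason to pair a slower reasoning model with a faster executor.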
Deployment and access
Gemini 3.5 Flash is now the default model in the Gemini app and AI Mode in Search globally. It's available through Antigravity, the Gemini API, and Gemini Enterprise. The model will also power Gemini Spark, Google's 24/7 personal AI agent for consumer digital life management.
Google announced agentic capabilities coming to Search, allowing users to create and manage AI agents directly on the platform.
Safety considerations
The deployment comes as Google faces a lawsuit after a user died by suicide following weeks of interactions with Gemini in 2025. Google says Gemini 3.5 includes strengthened cyber and CBRN (Chemical, Biological, Radiological, and Nuclear) safeguards and is "better calibrated to engage with sensitive questions rather than refuse them outright."
What this means
Google is repositioning its AI strategy around autonomous agents capable of multi-hour workflows rather than single-turn conversations. The speed claims, if validated, would represent a significant advantage for agentic applications where latency compounds across multiple model calls. However, without disclosed benchmarks, context window size, or pricing, developers lack the data needed to evaluate the model against Anthropic's Claude 3.5 Sonnet or OpenAI's o1 for similar use cases. The co-development of Flash with Antigravity suggests Google is betting on vertical integration between model and tooling as a competitive moat.
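To make the compounding-latency point concrete, here is a back-of-the-envelope calculation. Every number in it is invented for illustration: the call count and per-call latency are assumptions, and it assumes the claimed 4x and 12x figures apply uniformly to the latency of each sequential call, which Google has not specified.

```python
def workflow_seconds(calls: int, seconds_per_call: float, speedup: float = 1.0) -> float:
    """Total wall-clock time for `calls` sequential model calls."""
    return calls * seconds_per_call / speedup


# Illustrative assumption: an agent run that makes 200 sequential calls at 6 s each.
baseline = workflow_seconds(200, 6.0)         # 1200.0 seconds (20 minutes)
faster = workflow_seconds(200, 6.0, 4.0)      # 300.0 seconds (5 minutes)
optimized = workflow_seconds(200, 6.0, 12.0)  # 100.0 seconds

print(baseline, faster, optimized)
```

Under these assumptions the absolute time saved grows linearly with the number of calls, so the benefit is largest for the long, multi-step workflows the model targets.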