Google Releases Gemini 3.5 Flash with 1M Token Context and Configurable Thinking Modes at $1.50/$9 Per Million Tokens

TL;DR

Google has released Gemini 3.5 Flash, a multimodal model with a 1 million token context window priced at $1.50 per million input tokens and $9 per million output tokens. The model supports text, image, video, audio, and PDF inputs with configurable thinking effort levels from minimal to high.

Google has released Gemini 3.5 Flash, a multimodal model priced at $1.50 per million input tokens and $9 per million output tokens. The model features a 1 million token context window and supports text, image, video, audio, and PDF inputs.

Key Specifications

Gemini 3.5 Flash is positioned as a high-efficiency model delivering what Google describes as "near-Pro level coding and reasoning at Flash-tier cost and speed." The model defaults to medium thinking effort for standard responses but supports four configurable thinking levels: minimal, low, medium, and high.

The thinking mode configuration allows developers to make explicit cost-performance trade-offs based on task complexity. This feature is designed for parallel agentic execution loops where different subtasks may require different computational resources.
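As a rough illustration of that trade-off, the sketch below assigns one of the four thinking levels to each kind of subtask in an agentic loop. The level names and the medium default come from the description above; the task categories, the mapping, and the function name `pick_level` are hypothetical and are not part of any Google or OpenRouter API.

```python
from enum import Enum


class ThinkingLevel(str, Enum):
    """The four thinking levels named in the article."""
    MINIMAL = "minimal"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical mapping from subtask kind to thinking level.
# These categories are illustrative assumptions, not a documented scheme.
LEVEL_BY_TASK = {
    "format": ThinkingLevel.MINIMAL,   # mechanical rewriting
    "extract": ThinkingLevel.LOW,      # pulling fields out of text
    "summarize": ThinkingLevel.MEDIUM,
    "plan": ThinkingLevel.HIGH,        # multi-step reasoning
    "debug": ThinkingLevel.HIGH,
}


def pick_level(task_kind: str) -> ThinkingLevel:
    """Return the level for a subtask, falling back to the medium default."""
    return LEVEL_BY_TASK.get(task_kind, ThinkingLevel.MEDIUM)


if __name__ == "__main__":
    for kind in ("format", "plan", "translate"):
        print(kind, "->", pick_level(kind).value)
```

How the chosen level is passed to the model depends on the provider's request format, which the article does not specify.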

Technical Capabilities

According to Google, the model is "highly optimized for coding proficiency" and multimodal processing. The 1 million token context window positions it for handling large codebases, extensive documentation, and long-form content analysis.
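To get a feel for what fits in a 1 million token window, the sketch below estimates whether a set of files stays under the limit. It assumes roughly four characters per token, a common rule of thumb rather than a figure from Google, and it assumes reserved output tokens count against the same window. Real token counts depend on the model's tokenizer, so treat the result as a coarse pre-check only.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # from the article
ASSUMED_CHARS_PER_TOKEN = 4        # rough heuristic, NOT from the article


def estimate_tokens(text: str) -> int:
    """Approximate token count using ceiling division on character length."""
    return -(-len(text) // ASSUMED_CHARS_PER_TOKEN)


def fits_in_context(texts: list[str], reserved_output_tokens: int = 0) -> bool:
    """Check whether the estimated input plus reserved output fits the window."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserved_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    files = ["def main():\n    pass\n" * 1000, "# README\n" * 500]
    print(fits_in_context(files, reserved_output_tokens=8_000))
```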

The multimodal capabilities extend across five input types: text, static images, video, audio, and PDF documents. This broad input support makes the model applicable to document processing, multimedia analysis, and complex reasoning tasks that span multiple data formats.

Pricing and Availability

At $1.50 per million input tokens and $9 per million output tokens, Gemini 3.5 Flash is priced competitively in the Flash model tier, with output tokens costing six times as much as input tokens. The model is available through OpenRouter, which routes requests across multiple providers to improve reliability and uptime.
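The listed rates translate into per-request costs as in the following sketch. The two prices are taken from the article; the function name `estimate_cost_usd` is hypothetical, and the sketch assumes billing is a simple linear function of token counts. Whether thinking tokens are billed as output is not stated in the source, so that is left to the caller.

```python
from decimal import Decimal

INPUT_USD_PER_MILLION = Decimal("1.50")   # from the article
OUTPUT_USD_PER_MILLION = Decimal("9")     # from the article
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate request cost in USD, assuming purely linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_USD_PER_MILLION
             + Decimal(output_tokens) * OUTPUT_USD_PER_MILLION)
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # 200,000 input tokens and 10,000 output tokens: 0.30 + 0.09 USD
    print(estimate_cost_usd(200_000, 10_000))
```

Under these assumptions, filling the entire 1 million token window with input costs $1.50 before any output is generated.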

The release date is listed as May 19, 2026 in the source material, though this appears to be a future date and may represent a placeholder or projected availability timeline.

What This Means

Gemini 3.5 Flash's configurable thinking modes represent a shift toward explicit computational trade-offs in model inference. Rather than offering a single performance point, the model lets developers adjust reasoning depth to match task requirements. This is particularly relevant for agentic workflows, where some operations need deep reasoning while others prioritize speed.

The 1M context window combined with multimodal support and competitive pricing positions this model for code analysis, document processing, and complex multi-step reasoning tasks. The thinking mode feature may influence how other providers structure their model offerings, particularly for use cases requiring variable computational intensity across different parts of a workflow.
