model release

Google releases Gemini 3.1 Flash Image, claims Pro-level quality at $0.50 per 1M tokens

TL;DR

Google has released Gemini 3.1 Flash Image, internally codenamed "Nano Banana 2," an image generation and editing model with a 131K context window. The model is priced at $0.50 per 1M input tokens and $3 per 1M output tokens.

June 18, 2026 · 4:21 AM2 min read

Gemini 3.1 Flash Image — Quick Specs

Context window131K tokens

Input$0.5/1M tokens

Output$3/1M tokens

Compare Gemini 3.1 Flash Image with other models →

Google releases Gemini 3.1 Flash Image, claims Pro-level quality at $0.50 per 1M tokens

Google has released Gemini 3.1 Flash Image, internally codenamed "Nano Banana 2," an image generation and editing model with a 131,000 token context window. The model is priced at $0.50 per million input tokens and $3 per million output tokens.

According to Google, the model delivers "Pro-level visual quality at Flash speed," positioning it as a faster, more cost-efficient alternative to its premium image models. The company claims the model combines advanced contextual understanding with fast inference, making complex image generation and iterative edits more accessible.

Technical specifications

Gemini 3.1 Flash Image supports:

Context window: 131,000 tokens
Multimodal input/output (image generation and editing)
Configurable aspect ratios via the image_config API parameter
Released: June 18, 2026

The model is available through OpenRouter, which routes requests across multiple hosting providers based on performance and pricing optimization.

Pricing comparison

At $0.50 per 1M input tokens and $3 per 1M output tokens, Gemini 3.1 Flash Image is positioned in the mid-tier pricing range for image generation models. OpenRouter reports that effective pricing can be 60-80% lower when prompt caching is applied for repeated context.

The pricing structure suggests Google is targeting production workloads that require both quality and cost efficiency, particularly for applications involving iterative image editing where context reuse is common.

What this means

Gemini 3.1 Flash Image represents Google's push into faster, more affordable image generation without sacrificing quality claims. The 131K context window is notably large for an image model, potentially enabling more complex multi-turn editing workflows. However, Google has not released benchmark comparisons against competing image models like DALL-E 3, Midjourney, or Stable Diffusion variants, making it difficult to independently verify the "Pro-level quality" claim. The model's real-world performance and adoption will depend on how it stacks up in user testing against established alternatives.

Source: openrouter.ai ↗

Google Gemini image-generation multimodal model-release

model releaseJuly 29, 2026

Google Launches Lyria 3.5 Music Model With Section-Level Editing, No Full Regeneration Required

Google released Lyria 3.5, a music generation model that lets users edit individual sections of a track—vocals, drums, bass—without regenerating the whole song. The model is available now through Google Flow Music and produces tracks from 30 seconds to 3 minutes.

model releaseJuly 29, 2026

Microsoft Releases Mage-VL, a 4B-Parameter Codec-Native Streaming Vision-Language Model

Microsoft has released Mage-VL, a codec-native multimodal foundation model built on a from-scratch 4B-parameter visual encoder paired with Qwen3-4B-Instruct-2507. The model claims up to 3.5x inference speedup over uniform frame sampling and outperforms Qwen3-VL-4B on video and temporal-grounding benchmarks, according to Microsoft.

model releaseJuly 29, 2026

OpenAI's GPT Transcribe Cuts Word Error Rate to 3.31% but Trails ElevenLabs, Google, and Mistral

OpenAI released GPT Transcribe and GPT Live Transcribe, improving word error rate to 3.31 percent and cutting prices 25 percent to $0.0045 per minute. Independent benchmarks still place OpenAI behind ElevenLabs, Google, and Mistral on transcription accuracy.

model releaseJuly 29, 2026

Unsloth Releases GGUF Quantizations of Kimi K3, a 2.8T-Parameter Open-Weight MoE Model

Unsloth has released GGUF quantizations of Kimi K3, a 2.8-trillion-parameter open-weight Mixture-of-Experts model from Moonshot AI with a 1-million-token context window and native vision support. The largest lossless quantization (Q8) weighs in at 1.56TB.

Google releases Gemini 3.1 Flash Image, claims Pro-level quality at $0.50 per 1M tokens

Gemini 3.1 Flash Image — Quick Specs

Google releases Gemini 3.1 Flash Image, claims Pro-level quality at $0.50 per 1M tokens

Technical specifications

Pricing comparison

What this means

Related Articles

Google Launches Lyria 3.5 Music Model With Section-Level Editing, No Full Regeneration Required

Microsoft Releases Mage-VL, a 4B-Parameter Codec-Native Streaming Vision-Language Model

OpenAI's GPT Transcribe Cuts Word Error Rate to 3.31% but Trails ElevenLabs, Google, and Mistral

Unsloth Releases GGUF Quantizations of Kimi K3, a 2.8T-Parameter Open-Weight MoE Model

Comments