Google releases Gemini 3.1 Flash Image, claims Pro-level quality at $0.50 per 1M tokens
Google has released Gemini 3.1 Flash Image, internally codenamed "Nano Banana 2," an image generation and editing model with a 131K context window. The model is priced at $0.50 per 1M input tokens and $3 per 1M output tokens.
Gemini 3.1 Flash Image — Quick Specs
Google releases Gemini 3.1 Flash Image, claims Pro-level quality at $0.50 per 1M tokens
Google has released Gemini 3.1 Flash Image, internally codenamed "Nano Banana 2," an image generation and editing model with a 131,000 token context window. The model is priced at $0.50 per million input tokens and $3 per million output tokens.
According to Google, the model delivers "Pro-level visual quality at Flash speed," positioning it as a faster, more cost-efficient alternative to its premium image models. The company claims the model combines advanced contextual understanding with fast inference, making complex image generation and iterative edits more accessible.
Technical specifications
Gemini 3.1 Flash Image supports:
- Context window: 131,000 tokens
- Multimodal input/output (image generation and editing)
- Configurable aspect ratios via the
image_configAPI parameter - Released: June 18, 2026
The model is available through OpenRouter, which routes requests across multiple hosting providers based on performance and pricing optimization.
Pricing comparison
At $0.50 per 1M input tokens and $3 per 1M output tokens, Gemini 3.1 Flash Image is positioned in the mid-tier pricing range for image generation models. OpenRouter reports that effective pricing can be 60-80% lower when prompt caching is applied for repeated context.
The pricing structure suggests Google is targeting production workloads that require both quality and cost efficiency, particularly for applications involving iterative image editing where context reuse is common.
What this means
Gemini 3.1 Flash Image represents Google's push into faster, more affordable image generation without sacrificing quality claims. The 131K context window is notably large for an image model, potentially enabling more complex multi-turn editing workflows. However, Google has not released benchmark comparisons against competing image models like DALL-E 3, Midjourney, or Stable Diffusion variants, making it difficult to independently verify the "Pro-level quality" claim. The model's real-world performance and adoption will depend on how it stacks up in user testing against established alternatives.
Related Articles
Google releases Nano Banana Pro image generation model with 2K/4K output and five-subject identity preservation
Google has released Nano Banana Pro, an advanced image generation and editing model built on Gemini 3 Pro. The model supports 2K/4K output resolution, preserves identity across up to five subjects, and includes real-time Search grounding for context-rich visual synthesis.
MiniMax Releases M3: 428B-Parameter Multimodal Model with 1M Context Window and 15× Decode Speedup
MiniMax has released M3, a multimodal model with approximately 428 billion parameters and 23 billion activated parameters. The model supports a 1 million token context window and uses MiniMax Sparse Attention to achieve 9× prefill and 15× decode speedups compared to its predecessor M2.
Anthropic releases Fable 5, bringing capabilities of restricted Mythos model to public with $10/$50 per 1M token pricing
Anthropic has released Fable 5, making capabilities from its previously restricted Mythos model available to the public. The company claims Fable 5 beats GPT-5.5, Gemini 3.1 Pro, and its own Opus 4.8 in internal testing, with pricing set at $10 per million input tokens and $50 per million output tokens after a free trial period ending June 22.
NVIDIA Releases Quantized DiffusionGemma 26B: 1,100+ Tokens/Second with 256K Context Window
NVIDIA released a quantized version of Google DeepMind's DiffusionGemma 26B A4B IT, a multimodal model with 25.2B total parameters (3.8B active) that processes text, image, and video inputs. The NVFP4-quantized model achieves generation speeds exceeding 1,100 tokens per second on NVIDIA H100 GPUs while supporting a 256K token context window.
Comments
Loading...