model release

Google releases Nano Banana Pro image generation model with 2K/4K output and five-subject identity preservation

TL;DR

Google has released Nano Banana Pro, an advanced image generation and editing model built on Gemini 3 Pro. The model supports 2K/4K output resolution, preserves identity across up to five subjects, and includes real-time Search grounding for context-rich visual synthesis.

2 min read
0

Google releases Nano Banana Pro image generation model with 2K/4K output and five-subject identity preservation

Google has released Nano Banana Pro, an image generation and editing model built on Gemini 3 Pro. According to Google, the model extends the original Nano Banana with improved multimodal reasoning, real-world grounding, and high-fidelity visual synthesis.

Technical specifications

Nano Banana Pro operates with a 66,000 token context window and is priced at $2 per 1M input tokens and $12 per 1M output tokens. The model is available through OpenRouter as of June 18, 2026.

The model supports 2K and 4K output resolutions with flexible aspect ratios. According to Google, it can preserve identity across up to five subjects in a single generation and handles consistent multi-image blending.

Key capabilities

Google claims the model offers "industry-leading text rendering in images" including long passages and multilingual layouts. The system integrates real-time information through Search grounding, allowing it to incorporate current data into generated images.

The model provides localized editing controls, including lighting adjustments, focus modifications, and camera transformations. According to Google, it generates context-rich graphics ranging from infographics and diagrams to cinematic composites.

Target use cases

Google positions Nano Banana Pro for professional-grade design workflows including product visualization, storyboarding, and complex multi-element compositions. The company claims the model maintains efficiency for general image creation despite its advanced capabilities.

The model is currently available through OpenRouter's API, which maintains OpenAI compatibility. Developers can access the model using the slug google/gemini-3-pro-image.

What this means

Nano Banana Pro's five-subject identity preservation and 4K output resolution represent notable technical specifications in the image generation space, though independent verification of Google's performance claims is not yet available. The $12 per 1M output token pricing positions it in the premium tier of image generation models, suggesting Google is targeting professional and enterprise use cases rather than consumer applications. The real-time Search grounding capability, if it performs as described, could differentiate it for applications requiring current information integration.

Related Articles

model release

Google releases Gemini 3.1 Flash Image, claims Pro-level quality at $0.50 per 1M tokens

Google has released Gemini 3.1 Flash Image, internally codenamed "Nano Banana 2," an image generation and editing model with a 131K context window. The model is priced at $0.50 per 1M input tokens and $3 per 1M output tokens.

model release

NVIDIA Releases Quantized DiffusionGemma 26B: 1,100+ Tokens/Second with 256K Context Window

NVIDIA released a quantized version of Google DeepMind's DiffusionGemma 26B A4B IT, a multimodal model with 25.2B total parameters (3.8B active) that processes text, image, and video inputs. The NVFP4-quantized model achieves generation speeds exceeding 1,100 tokens per second on NVIDIA H100 GPUs while supporting a 256K token context window.

model release

Amazon Bedrock adds Gemma 4 models with 256K context and built-in reasoning mode

Amazon Web Services today announced availability of Google DeepMind's Gemma 4 family on Amazon Bedrock. The open-weight models include three instruction-tuned variants spanning 2.3B to 30.7B parameters, with 256K context windows, multimodal input support, and built-in reasoning mode.

model release

MiniMax Releases M3: 428B-Parameter Multimodal Model with 1M Context Window and 15× Decode Speedup

MiniMax has released M3, a multimodal model with approximately 428 billion parameters and 23 billion activated parameters. The model supports a 1 million token context window and uses MiniMax Sparse Attention to achieve 9× prefill and 15× decode speedups compared to its predecessor M2.

Comments

Loading...

Google Nano Banana Pro (Gemini 3 Pro Image) Model Released | TPS