model release

Google releases Nano Banana Pro image generation model with 2K/4K output and five-subject identity preservation

TL;DR

Google has released Nano Banana Pro, an advanced image generation and editing model built on Gemini 3 Pro. The model supports 2K/4K output resolution, preserves identity across up to five subjects, and includes real-time Search grounding for context-rich visual synthesis.

June 18, 2026 · 4:20 AM2 min read

Nano Banana Pro (Gemini 3 Pro Image) — Quick Specs

Context window66K tokens

Input$2/1M tokens

Output$12/1M tokens

Compare Nano Banana Pro (Gemini 3 Pro Image) with other models →

Google releases Nano Banana Pro image generation model with 2K/4K output and five-subject identity preservation

Google has released Nano Banana Pro, an image generation and editing model built on Gemini 3 Pro. According to Google, the model extends the original Nano Banana with improved multimodal reasoning, real-world grounding, and high-fidelity visual synthesis.

Technical specifications

Nano Banana Pro operates with a 66,000 token context window and is priced at $2 per 1M input tokens and $12 per 1M output tokens. The model is available through OpenRouter as of June 18, 2026.

The model supports 2K and 4K output resolutions with flexible aspect ratios. According to Google, it can preserve identity across up to five subjects in a single generation and handles consistent multi-image blending.

Key capabilities

Google claims the model offers "industry-leading text rendering in images" including long passages and multilingual layouts. The system integrates real-time information through Search grounding, allowing it to incorporate current data into generated images.

The model provides localized editing controls, including lighting adjustments, focus modifications, and camera transformations. According to Google, it generates context-rich graphics ranging from infographics and diagrams to cinematic composites.

Target use cases

Google positions Nano Banana Pro for professional-grade design workflows including product visualization, storyboarding, and complex multi-element compositions. The company claims the model maintains efficiency for general image creation despite its advanced capabilities.

The model is currently available through OpenRouter's API, which maintains OpenAI compatibility. Developers can access the model using the slug google/gemini-3-pro-image.

What this means

Nano Banana Pro's five-subject identity preservation and 4K output resolution represent notable technical specifications in the image generation space, though independent verification of Google's performance claims is not yet available. The $12 per 1M output token pricing positions it in the premium tier of image generation models, suggesting Google is targeting professional and enterprise use cases rather than consumer applications. The real-time Search grounding capability, if it performs as described, could differentiate it for applications requiring current information integration.

Source: openrouter.ai ↗

Google Gemini 3 Pro image generation multimodal Nano Banana Pro generative AI

model releaseJuly 29, 2026

Google Launches Lyria 3.5 Music Model With Section-Level Editing, No Full Regeneration Required

Google released Lyria 3.5, a music generation model that lets users edit individual sections of a track—vocals, drums, bass—without regenerating the whole song. The model is available now through Google Flow Music and produces tracks from 30 seconds to 3 minutes.

model releaseJuly 29, 2026

Microsoft Releases Mage-VL, a 4B-Parameter Codec-Native Streaming Vision-Language Model

Microsoft has released Mage-VL, a codec-native multimodal foundation model built on a from-scratch 4B-parameter visual encoder paired with Qwen3-4B-Instruct-2507. The model claims up to 3.5x inference speedup over uniform frame sampling and outperforms Qwen3-VL-4B on video and temporal-grounding benchmarks, according to Microsoft.

model releaseJuly 29, 2026

OpenAI's GPT Transcribe Cuts Word Error Rate to 3.31% but Trails ElevenLabs, Google, and Mistral

OpenAI released GPT Transcribe and GPT Live Transcribe, improving word error rate to 3.31 percent and cutting prices 25 percent to $0.0045 per minute. Independent benchmarks still place OpenAI behind ElevenLabs, Google, and Mistral on transcription accuracy.

model releaseJuly 29, 2026

Unsloth Releases GGUF Quantizations of Kimi K3, a 2.8T-Parameter Open-Weight MoE Model

Unsloth has released GGUF quantizations of Kimi K3, a 2.8-trillion-parameter open-weight Mixture-of-Experts model from Moonshot AI with a 1-million-token context window and native vision support. The largest lossless quantization (Q8) weighs in at 1.56TB.

Google releases Nano Banana Pro image generation model with 2K/4K output and five-subject identity preservation

Nano Banana Pro (Gemini 3 Pro Image) — Quick Specs

Google releases Nano Banana Pro image generation model with 2K/4K output and five-subject identity preservation

Technical specifications

Key capabilities

Target use cases

What this means

Related Articles

Google Launches Lyria 3.5 Music Model With Section-Level Editing, No Full Regeneration Required

Microsoft Releases Mage-VL, a 4B-Parameter Codec-Native Streaming Vision-Language Model

OpenAI's GPT Transcribe Cuts Word Error Rate to 3.31% but Trails ElevenLabs, Google, and Mistral

Unsloth Releases GGUF Quantizations of Kimi K3, a 2.8T-Parameter Open-Weight MoE Model

Comments