Gemma 4 12B Unified

Name: Gemma 4 12B Unified
Author: Google DeepMind

Country: 🇺🇸 United States

Status: active

Context window: 256K tokens

Version History

Version 4 (major) · June 3, 2026

Gemma 4 introduces an encoder-free architecture in the 12B Unified model, which processes all modalities directly through a single decoder-only transformer. The family spans five models from 2.3B to 30.7B parameters, with context windows of up to 256K tokens.

Benchmark Scores

AIME 2024: 77.5%
GPQA: 78.8%
LiveCodeBench: 72.0%
MMLU-Pro: 77.2%
MMMU: 69.1%

Coverage

Model release · Google DeepMind

Google DeepMind Releases Gemma 4: Encoder-Free Multimodal Models from 2.3B to 30.7B Parameters

Google DeepMind released Gemma 4, a family of open-weight multimodal models ranging from 2.3B to 30.7B parameters. The flagship 12B Unified model eliminates separate encoders, processing text, images, audio, and video directly through a single decoder-only transformer with a context window of up to 256K tokens.

June 3, 2026 · 8:51 PM · 2 min read

Tags: google-deepmind, gemma-4, multimodal

Model release · Google DeepMind

Google DeepMind releases Gemma 4 12B Unified: encoder-free multimodal model with 256K context window

Google DeepMind has released Gemma 4 12B Unified, an encoder-free multimodal model that processes text, images, and audio through a single decoder-only transformer. The model has 11.95 billion parameters and a 256K-token context window, and it scores 77.2% on MMLU-Pro and 72.0% on LiveCodeBench v6.

June 3, 2026 · 5:51 PM · 3 min read

Tags: google-deepmind, gemma-4, multimodal