DiffusionGemma 26B A4B IT

Name: DiffusionGemma 26B A4B IT
Author: Google DeepMind

Google DeepMind · 🇺🇸 United States

Status: active

Version History

26B-A4B-it · major · June 10, 2026

Initial release of DiffusionGemma, a discrete-diffusion text generation model built on the Gemma 4 26B A4B MoE architecture, with an encoder-decoder design for parallel token generation.

26B-A4B-IT · major · June 10, 2026

Google releases DiffusionGemma 26B as an open-weight model under the Apache 2.0 license, bringing diffusion-based text generation to production with inference speeds above 500 tokens/second.

Benchmark Scores

AIME 2026: 69.1%
GPQA: 73.2%
MMLU-Pro: 77.6%
Speed: 500.0 tokens/second

Coverage

model release

Google releases DiffusionGemma 26B, open-weight model generates 500+ tokens/second

Google has released DiffusionGemma 26B, an open-weight text generation model, under the Apache 2.0 license. In testing on NVIDIA's free NIM API, the model produced 2,409 tokens in 4.4 seconds, a rate of over 500 tokens/second.
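The throughput claim can be checked directly from the two numbers reported for that test run. The snippet below is a minimal sketch of the arithmetic only; the function name `tokens_per_second` is illustrative and not part of any NVIDIA or Google API.

```python
def tokens_per_second(tokens: int, seconds: float) -> float:
    """Average generation throughput over a single run."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


# Figures reported for the NIM API test run in the summary above.
rate = tokens_per_second(2409, 4.4)
print(f"{rate:.1f} tokens/second")
```

The result, roughly 547.5 tokens/second, is consistent with the "500+" figure in the headline. It is an end-to-end average for one run, so it presumably includes any network and queueing overhead of the hosted API.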

June 10, 2026 · 8:20 PM · 1 min read

google · gemma · diffusion-models

model release · Google DeepMind

Google DeepMind releases DiffusionGemma, a 26B parameter model generating 15-20 tokens per forward pass via discrete dif

Google DeepMind released DiffusionGemma, a 26B parameter mixture-of-experts model that generates text using discrete diffusion instead of autoregression. The model processes blocks of 256 tokens in parallel, reaching generation speeds above 1,100 tokens per second on H100 GPUs in low-batch settings.
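The reported figures can be related with some back-of-the-envelope arithmetic. The sketch below assumes that the 15-20 tokens per forward pass from the headline and the 1,100 tokens/second figure describe the same low-batch H100 setting, which the coverage does not state explicitly; the helper names are illustrative only.

```python
import math

BLOCK_SIZE = 256             # tokens generated in parallel per block (from the summary)
TOKENS_PER_PASS = (15, 20)   # range quoted in the headline
THROUGHPUT = 1100            # tokens/second, reported lower bound on H100


def passes_per_block(block_size: int, tokens_per_pass: int) -> int:
    """Forward passes needed to finish one block at a given acceptance rate."""
    return math.ceil(block_size / tokens_per_pass)


def pass_budget_ms(tokens_per_pass: int, throughput: float) -> float:
    """Time available per forward pass (ms) to sustain the given throughput."""
    return 1000 * tokens_per_pass / throughput


for k in TOKENS_PER_PASS:
    passes = passes_per_block(BLOCK_SIZE, k)
    budget = pass_budget_ms(k, THROUGHPUT)
    print(f"{k} tokens/pass: {passes} passes per block, {budget:.1f} ms per pass")
```

Under these assumptions a 256-token block takes between 13 and 18 forward passes, and each pass has a budget of roughly 13.6 to 18.2 ms. These are derived estimates, not measurements from the source.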

June 10, 2026 · 6:06 PM · 3 min read

DiffusionGemma · Google DeepMind · discrete diffusion