Gemma 4 31B IT Assistant (MTP Drafter)

Google DeepMind · United States
Status: active
Context window: 256K tokens

Version History

Version 4 (major)

Major release introducing Gemma 4 family with four model sizes (E2B, E4B, 26B A4B MoE, 31B dense), Multi-Token Prediction drafters for 2x speedup, extended context windows up to 256K, enhanced multimodal capabilities, and improved reasoning performance.

Coverage

Model release · Google DeepMind

Google DeepMind releases Gemma 4 with 31B dense model, 256K context window, and speculative decoding drafters

Google DeepMind has released Gemma 4, a family of open-weight multimodal models including a 31B dense model with 256K context window and four size variants ranging from 2.3B to 30.7B effective parameters. The release includes Multi-Token Prediction (MTP) draft models that achieve up to 2x decoding speedup through speculative decoding while maintaining identical output quality.
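The "identical output quality" claim follows from how speculative decoding verifies draft tokens: the target model checks each proposed token and keeps only those it would have produced itself, so the final sequence matches plain target-model decoding while several tokens can be confirmed per target step. The toy sketch below illustrates one greedy verification step; the `draft_propose` and `target_next` functions are hypothetical stand-ins for the MTP drafter and the 31B target model, not Gemma 4's actual API.

```python
def draft_propose(prefix, k):
    # Toy "MTP drafter": cheaply guesses the next k tokens.
    # Stand-in for a small multi-token-prediction head (assumption).
    return [(prefix[-1] + i + 1) % 100 for i in range(k)]

def target_next(prefix):
    # Toy "target model": the large model's greedy next-token choice.
    return (prefix[-1] + 1) % 100

def speculative_step(prefix, k=4):
    """One speculative-decoding step with greedy verification:
    accept the drafter's tokens up to the first disagreement, then
    substitute the target model's own token at that position."""
    draft = draft_propose(prefix, k)
    accepted = []
    for tok in draft:
        expected = target_next(prefix + accepted)
        if tok == expected:
            accepted.append(tok)       # drafter matched the target: keep it
        else:
            accepted.append(expected)  # disagreement: take the target's token, stop
            break
    else:
        # All k draft tokens accepted; the target yields one extra token "for free".
        accepted.append(target_next(prefix + accepted))
    return accepted

print(speculative_step([7]))  # toy drafter agrees fully, so 5 tokens per target step
```

Because every emitted token is one the target model would have chosen anyway, the speedup comes purely from verifying multiple tokens in parallel rather than from changing the output distribution.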
