model release

Google releases Gemma 4 31B with 256K context and configurable reasoning mode

TL;DR

Google DeepMind has released Gemma 4 31B, a 30.7-billion-parameter multimodal model supporting text and image input. The model features a 262,144-token (256K) context window, a configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages, and is released under the Apache 2.0 license.


Gemma 4 31B Instruct — Quick Specs

Context window: 256K tokens (262,144)
Input: $0.14/1M tokens
Output: $0.40/1M tokens

Google Releases Gemma 4 31B Multimodal Model

Google DeepMind has released Gemma 4 31B, a 30.7-billion-parameter dense multimodal model designed for both text and image input processing. The model launched on April 2, 2026, and is available through OpenRouter at $0.14 per million input tokens and $0.40 per million output tokens.

Key Specifications

The Gemma 4 31B Instruct variant features a 262,144-token (256K) context window, among the largest for models in its parameter class. The model supports a configurable thinking/reasoning mode that enables step-by-step reasoning on complex tasks, along with native function calling and multilingual support across 140+ languages.

Google is releasing the model under the Apache 2.0 open license, allowing commercial and research use with minimal restrictions.
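Since the model is available through OpenRouter's OpenAI-compatible API, function calling follows the standard `tools` schema. The sketch below builds a request body only; the model slug `google/gemma-4-31b-it` and the tool definition are assumptions for illustration, not confirmed identifiers.

```python
import json

# Hypothetical request body for OpenRouter's OpenAI-compatible
# /chat/completions endpoint. The model slug is an assumption --
# check the OpenRouter model listing for the actual identifier.
payload = {
    "model": "google/gemma-4-31b-it",  # assumed slug
    "messages": [
        {"role": "user", "content": "What's the weather in Zurich?"}
    ],
    # Native function calling uses the standard OpenAI tools schema.
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Look up current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}

print(json.dumps(payload, indent=2))
```

The model decides at inference time whether to emit a tool call or a plain text reply; the calling application executes the function and feeds the result back as a follow-up message.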

Capabilities and Performance

According to Google DeepMind, Gemma 4 31B demonstrates particular strength in three areas: coding tasks, reasoning-heavy problems, and document understanding. The configurable reasoning mode allows developers to trade latency for reasoning depth—enabling the model to show its internal thought process before producing final answers.

The multimodal architecture supports both text and image input, though the model outputs text only. This positions it as a document analysis and visual question-answering tool.
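In practice, image input and the reasoning toggle would ride on the same request. A minimal sketch, assuming the OpenAI-style multipart `content` format and OpenRouter's `reasoning` parameter; the model slug and the exact shape of the reasoning knob are assumptions, so verify against the OpenRouter docs:

```python
import json

# Assumed shapes throughout: the model slug, the data-URL image
# encoding, and the "reasoning" field that toggles thinking mode.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What trend does this chart show?"},
        # Images are commonly passed inline as base64 data URLs.
        {"type": "image_url",
         "image_url": {"url": "data:image/png;base64,<BASE64_BYTES>"}},
    ],
}

request = {
    "model": "google/gemma-4-31b-it",  # assumed slug
    "messages": [message],
    "reasoning": {"effort": "high"},   # assumed knob for thinking mode
}

print(json.dumps(request)[:80])
```

Because the model outputs text only, the response comes back as an ordinary assistant message describing the image, optionally preceded by a reasoning trace.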

What This Means

Gemma 4 31B enters a crowded market of open 30B-class models from Meta (Llama), Mistral, and others, but differentiates on three fronts: the massive 256K context window (useful for long-document processing), the explicit reasoning mode (reflecting a broader industry trend toward chain-of-thought capabilities), and the Apache 2.0 license (minimal legal friction for commercial deployment).

The pricing—$0.14/$0.40 input/output—is competitive with similar-scale open models. The 256K context window is particularly notable; it enables processing of entire codebases or lengthy documents in a single request, reducing the need for context management and retrieval systems.
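At the listed rates, even a near-full-context request stays cheap. A quick back-of-the-envelope check (the token counts are an illustrative example):

```python
# Cost per token at the listed OpenRouter rates.
INPUT_RATE = 0.14 / 1_000_000   # $ per input token
OUTPUT_RATE = 0.40 / 1_000_000  # $ per output token

# Example: a near-full-context request, 250K tokens in, 2K tokens out.
input_tokens, output_tokens = 250_000, 2_000
cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
print(f"${cost:.4f}")  # prints $0.0358
```

So a single request consuming most of the context window costs under four cents, which is what makes the long-context use cases described above economically plausible.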

For organizations deploying locally or on proprietary infrastructure, the open weights and permissive license remove API dependency concerns. At 30.7B parameters, full-precision inference requires roughly 60 GB of VRAM, beyond most consumer GPUs, though quantized variants should fit on a single high-end card.
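The VRAM figure follows directly from the parameter count. A rough estimate for the weights alone (KV cache and activations add more, and quantized footprints vary by format):

```python
# Approximate weight memory at common precisions for a 30.7B model.
params = 30.7e9

for name, bytes_per_param in [("bf16", 2), ("int8", 1), ("int4", 0.5)]:
    gb = params * bytes_per_param / 1e9
    print(f"{name}: ~{gb:.1f} GB")
```

At bf16 (2 bytes per parameter) the weights come to about 61 GB, consistent with the 60GB+ figure above, while a 4-bit quantization drops that to roughly 15 GB.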

Related Articles

model release

Google DeepMind releases Gemma 4: multimodal models up to 31B parameters with 256K context

Google DeepMind released the Gemma 4 family of open-weights multimodal models in four sizes: E2B (2.3B effective), E4B (4.5B effective), 26B A4B (25.2B total, 3.8B active), and 31B dense. All models support text and image input with 128K-256K context windows, reasoning modes, and native function calling for agentic workflows.

model release

Google DeepMind releases Gemma 4 with 4 model sizes, 256K context, and multimodal reasoning

Google DeepMind released Gemma 4, a family of open-weights multimodal models in four sizes: E2B (2.3B effective), E4B (4.5B effective), 26B A4B (3.8B active), and 31B (30.7B parameters). All models support text and image input with 128K-256K context windows, while E2B and E4B add native audio capabilities and reasoning modes across 140+ languages.

model release

Google DeepMind releases Gemma 4 open models with multimodal capabilities and 256K context window

Google DeepMind released the Gemma 4 family of open-source models with multimodal capabilities (text, image, audio, video) and context windows up to 256K tokens. Four distinct model sizes—E2B (2.3B effective parameters), E4B (4.5B effective), 26B A4B (3.8B active), and 31B—are available under the Apache 2.0 license, with instruction-tuned and pre-trained variants.

model release

Google releases Gemma 4 family with 31B model, 256K context, multimodal capabilities

Google DeepMind released the Gemma 4 family of open-weights models ranging from 2.3B to 31B parameters, featuring up to 256K token context windows and native support for text, image, video, and audio inputs. The flagship 31B model scores 85.2% on MMLU Pro and 89.2% on AIME 2026, with a smaller 26B MoE variant requiring only 3.8B active parameters for faster inference.
