
Google releases Gemma 4 31B free model with 256K context and multimodal support

TL;DR

Google DeepMind has released Gemma 4 31B Instruct, a free 30.7-billion-parameter model with a 256K-token context window, multimodal text and image input, and native function calling. The model supports a configurable reasoning mode and 140+ languages, is licensed under Apache 2.0, and shows strong reported performance on coding and document-understanding tasks.


Google DeepMind has released Gemma 4 31B Instruct as a free, open-weights model available via OpenRouter as of April 2, 2026. The 30.7-billion-parameter dense model introduces multimodal capabilities, accepting both text and image inputs while outputting text.

Key Specifications

Context Window & Performance

  • 262,144-token (256K) context window
  • Configurable thinking/reasoning mode for step-by-step problem solving
  • Native function calling support
  • Multilingual capability across 140+ languages

Availability & Pricing

  • Free to use: $0 per million input and output tokens
  • Available under Apache 2.0 license
  • Accessible via OpenRouter's API with model weights available for download
  • OpenRouter routes requests across multiple providers with automatic fallback for uptime
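
Since OpenRouter exposes an OpenAI-compatible chat completions endpoint, a call to the free tier might look like the sketch below. The model slug `google/gemma-4-31b-it:free` is an assumption for illustration; check OpenRouter's model listing for the exact identifier.

```python
import json
import urllib.request

# OpenRouter's OpenAI-compatible chat completions endpoint.
API_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "google/gemma-4-31b-it:free"  # assumed slug

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Construct a chat completion request for the free Gemma 4 31B endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Summarize this release in one sentence.", api_key="sk-...")
# urllib.request.urlopen(req)  # network call omitted in this sketch
```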

Capabilities

Google DeepMind claims Gemma 4 31B demonstrates strong performance on coding tasks, reasoning-heavy problems, and document understanding. The configurable reasoning mode allows users to enable step-by-step thinking processes, with reasoning details accessible in API responses for transparency into the model's internal logic.

The model supports function calling natively, enabling integration with external tools and APIs. Its 140+ language support indicates broader multilingual capability compared to earlier Gemma versions.
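
As a sketch of what native function calling could look like through an OpenAI-style request, the tool definition below is purely illustrative; the `get_weather` function, its schema, and the model slug are assumptions, not part of this release's documentation.

```python
import json

# Hypothetical tool definition in the OpenAI-style schema commonly
# relayed to models with native function calling.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

payload = {
    "model": "google/gemma-4-31b-it:free",  # assumed slug
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",
}
print(json.dumps(payload, indent=2))
```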

Technical Implementation

OpenRouter's infrastructure routes requests to optimal providers based on prompt size and parameters, with fallback systems to maximize availability. Users can enable reasoning through API parameters and preserve reasoning details across conversation turns to maintain context continuity.
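
The enable-and-preserve flow described above can be sketched as follows. The exact shape of the `reasoning` parameter and the `reasoning_details` field are assumptions modeled on OpenRouter's usual conventions, not confirmed details of this release.

```python
# Sketch: enable step-by-step reasoning and carry the reasoning details
# returned in one response back into the next turn's message history.

def make_payload(history):
    return {
        "model": "google/gemma-4-31b-it:free",  # assumed slug
        "messages": history,
        "reasoning": {"enabled": True},  # assumed parameter shape
    }

history = [{"role": "user", "content": "Why is 0.1 + 0.2 != 0.3 in floats?"}]
first = make_payload(history)

# Suppose the API answered with both content and reasoning details;
# appending them keeps the model's chain of thought in context.
assistant_turn = {
    "role": "assistant",
    "content": "Binary floats cannot represent 0.1 exactly...",
    "reasoning_details": [{"type": "reasoning.text", "text": "..."}],
}
history.append(assistant_turn)
history.append({"role": "user", "content": "Show a workaround."})
second = make_payload(history)
```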

Model weights are available for local deployment, giving developers options for both API-based and self-hosted usage. The Apache 2.0 license permits commercial and research use, with only light obligations such as attribution and preserving the license notice.

What This Means

Gemma 4 31B's free release with 256K context and multimodal support directly challenges proprietary models in the mid-range segment. The zero-cost pricing and open license make it viable for cost-sensitive production deployments. The reasoning mode and 140+ language support suggest Google is competing on capability breadth rather than just scale. For organizations currently paying for Claude or GPT-4 access on routine tasks, this release provides a credible alternative worth evaluating, particularly for coding, document analysis, and multilingual workloads.
