model release

Google releases Gemma 4 31B free model with 256K context and multimodal support

TL;DR

Google DeepMind has released Gemma 4 31B Instruct, a free 30.7-billion parameter model with a 256K token context window, multimodal text and image input capabilities, and native function calling. The model supports configurable reasoning mode and 140+ languages, with strong performance on coding and document understanding tasks under Apache 2.0 license.

2 min read
0

Google Releases Gemma 4 31B Free Model with Multimodal Support

Google DeepMind has released Gemma 4 31B Instruct as a free, open-source model available via OpenRouter as of April 2, 2026. The 30.7-billion parameter dense model introduces multimodal capabilities, accepting both text and image inputs while outputting text.

Key Specifications

Context Window & Performance

  • 256,144 token context window
  • Configurable thinking/reasoning mode for step-by-step problem solving
  • Native function calling support
  • Multilingual capability across 140+ languages

Availability & Pricing

  • Free to use with $0 per million input tokens and $0 per million output tokens
  • Available under Apache 2.0 license
  • Accessible via OpenRouter's API with model weights available for download
  • OpenRouter routes requests across multiple providers with automatic fallback for uptime

Capabilities

Google DeepMind claims Gemma 4 31B demonstrates strong performance on coding tasks, reasoning-heavy problems, and document understanding. The configurable reasoning mode allows users to enable step-by-step thinking processes, with reasoning details accessible in API responses for transparency into the model's internal logic.

The model supports function calling natively, enabling integration with external tools and APIs. Its 140+ language support indicates broader multilingual capability compared to earlier Gemma versions.

Technical Implementation

OpenRouter's infrastructure routes requests to optimal providers based on prompt size and parameters, with fallback systems to maximize availability. Users can enable reasoning through API parameters and preserve reasoning details across conversation turns to maintain context continuity.

Model weights are available for local deployment, giving developers options for both API-based and self-hosted usage. The Apache 2.0 license permits commercial and research applications without restriction.

What This Means

Gemma 4 31B's free release with 256K context and multimodal support directly challenges proprietary models in the mid-range segment. The zero-cost pricing and open license make it viable for cost-sensitive production deployments. The reasoning mode addition and 140+ language support suggest Google is competing on capability breadth rather than just scale. For organizations currently paying for Claude or GPT-4 access on routine tasks, this release provides a credible alternative worth evaluation—particularly for coding, document analysis, and multilingual workloads.

Related Articles

model release

Cohere Releases Command A+ Open Source Model with 25B Active Parameters, 128K Context

Cohere has released Command A+ as an open source model under Apache 2.0 license. The sparse mixture-of-experts architecture features 25 billion active parameters out of 218B total parameters, supports 128K input context length, and includes vision capabilities alongside tool use and reasoning features.

model release

Cohere Releases Command A+: 218B-Parameter MoE Model With 4-Bit Quantization Runs on Single B200 GPU

Cohere has released Command A+, an open-source sparse mixture-of-experts model with 218 billion total parameters and 25 billion active parameters. The model features W4A4 quantization allowing deployment on a single Nvidia B200 GPU, supports 128K input context, and includes built-in chain-of-thought reasoning with vision capabilities.

model release

Google releases Gemini Omni Flash video generation model with conversational editing, withholds speech synthesis

Google DeepMind released Gemini Omni Flash, the first model in its new Omni family that generates and edits video from image, audio, video, and text inputs. The model is rolling out to Gemini app subscribers and YouTube Shorts with a 10-second clip limit, while speech-editing capabilities remain withheld pending safety testing.

model release

Google releases Gemini 3.5 Flash with 4x faster output and agentic capabilities, 3.5 Pro coming June

Google released Gemini 3.5 Flash today with 4x faster output token generation than competing frontier models while surpassing Gemini 3.1 Pro on coding, agentic, and multimodal benchmarks. The company announced Gemini 3.5 Pro will launch next month and introduced Gemini Omni, a new multimodal series that outputs video.

Comments

Loading...