Google releases Gemma 4 31B with 256K context and configurable reasoning mode
Google DeepMind has released Gemma 4 31B, a 30.7-billion-parameter multimodal model supporting text and image input. The model features a 262,144-token context window, configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages under Apache 2.0 license.
Gemma 4 31B Instruct — Quick Specs
Google Releases Gemma 4 31B Multimodal Model
Google DeepMind has released Gemma 4 31B, a 30.7-billion-parameter dense multimodal model designed for both text and image input processing. The model launched on April 2, 2026, and is available through OpenRouter at $0.14 per million input tokens and $0.40 per million output tokens.
Key Specifications
The Gemma 4 31B Instruct variant includes a 256,144-token context window—among the largest for models in its parameter class. The model supports configurable thinking/reasoning mode, enabling step-by-step reasoning for complex tasks. It includes native function calling capabilities and multilingual support across 140+ languages.
Google is releasing the model under the Apache 2.0 open license, allowing commercial and research use with minimal restrictions.
Capabilities and Performance
According to Google DeepMind, Gemma 4 31B demonstrates particular strength in three areas: coding tasks, reasoning-heavy problems, and document understanding. The configurable reasoning mode allows developers to trade latency for reasoning depth—enabling the model to show its internal thought process before producing final answers.
The multimodal architecture supports both text and image input, though the model outputs text only. This positions it as a document analysis and visual question-answering tool.
What This Means
Gemma 4 31B enters a crowded market of open 30B-class models from Meta (Llama), Mistral, and others, but differentiates on three fronts: the massive 256K context window (useful for long document processing), the explicit reasoning mode (reflecting broader industry trend toward chain-of-thought capabilities), and the Apache 2.0 license (lowest legal friction for commercial deployment).
The pricing—$0.14/$0.40 input/output—is competitive with similar-scale open models. The 256K context window is particularly notable; it enables processing of entire codebases or lengthy documents in a single request, reducing the need for context management and retrieval systems.
For organizations deploying locally or on proprietary infrastructure, the open-source weights and permissive license remove API dependency concerns. The 30.7B parameter count positions it as deployable on consumer-grade hardware (though requiring 60GB+ VRAM for full precision).
Related Articles
Google releases Gemini Omni Flash video generation model with conversational editing, withholds speech synthesis
Google DeepMind released Gemini Omni Flash, the first model in its new Omni family that generates and edits video from image, audio, video, and text inputs. The model is rolling out to Gemini app subscribers and YouTube Shorts with a 10-second clip limit, while speech-editing capabilities remain withheld pending safety testing.
Google releases Gemini 3.5 Flash with 4x faster output and agentic capabilities, 3.5 Pro coming June
Google released Gemini 3.5 Flash today with 4x faster output token generation than competing frontier models while surpassing Gemini 3.1 Pro on coding, agentic, and multimodal benchmarks. The company announced Gemini 3.5 Pro will launch next month and introduced Gemini Omni, a new multimodal series that outputs video.
Perceptron Launches Mk1 Vision-Language Model with Video Reasoning at $0.15/$1.50 per 1M Tokens
Perceptron has released Perceptron Mk1, a vision-language model designed for video understanding and embodied reasoning tasks. The model accepts image and video inputs with 33K context window, priced at $0.15 per 1M input tokens and $1.50 per 1M output tokens, and supports structured spatial annotations on demand.
Google launches Gemini 3.5 Flash and new Omni multimodal AI family at I/O 2026
Google launched Gemini 3.5 Flash today as the default model for its Gemini app and AI Mode in Search, with Gemini 3.5 Pro following next month. The company also introduced Gemini Omni, a new multimodal AI family capable of generating video from text, photos, video, and audio inputs.
Comments
Loading...