model release

Google releases Gemma 4 31B with 256K context and configurable reasoning mode

TL;DR

Google DeepMind has released Gemma 4 31B, a 30.7-billion-parameter multimodal model supporting text and image input. The model features a 262,144-token context window, configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages under Apache 2.0 license.

April 2, 2026 · 6:05 PM2 min read

Gemma 4 31B Instruct — Quick Specs

Context window262K tokens

Input$0.14/1M tokens

Output$0.4/1M tokens

Compare Gemma 4 31B Instruct with other models →

Google Releases Gemma 4 31B Multimodal Model

Google DeepMind has released Gemma 4 31B, a 30.7-billion-parameter dense multimodal model designed for both text and image input processing. The model launched on April 2, 2026, and is available through OpenRouter at $0.14 per million input tokens and $0.40 per million output tokens.

Key Specifications

The Gemma 4 31B Instruct variant includes a 256,144-token context window—among the largest for models in its parameter class. The model supports configurable thinking/reasoning mode, enabling step-by-step reasoning for complex tasks. It includes native function calling capabilities and multilingual support across 140+ languages.

Google is releasing the model under the Apache 2.0 open license, allowing commercial and research use with minimal restrictions.

Capabilities and Performance

According to Google DeepMind, Gemma 4 31B demonstrates particular strength in three areas: coding tasks, reasoning-heavy problems, and document understanding. The configurable reasoning mode allows developers to trade latency for reasoning depth—enabling the model to show its internal thought process before producing final answers.

The multimodal architecture supports both text and image input, though the model outputs text only. This positions it as a document analysis and visual question-answering tool.

What This Means

Gemma 4 31B enters a crowded market of open 30B-class models from Meta (Llama), Mistral, and others, but differentiates on three fronts: the massive 256K context window (useful for long document processing), the explicit reasoning mode (reflecting broader industry trend toward chain-of-thought capabilities), and the Apache 2.0 license (lowest legal friction for commercial deployment).

The pricing—$0.14/$0.40 input/output—is competitive with similar-scale open models. The 256K context window is particularly notable; it enables processing of entire codebases or lengthy documents in a single request, reducing the need for context management and retrieval systems.

For organizations deploying locally or on proprietary infrastructure, the open-source weights and permissive license remove API dependency concerns. The 30.7B parameter count positions it as deployable on consumer-grade hardware (though requiring 60GB+ VRAM for full precision).

Source: openrouter.ai ↗

google-deepmind gemma multimodal open-source reasoning function-calling 256k-context

model releaseJuly 4, 2026

Mistral releases Leanstral 1.5: 119B parameter open-source model for Lean 4 proof assistance

Mistral AI has released Leanstral 1.5, an open-source 119B parameter mixture-of-experts model designed specifically for Lean 4 proof assistance. The model features 128 experts with 4 active per token (6.5B activated parameters), a 256k token context window, and multimodal input capabilities.

model releaseJuly 1, 2026

Portugal releases Amália, open-source 9B parameter AI model trained on European Portuguese

Portugal has released Amália, its first national AI model trained specifically for European Portuguese. Built on EuroLLM-9B with 9 billion parameters, the model is fully open-source with weights, datasets, and code published under an open license. The government has committed €5.5m in initial funding through 2027.

model releaseJune 29, 2026

DeepReinforce Releases Ornith-1.0, Open-Source Agentic Coding Model in 9B to 397B Sizes

DeepReinforce has released Ornith-1.0, an MIT-licensed model designed for agentic coding tasks with variants ranging from 9B to 397B parameters. Built on top of Apache 2.0-licensed Gemma 4 and Qwen 3.5 base models, the company claims it achieves state-of-the-art performance among open-source models of comparable size on coding benchmarks.