model release

Google releases Gemma 4 26B with 256K context and multimodal support, free to use

TL;DR

Google DeepMind has released Gemma 4 26B A4B, a free instruction-tuned Mixture-of-Experts model with a 262,144-token context window and multimodal input across text, images, and video. Of its 25.2B total parameters, only 3.8B are active per token, delivering performance comparable to larger 31B dense models at reduced compute cost.



Google DeepMind has released Gemma 4 26B A4B, a free Mixture-of-Experts model available immediately. The model features a 262,144-token context window and native support for text, image, and video input (up to 60 seconds at 1 fps), and is released under the Apache 2.0 license.

Model Architecture and Performance

Gemma 4 26B employs a sparse Mixture-of-Experts architecture with 25.2B total parameters but only 3.8B active parameters per token during inference. Google claims this configuration delivers performance comparable to larger 31B dense models while requiring substantially less compute. The model is instruction-tuned and includes native function calling, structured output support, and configurable thinking/reasoning mode for step-by-step problem solving.
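Google has not published the routing details of Gemma 4's MoE layers, but the efficiency argument is generic to sparse MoE: a router scores all experts, yet only the top-k actually execute per token, so active parameters stay a small fraction of the total. The sketch below is a toy illustration under that assumption; the expert count, top_k value, and scalar "experts" are all made up for demonstration.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(token, experts, router_scores, top_k=2):
    """Route a token to its top_k experts and mix their outputs.

    Only top_k expert functions run per token, which is why a sparse
    MoE with 25.2B total parameters can activate only ~3.8B of them.
    """
    # Rank experts by router score and keep only the top_k.
    ranked = sorted(range(len(experts)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalize the chosen experts' scores into mixing weights.
    weights = softmax([router_scores[i] for i in chosen])
    # Weighted sum of only the chosen experts' outputs.
    out = sum(w * experts[i](token) for w, i in zip(weights, chosen))
    return out, chosen

# Toy demo: 8 "experts", each a simple scalar function of the input.
experts = [lambda x, k=k: x * (k + 1) for k in range(8)]
scores = [random.random() for _ in range(8)]
out, chosen = moe_forward(1.0, experts, scores, top_k=2)
```

With top_k=2 of 8 experts, only a quarter of the expert parameters participate in any one forward step; the same ratio logic explains the 3.8B-of-25.2B figure, though the real router operates on vectors, not scalars.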

Multimodal Capabilities

Unlike earlier Gemma variants, Gemma 4 26B supports multimodal input across text, images, and video. Video support handles sequences up to 60 seconds sampled at 1 frame per second, enabling analysis of temporal content without requiring separate video understanding components.
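The stated limits (1 frame per second, 60 seconds maximum) imply at most 60 sampled frames per clip. As a sketch of how a client-side preprocessor might pick source frames under those limits, here is a small helper; the function name and cap are illustrative, not Google's published pipeline.

```python
def frame_indices_for_1fps(video_fps: float, duration_s: float,
                           max_seconds: int = 60):
    """Return source-frame indices for 1 fps sampling, capped at max_seconds.

    A 90 s clip recorded at 30 fps yields one frame per second for the
    first 60 seconds only: indices 0, 30, 60, ..., 1770 (60 frames).
    """
    seconds = min(int(duration_s), max_seconds)
    return [round(t * video_fps) for t in range(seconds)]

# Example: a 90-second clip at 30 fps is truncated to 60 sampled frames.
idx = frame_indices_for_1fps(video_fps=30.0, duration_s=90.0)
```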

Pricing and Availability

The model is available for free: zero cost per million input and output tokens. It is accessible via OpenRouter, which routes requests across multiple providers and manages fallback routing to maximize uptime. Model weights are available for local deployment under the Apache 2.0 license.
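OpenRouter exposes an OpenAI-compatible chat-completions endpoint, so calling the model looks like any OpenAI-style request. The sketch below uses only the standard library; the model slug "google/gemma-4-26b-a4b" is an assumption (check OpenRouter's model list for the actual identifier), and the API key is read from an environment variable.

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
# Hypothetical model slug -- verify against OpenRouter's model list.
MODEL = "google/gemma-4-26b-a4b"

def build_request(prompt: str, model: str = MODEL) -> dict:
    """Assemble an OpenAI-compatible chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def send(payload: dict, api_key: str) -> dict:
    """POST the payload to OpenRouter and return the parsed JSON reply."""
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    key = os.environ.get("OPENROUTER_API_KEY")
    if key:  # only hit the network when a key is configured
        reply = send(build_request("Summarize MoE in one sentence."), key)
        print(reply["choices"][0]["message"]["content"])
```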

What This Means

Gemma 4 26B represents a significant shift in Google's open model strategy: it pairs genuine multimodal capabilities with a sparse architecture that reduces inference costs. The 256K context window matches or exceeds most competing models, and free pricing removes adoption barriers. For developers, this addresses a clear gap: capable open models with video understanding have been scarce. The sparse MoE design is particularly relevant for cost-sensitive deployments where inference happens at scale, and the configurable reasoning mode suggests Google is matching OpenAI's o1-style thinking patterns in its open offerings.

