Google launches Gemma 4 open-weights models with Apache 2.0 license to compete with Chinese LLMs

TL;DR

Google released Gemma 4, a new line of open-weights models available in sizes from 2 billion to 31 billion parameters, under a permissive Apache 2.0 license. The release includes multimodal capabilities, support for 140+ languages, native function calling, and a 256,000-token context window for the larger variants.

April 2, 2026 · 9:35 PM · 3 min read

Gemma 4 — Quick Specs

Context window: 256K tokens

Google Launches Gemma 4 Open-Weights Models with Apache 2.0 License

Google released Gemma 4, a new family of open-weights large language models designed to compete directly with Chinese open-source models from Moonshot AI, Alibaba, and Z.AI that increasingly rival proprietary alternatives. The shift to a permissive Apache 2.0 license marks Google's most significant licensing change for the Gemma family, removing previous restrictions that gave Google the right to terminate access.

Model Lineup and Specifications

Gemma 4 comes in multiple sizes across three categories:

High-performance dense model: A 31-billion-parameter model tuned for output quality, with a 256,000-token context window. Google claims it runs unquantized at 16-bit precision on a single 80 GB H100 GPU, and at 4-bit precision on consumer GPUs such as the Nvidia RTX 4090 or AMD RX 7900 XTX using frameworks such as llama.cpp or Ollama.

Mixture of Experts variant: A 26-billion-parameter model using a mixture of experts (MoE) architecture with 3.8 billion active parameters per token. The model prioritizes inference speed over output quality and also features a 256,000-token context window.

Edge models: Two smaller models optimized for smartphones and single-board computers like Raspberry Pi, with 2-billion and 4-billion effective parameters (5.1 and 8 billion actual parameters, respectively, using per-layer embeddings). These retain 128,000-token context windows and multimodal capabilities.
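The hardware claims for the larger models can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is a rough estimate that counts weight storage only; it ignores the KV cache, activations, and quantization overhead, so real memory use will be higher. The 24 GB figure is the VRAM of the RTX 4090 and RX 7900 XTX, and the function names are illustrative, not taken from the announcement.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    # billions of parameters * bytes per parameter = gigabytes
    return params_billion * bytes_per_weight


def fits(params_billion: float, bits_per_weight: int, vram_gb: float) -> bool:
    """True if the weights alone fit in the given amount of GPU memory."""
    return weight_memory_gb(params_billion, bits_per_weight) <= vram_gb


def active_fraction(total_billion: float, active_billion: float) -> float:
    """Share of an MoE model's parameters used for each token."""
    return active_billion / total_billion


if __name__ == "__main__":
    # Dense 31B model: 16-bit on an 80 GB H100, 4-bit on a 24 GB consumer GPU.
    print(weight_memory_gb(31, 16), fits(31, 16, 80))
    print(weight_memory_gb(31, 4), fits(31, 4, 24))
    # MoE model: 3.8B of 26B parameters are active per token, but all 26B
    # still have to be held in memory.
    print(round(active_fraction(26, 3.8), 3))
```

Under these assumptions the 16-bit weights need about 62 GB and the 4-bit weights about 15.5 GB, which is consistent with the single-H100 and 24 GB consumer-GPU claims. The MoE figure shows why that variant favors speed: only about 15% of its parameters take part in each token's computation.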

Key Capabilities

All Gemma 4 variants support:

Multimodality: Video, audio, and image inputs alongside text
Multilingual support: Over 140 languages
Native function calling: Structured output generation
Advanced reasoning: Improvements in mathematical and instruction-following tasks
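To illustrate what function calling means in practice, the sketch below shows the application side of the loop: the model emits a structured call, and the host program parses it and runs the matching function. The JSON shape (`name` plus `arguments`), the `get_weather` tool, and the `dispatch` helper are all hypothetical and chosen for illustration; the announcement does not specify Gemma 4's actual tool-call format.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Registry mapping tool names the model may emit to Python callables.
TOOLS = {"get_weather": get_weather}


def dispatch(model_output: str) -> str:
    """Parse a JSON tool call emitted by a model and run the matching tool.

    Assumes the model output looks like:
    {"name": "<tool name>", "arguments": {...}}
    """
    call = json.loads(model_output)
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**args)


if __name__ == "__main__":
    # Stand-in for text a model might produce when asked about the weather.
    fake_model_output = '{"name": "get_weather", "arguments": {"city": "Zurich"}}'
    print(dispatch(fake_model_output))
```

Rejecting unknown tool names explicitly keeps a malformed or unexpected model response from running arbitrary code, which matters more as models are used in agentic settings.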

Google provides benchmark comparisons against Gemma 3 showing "significant performance improvements across a variety of AI benchmarks," though specific scores were not disclosed in the announcement.

Licensing and Deployment Strategy

The shift from Google's previous custom license to Apache 2.0 removes restrictions on deployment scenarios and eliminates Google's ability to revoke access. This addresses enterprise concerns about vendor control and data sovereignty—a critical differentiator against proprietary models where training data usage remains opaque.

Gemma 4 models are immediately available through:

Google AI Studio
Google AI Edge Gallery
Hugging Face
Kaggle
Ollama

Google claims day-one support across 12+ inference frameworks, including vLLM, SGLang, llama.cpp, and MLX.
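For readers who want to try a local deployment, the following is a minimal sketch of querying a model through Ollama's local HTTP API, which listens on port 11434 by default. The model tag is a placeholder: the article does not give the tag under which Gemma 4 is published, so substitute whatever name the Ollama library lists. The helper names are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"
# Placeholder tag, not a confirmed model name; replace with the real one.
MODEL_TAG = "gemma-4-placeholder"


def build_generate_request(prompt: str, model: str = MODEL_TAG,
                           url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL_TAG) -> str:
    """Send the request to a running Ollama server and return the reply text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    # Requires a local Ollama server with the chosen model already pulled.
    print(generate("Summarize the Apache 2.0 license in one sentence."))
```

Separating request construction from sending makes the payload easy to inspect or test without a running server.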

Market Context

The release directly responds to the emergence of competitive open-weights Chinese models. Models like Moonshot AI's offerings and Alibaba's implementations now reportedly match or exceed the capabilities of OpenAI's GPT-5 and Anthropic's Claude on certain benchmarks. By offering a domestic alternative with clear licensing terms, Google aims to secure enterprise adoption where data residency, cost sensitivity, and licensing flexibility drive decision-making.

The 31-billion-parameter ceiling positions Gemma 4 below Google's proprietary Gemini models, eliminating cannibalization risk while remaining accessible to enterprises that cannot afford the infrastructure costs of larger models.

What This Means

Gemma 4 represents Google's strategic shift toward permissive open-weights licensing as a way to compete with both proprietary rivals and Chinese open-source alternatives. The Apache 2.0 license removes the licensing friction that previously made enterprises cautious about adopting Google's models. For developers and enterprises, the multimodal support, code-optimized variants, and 256K context window on the larger models address two critical use cases: local code assistants and agentic AI. However, Google has not disclosed specific benchmark scores, making it difficult to assess its quality claims against established competitors.

Source: go.theregister.com
