Google releases Gemma 4, open-source on-device AI with agentic tool use for phones

TL;DR

Google released Gemma 4, an open-source multimodal model that runs entirely on smartphones without sending data to the cloud. The E2B and E4B variants require just 6GB and 8GB of RAM respectively and can autonomously use tools like Wikipedia, maps, and QR code generators through built-in agent skills. The model is available free via the Google AI Edge Gallery app for Android and iOS.

April 11, 2026 · 1:35 PM · 3 min read

Google released Gemma 4, an open-source multimodal AI model that processes text, images, and audio entirely on-device with autonomous tool-use capabilities. The model family includes four variants optimized for different hardware, with the smallest versions running on smartphones with as little as 6GB RAM.

Model specifications and variants

Gemma 4 ships in four sizes: E2B and E4B for smartphones, plus 26B and 31B models for servers. The "E" designates "effective parameters," referring to parameters active during inference rather than total parameter count.

The E2B variant takes up approximately 1.3GB of storage when quantized and runs on devices with 6GB of RAM, while E4B needs roughly 2.5GB of storage and 8GB of RAM. Google optimized both versions with Arm and Qualcomm for current mobile processors. According to Google, Gemma 4 on Android runs up to 4x faster than the previous generation while reducing battery consumption by up to 60 percent. Arm's benchmarks report even larger gains: an average 5.5x speedup on devices with newer Arm chips that support the SME2 instruction set.

The 26B variant uses a mixture-of-experts architecture with 128 experts, keeping only 3.8 billion parameters active at inference time. The dense 31B model offers a 256,000-token context window.
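
To illustrate why a mixture-of-experts model touches only part of its weights per token, here is a minimal sketch of top-k expert routing. It is a generic illustration, not Gemma 4's actual router: the article gives the expert count (128) but not how many experts fire per token, so `TOP_K = 8` is an assumption, and `top_k_route` is a hypothetical helper name.

```python
import math
import random

def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

NUM_EXPERTS = 128  # from the article
TOP_K = 8          # assumption: the article does not state experts per token

# Random scores stand in for a learned router's output for one token.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routes = top_k_route(logits, TOP_K)

# Share of weights active per token, taking the article's 3.8B-of-26B figures.
active_share = 3.8 / 26
print(f"experts used: {len(routes)} of {NUM_EXPERTS}")
print(f"active parameter share: {active_share:.1%}")
```

Only the selected experts run for a given token, which is how a model with a large total parameter count can keep per-token compute close to that of a much smaller dense model.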

All models process text, images, and audio across more than 140 languages. The entire Gemma family has achieved over 400 million downloads since initial release, according to Google.

On-device agentic capabilities

Gemma 4's defining feature is autonomous tool use driven by a model that runs on the device rather than in the cloud. The Google AI Edge Gallery app includes "agent skills," built-in tools the model can invoke on its own: Wikipedia search, interactive maps, auto-generated summaries, flashcard generation, and QR code generation. The model can describe photos, convert spoken input into diagrams and visualizations, and coordinate with other local models for text-to-speech or image generation.

The model infers user intent and activates the appropriate skill automatically. Invoking a tool requires an internet connection, but the model itself runs entirely on the device, and conversation history is not retained.
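
The intent-to-skill flow described above can be sketched as a small registry plus a dispatcher. This is a hypothetical illustration, not the AI Edge Gallery skill API: the `skill` decorator, the skill names, and the keyword-based `pick_skill` function (a stand-in for the model's intent inference) are all assumptions.

```python
from typing import Callable, Dict, Optional

# Registry mapping a skill name to the function that implements it.
SKILLS: Dict[str, Callable[[str], str]] = {}

def skill(name: str):
    """Decorator that registers a function as a named skill (hypothetical)."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        SKILLS[name] = fn
        return fn
    return register

@skill("qr_code")
def make_qr(request: str) -> str:
    return f"[QR code for: {request}]"

@skill("flashcards")
def make_flashcards(request: str) -> str:
    return f"[flashcards for: {request}]"

def pick_skill(user_text: str) -> Optional[str]:
    """Stand-in for the model's intent inference: a keyword match, not an LLM."""
    text = user_text.lower()
    if "qr" in text:
        return "qr_code"
    if "flashcard" in text:
        return "flashcards"
    return None

def handle(user_text: str) -> str:
    """Route a request to a skill, or fall back to a plain answer."""
    name = pick_skill(user_text)
    if name is None:
        return "[answer directly, no skill needed]"
    return SKILLS[name](user_text)

print(handle("Make a QR code for my Wi-Fi password"))
print(handle("What is the capital of France?"))
```

In a real app the model, not a keyword check, would choose the skill, but the registry pattern shows how new skills can be added without touching the dispatcher.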

Developers can create custom skills via GitHub and share them with the community. The app requires Android 12 or iOS 17 and has already reached fourth place among the most-downloaded free productivity apps in the iOS App Store, behind Claude, Gemini, and ChatGPT.

Google's demos show improvements in optical character recognition and time-aware reasoning, capabilities important for calendar, reminder, and alarm functionality.

Licensing and platform strategy

Gemma 4 is released under the commercially friendly Apache 2.0 license. Google built the models on research underlying its proprietary Gemini 3 system but made them freely available as open source.

The E2B and E4B variants serve as the foundation for Gemini Nano 4, the next generation of Android's system-wide on-device model. Code written for Gemma 4 will work with Gemini Nano 4 when it ships on flagship devices later in 2026. Gemini Nano currently runs on over 140 million Android devices, powering features like Smart Replies and audio summaries.

In December 2025, Google previewed a related approach with FunctionGemma, a 270-million-parameter model that translates natural language into structured function calls for phone tasks: toggling flashlights, creating contacts, managing calendars, and opening settings.
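
A structured function call is only useful if the app checks it before acting on it. The sketch below shows one way to validate a JSON call emitted by a model against a list of allowed phone actions. The JSON shape, the action names in `ACTIONS`, and `parse_call` are assumptions for illustration and do not reflect FunctionGemma's actual output format.

```python
import json

# Hypothetical phone actions and the argument types each one requires.
ACTIONS = {
    "set_flashlight": {"on": bool},
    "create_contact": {"name": str, "phone": str},
}

def parse_call(model_output: str):
    """Parse and validate a JSON function call; raise ValueError if malformed."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ValueError("call must be a JSON object")

    name = call.get("name")
    args = call.get("arguments", {})
    if name not in ACTIONS:
        raise ValueError(f"unknown function: {name!r}")
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")

    schema = ACTIONS[name]
    # Reject missing or unexpected arguments before checking their types.
    if set(args) != set(schema):
        raise ValueError(f"expected arguments {sorted(schema)}, got {sorted(args)}")
    for key, expected in schema.items():
        if not isinstance(args[key], expected):
            raise ValueError(f"argument {key!r} must be {expected.__name__}")
    return name, args

# Example: the kind of output a model might produce for "turn on the flashlight".
print(parse_call('{"name": "set_flashlight", "arguments": {"on": true}}'))
```

Validating against an allowlist means a malformed or unexpected call fails loudly instead of triggering an unintended action on the device.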

What this means

Gemma 4 marks a significant shift in on-device AI strategy. By combining permissive Apache 2.0 licensing with agentic capabilities that run on phones, Google lets developers build privacy-preserving applications whose inference does not depend on the cloud. The reported gains, up to 4x faster execution and up to 60 percent lower battery consumption, make the technology practical for mainstream phones. The integration path into Gemini Nano signals Google's commitment to making on-device AI standard across Android. For users, this means AI assistance from a model that runs locally rather than on remote servers, a direct response to privacy concerns and a competitive move against cloud-dependent systems.

Source: the-decoder.com
