model release

Google releases Gemma 4, open-source on-device AI with agentic tool use for phones

TL;DR

Google released Gemma 4, an open-source multimodal model that runs entirely on smartphones without sending data to the cloud. The E2B and E4B variants require just 6GB and 8GB of RAM respectively and can autonomously use tools like Wikipedia, maps, and QR code generators through built-in agent skills. The model is available free via the Google AI Edge Gallery app for Android and iOS.

Google released Gemma 4, an open-source multimodal AI model that processes text, images, and audio entirely on-device with autonomous tool-use capabilities. The model family includes four variants optimized for different hardware, with the smallest versions running on smartphones with as little as 6GB RAM.

Model specifications and variants

Gemma 4 ships in four sizes: E2B and E4B for smartphones, plus 26B and 31B models for servers. The "E" designates "effective parameters," referring to parameters active during inference rather than total parameter count.

The E2B variant occupies approximately 1.3GB of storage when quantized and runs on devices with 6GB of RAM, while E4B needs roughly 2.5GB of storage and 8GB of RAM. Google optimized both versions with Arm and Qualcomm for current mobile processors. According to Google, Gemma 4 on Android runs up to 4x faster than the previous generation while cutting battery consumption by up to 60 percent. Arm's benchmarks report even larger gains: an average 5.5x speedup on devices with newer Arm chips supporting the SME2 instruction set.

The 26B variant uses a mixture-of-experts architecture with 128 experts, keeping only 3.8 billion parameters active at inference time. The dense 31B model offers a 256,000-token context window.
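The gap between 26B total and 3.8B active parameters comes from mixture-of-experts routing: for each token, a router selects only a few of the 128 experts. A toy sketch of top-k routing (the value of k is an assumption; the article does not state it):

```python
# Illustrative sketch of top-k mixture-of-experts routing, NOT Google's
# implementation. Only k of the total experts run per token, which is why
# "effective parameters" at inference are far below the total count.
import random

NUM_EXPERTS = 128   # total experts, as reported for the 26B variant
TOP_K = 2           # experts activated per token (assumed for illustration)

def route(token_scores: list[float], k: int = TOP_K) -> list[int]:
    """Return the indices of the k highest-scoring experts for one token."""
    ranked = sorted(range(len(token_scores)),
                    key=lambda i: token_scores[i], reverse=True)
    return sorted(ranked[:k])

random.seed(0)
scores = [random.random() for _ in range(NUM_EXPERTS)]  # stand-in router logits
active = route(scores)
print(f"{len(active)} of {NUM_EXPERTS} experts active for this token: {active}")
```

Because only the routed experts' weights participate in each forward pass, compute and memory bandwidth scale with the active subset rather than the full parameter count.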

All models process text, images, and audio across more than 140 languages. The entire Gemma family has achieved over 400 million downloads since initial release, according to Google.

On-device agentic capabilities

Gemma 4's defining feature is autonomous tool use with all model inference running on-device. The bundled Google AI Edge Gallery app includes "agent skills": built-in tools the model can invoke on its own, including Wikipedia search, interactive maps, auto-generated summaries, flashcard generation, and QR code creation. The model can describe photos, convert spoken input into diagrams and visualizations, and coordinate with other local models for text-to-speech or image generation.

The model automatically infers user intent and activates the appropriate skill. While tool invocation requires internet connectivity, the model itself runs entirely locally, and conversation history never persists.

Developers can create custom skills via GitHub and share them with the community. The app requires Android 12 or iOS 17 and has already reached fourth place among the most-downloaded free productivity apps in the iOS App Store, behind Claude, Gemini, and ChatGPT.

Google's demos show improvements in optical character recognition and time-aware reasoning, capabilities important for calendar, reminder, and alarm functionality.

Licensing and platform strategy

Gemma 4 releases under the commercially friendly Apache 2.0 license. Google built the models on research underlying its proprietary Gemini 3 system but made them freely available as open-source.

The E2B and E4B variants serve as the foundation for Gemini Nano 4, the next generation of Android's system-wide on-device model. Code written for Gemma 4 will carry over to Gemini Nano 4 when it ships on flagship devices later in 2025. Gemini Nano currently runs on over 140 million Android devices, powering features like Smart Replies and audio summaries.

In December 2024, Google previewed a related approach with FunctionGemma, a 270-million-parameter model that translates natural language into structured function calls for phone tasks—toggling flashlights, creating contacts, managing calendars, and opening settings.
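The FunctionGemma pattern is: a small model maps a natural-language request to a structured function call, which the phone then executes. A toy illustration of that mapping, with an invented call schema and invented function names standing in for the model:

```python
# Hypothetical illustration of natural language -> structured function call.
# The schema and function names are invented; FunctionGemma's actual output
# format is not documented in this article.
import json

def to_function_call(utterance: str) -> dict:
    """Toy rule-based mapper (a stand-in for the 270M-parameter model)."""
    text = utterance.lower()
    if "flashlight" in text:
        return {"name": "set_flashlight", "args": {"on": "off" not in text}}
    if "contact" in text:
        return {"name": "create_contact", "args": {"raw_text": utterance}}
    return {"name": "open_settings", "args": {}}

call = to_function_call("Turn on the flashlight")
print(json.dumps(call))  # {"name": "set_flashlight", "args": {"on": true}}
```

The structured output is what makes the approach practical on-device: the phone only has to validate and execute a small, fixed set of calls rather than interpret free-form text.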

What this means

Gemma 4 marks a significant shift in on-device AI strategy. By combining unrestricted open-source licensing with genuine agentic capabilities at scale, Google enables developers to build privacy-preserving applications without cloud dependencies. Speed improvements of up to 4x and battery savings of up to 60 percent make the technology practical for mainstream phones. The model's integration pathway into Gemini Nano signals Google's commitment to making on-device AI standard across Android. For users, this means AI assistance that never transmits conversations to servers, a direct response to privacy concerns and a competitive move against cloud-dependent systems.

Related Articles

model release

Liquid AI releases LFM2.5-VL-450M, improved 450M-parameter vision-language model with multilingual support

Liquid AI has released LFM2.5-VL-450M, a refreshed 450M-parameter vision-language model built on an updated LFM2.5-350M backbone. The model features a 32,768-token context window, supports 9 languages, handles native 512×512 pixel images, and adds bounding box prediction and function calling capabilities. Performance improvements span both vision and language benchmarks compared to its predecessor.

model release

Tencent releases HY-Embodied-0.5, a 2B-parameter vision-language model for robot control

Tencent has released HY-Embodied-0.5, a family of foundation models designed specifically for embodied AI and robotic control. The suite includes a 2B-parameter MoT (Mixture-of-Transformers) variant with only 2.2B activated parameters during inference, and a 32B model that claims frontier-level performance comparable to Gemini 3.0 Pro, trained on over 200 billion tokens of embodied-specific data.

model release

Meta launches proprietary Muse Spark, abandoning open-source strategy after $14.3B rebuild

Meta launched Muse Spark on April 8, 2026, a natively multimodal reasoning model with tool-use and visual chain-of-thought capabilities. Unlike Llama, it is entirely proprietary with no open weights. The model scores 52 on AI Index v4.0 and excels on health benchmarks but represents Meta's departure from its open-source identity.

model release

Meta AI app jumps to No. 5 on App Store following Muse Spark launch

Meta's AI app surged from No. 57 to No. 5 on the U.S. App Store within 24 hours of launching Muse Spark, Meta's new multimodal AI model. The model accepts voice, text, and image inputs and features reasoning capabilities for science and math tasks, visual coding, and multi-agent functionality.
