Google releases official iPhone app for running Gemma 4 models locally

TL;DR

Google launched AI Edge Gallery, an official iPhone app for running Gemma 4 models (E2B and E4B sizes) directly on device. The E2B model is a 2.54GB download and runs quickly on-device, with image analysis, audio transcription of clips up to 30 seconds, and tool calling.

Google has released AI Edge Gallery, an official iOS application for running Gemma 4 models directly on iPhone without cloud inference.

The app supports two Gemma 4 variants, E2B and E4B, plus select models from the Gemma 3 family. The smaller E2B model requires a 2.54GB download and, according to early testing, runs quickly on-device while remaining "genuinely useful."

Core Capabilities

Beyond baseline text generation, AI Edge Gallery includes:

  • Image analysis: Users can ask questions about images using the on-device models
  • Audio transcription: Handles audio clips up to 30 seconds
  • Tool calling and agentic behavior: An "Agent Skills" demo showcases the models' ability to invoke external tools through eight interactive widgets: interactive-map, kitchen-adventure, calculate-hash, text-spinner, mood-tracker, mnemonic-password, query-wikipedia, and qr-code

The tool-calling demo works in practice: when prompted to "Show me the Castro Theatre on a map," the model correctly called the interactive-map skill and embedded a functional Google Map, responding in 2.4 seconds. However, the app reportedly froze on follow-up prompts in early testing.
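
For readers unfamiliar with tool calling, the general pattern can be sketched in a few lines of Python: the model emits a structured request naming a skill, and the host app routes it to a handler. This is an illustration only, not the app's implementation. The JSON shape, the handler functions, and the choice of SHA-256 are all assumptions; only the skill names come from the demo described above.

```python
import hashlib
import json

# Skill names come from the article; these handlers are hypothetical stand-ins,
# not the app's actual HTML-based skills.
def interactive_map(args: dict) -> str:
    # Return a placeholder where a real skill would render a map widget.
    return f"<map query={args['query']!r}>"

def calculate_hash(args: dict) -> str:
    # SHA-256 is an assumption; the article does not say which hash is used.
    return hashlib.sha256(args["text"].encode("utf-8")).hexdigest()

SKILLS = {
    "interactive-map": interactive_map,
    "calculate-hash": calculate_hash,
}

def dispatch(model_output: str) -> str:
    """Parse an assumed JSON tool call and run the matching skill."""
    call = json.loads(model_output)
    name = call.get("skill")
    handler = SKILLS.get(name)
    if handler is None:
        # Unknown skills are reported back rather than raising an error.
        return f"unknown skill: {name}"
    return handler(call.get("arguments", {}))

# Hypothetical model output for "Show me the Castro Theatre on a map".
print(dispatch('{"skill": "interactive-map", "arguments": {"query": "Castro Theatre"}}'))
```

Keeping the skills in a lookup table means the host only has to validate a name and pass arguments through, which is one plausible reason a demo like this can expose eight unrelated widgets behind a single interface.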

First Official On-Device LLM App

This marks the first time a major model vendor has released an official native iOS application for end-user experimentation with on-device inference. Running models locally rather than through an API removes the network round trip from inference and keeps inference data on the device, which addresses the main privacy concern with cloud inference.

A notable limitation: conversations with the app are ephemeral. The application does not provide persistent chat logs or conversation history storage, which constrains its usefulness for reference or debugging.

The HTML-based skill implementations appear to be proprietary—the source code for the eight interactive widgets is not publicly visible within the app.

What This Means

Google is demonstrating that its smaller Gemma models are production-ready for on-device deployment on consumer hardware. The release signals confidence in the models' performance and efficiency on phones. For developers, the app serves as a reference implementation for tool use and agentic behavior on mobile devices. The lack of conversation persistence and the reported UI instability suggest this is a preview release rather than a finalized product; expect iterations addressing stability and feature depth.
