product update

Google releases official iPhone app for running Gemma 4 models locally

TL;DR

Google launched AI Edge Gallery, an official iPhone app for running Gemma 4 models (E2B and E4B sizes) directly on device. The E2B model is a 2.54GB download and runs quickly. The app supports image analysis, transcription of audio clips up to 30 seconds long, and tool calling.

Google has released AI Edge Gallery, an official iOS application for running Gemma 4 models directly on iPhone without cloud inference.

The app supports two Gemma 4 variants, E2B and E4B, plus select models from the Gemma 3 family. The smaller E2B model is a 2.54GB download and, according to early testing, runs quickly on-device while remaining "genuinely useful."

Core Capabilities

Beyond baseline text generation, AI Edge Gallery includes:

  • Image analysis: Users can ask questions about images using the on-device models
  • Audio transcription: Handles audio clips up to 30 seconds
  • Tool calling and agentic behavior: An "Agent Skills" demo showcases the models' ability to invoke external tools through eight interactive widgets: interactive-map, kitchen-adventure, calculate-hash, text-spinner, mood-tracker, mnemonic-password, query-wikipedia, and qr-code

The tool-calling demo works in practice: when prompted to "Show me the Castro Theatre on a map," the model called the interactive-map skill and embedded a functional Google Map, responding in 2.4 seconds. However, the app reportedly froze on follow-up prompts during early testing.
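
To make the tool-calling flow concrete, here is a minimal Python sketch of how an app might route a model's tool call to a skill handler. It is not Google's implementation: the JSON call format, the `dispatch` function, and the handler bodies are all assumptions for illustration. Only the skill names (interactive-map, calculate-hash) come from the demo.

```python
import hashlib
import json
from typing import Callable, Dict


# Hypothetical handlers. The skill names match the demo's list;
# the bodies are illustrative stand-ins, not the app's real widgets.
def interactive_map(query: str) -> str:
    return f"[map widget centered on: {query}]"


def calculate_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Registry mapping a skill name to the function that implements it.
SKILLS: Dict[str, Callable[..., str]] = {
    "interactive-map": interactive_map,
    "calculate-hash": calculate_hash,
}


def dispatch(model_output: str) -> str:
    """Run the skill named in an assumed JSON tool call.

    Assumed format: {"skill": "<name>", "arguments": {...}}.
    Anything that is not such a JSON object is treated as a
    plain-text answer and returned unchanged.
    """
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output
    if not isinstance(call, dict) or "skill" not in call:
        return model_output
    handler = SKILLS.get(call["skill"])
    if handler is None:
        return f"unknown skill: {call['skill']}"
    return handler(**call.get("arguments", {}))


if __name__ == "__main__":
    request = '{"skill": "interactive-map", "arguments": {"query": "Castro Theatre"}}'
    print(dispatch(request))
```

In this sketch the model only chooses a skill name and its arguments. The app owns the registry and decides what actually runs, which is why a skill that is not registered is rejected rather than executed.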

First Official On-Device LLM App

This marks the first time a major model vendor has released an official native iOS application for end-user experimentation with on-device inference. Running models locally rather than through an API removes the network round trip from inference and keeps prompts, images, and audio on the device.

A notable limitation: conversations with the app are ephemeral. The application does not provide persistent chat logs or conversation history storage, which constrains its usefulness for reference or debugging.

The HTML-based skill implementations appear to be proprietary: the source code for the eight interactive widgets is not publicly visible within the app.

What This Means

Google is demonstrating that its smaller Gemma models are production-ready for on-device deployment on consumer hardware, and the release signals confidence in their performance and efficiency. For developers, the app serves as a working reference for tool use and agentic behavior on mobile devices. The lack of conversation persistence and the reported freeze on follow-up prompts suggest this is a preview release rather than a finished product. Expect iterations addressing stability and feature depth.
