product update

Google AI Edge Gallery launches on macOS with Gemma 4 12B, 12-billion-parameter model for local inference

TL;DR

Google launched AI Edge Gallery for macOS, allowing Mac users to run Google's Gemma models locally. The platform ships with five Gemma models, including the newly released Gemma 4 12B, a 12-billion-parameter multimodal model that handles text, vision, and audio and, according to Google, runs on consumer laptops with 16GB of RAM.

Google released AI Edge Gallery for macOS on June 3, 2026, enabling Mac users to run Google's Gemma models locally on their devices. The platform currently supports five models, with the flagship Gemma 4 12B representing Google's most capable consumer-focused local model to date.

Gemma 4 12B: specifications and capabilities

Gemma 4 12B is a 12-billion-parameter model that Google claims delivers performance comparable to its 26-billion-parameter mixture-of-experts model. According to Google, the model runs on consumer laptops with 16GB of RAM.
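
To put the 16GB figure in context, the arithmetic below estimates how much memory the weights of a 12-billion-parameter model would occupy at a few common precisions. This is a rough sketch, not Google's published method: the function name `weight_memory_gib` and the bit widths (16, 8, and 4 bits per weight) are assumptions chosen for illustration, and Google has not stated which precision or quantization format the Mac build uses.

```python
def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory needed for model weights alone, in GiB.

    Ignores the KV cache, activations, and operating system overhead,
    so real-world usage will be higher than this figure.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Bit widths are illustrative assumptions, not confirmed by Google.
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label}: {weight_memory_gib(12, bits):.1f} GiB")
```

Under these assumptions the weights come to roughly 22.4 GiB at 16 bits, 11.2 GiB at 8 bits, and 5.6 GiB at 4 bits. If the estimate holds, a 16GB laptop would need the weights stored below 16-bit precision, with the remaining memory shared among the system, other apps, and inference overhead.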

The model is multimodal, processing text, vision, and audio inputs. Google states it includes "good coding capabilities" for extracting insights from data on-device. Google has not disclosed benchmark scores, and no pricing has been announced, since the model is designed for local deployment rather than API access.

Available models

Google AI Edge Gallery for Mac currently offers access to five models, all instruction-tuned variants:

  • Gemma-4-12B-it
  • Gemma-4-E2B-it
  • Gemma-4-E4B-it
  • Gemma-3n-E2B-it
  • Gemma-3n-E4B-it

Unlike platforms such as Ollama and LM Studio, which support thousands of open models from various providers, Google AI Edge Gallery is limited to Google's Gemma family.

Google AI Edge Eloquent

Google also launched AI Edge Eloquent for macOS, a dictation app that transcribes speech while editing for clarity and removing disfluencies. The app processes audio on-device and supports custom vocabulary for names and technical terms. Eloquent launched on iOS earlier in 2026.

What this means

Google is positioning itself as a competitor to established local inference platforms while maintaining a walled-garden approach: only Google models are supported. At 12 billion parameters, Gemma 4 12B is notably larger than most consumer-focused local models, which typically range from 2 billion to 9 billion parameters. However, without published benchmarks or independent testing, Google's claim of performance parity with its larger 26-billion-parameter model remains unverified. The 16GB RAM requirement makes Gemma 4 12B accessible to users of recent MacBook Air and MacBook Pro models, but excludes machines with less memory, including much older hardware. Google's simultaneous release of a practical application (Eloquent) alongside developer tools (AI Edge Gallery) suggests a dual strategy targeting both general consumers and technical users.
