product update

Google AI Edge Gallery launches on macOS with Gemma 4 12B, a 12-billion-parameter model for local inference

TL;DR

Google launched AI Edge Gallery for macOS, allowing Mac users to run Google's Gemma models locally. The platform ships with five Gemma models, including the newly released Gemma 4 12B, a 12-billion-parameter multimodal model that handles text, vision, and audio and, according to Google, runs on consumer laptops with 16GB of RAM.

June 4, 2026 · 3:05 AM · 2 min read

Google released AI Edge Gallery for macOS on June 3, 2026, enabling Mac users to run Google's Gemma models locally on their devices. The platform currently supports five models, with the flagship Gemma 4 12B representing Google's most capable consumer-focused local model to date.

Gemma 4 12B: specifications and capabilities

Gemma 4 12B is a 12-billion-parameter model that Google claims delivers performance comparable to its 26-billion-parameter mixture-of-experts model. According to Google, the model runs on consumer laptops with 16GB of RAM.
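As a rough sanity check on the 16GB figure, the arithmetic for the weights alone can be sketched as below. This is a back-of-the-envelope estimate, not Google's published method: the bit widths are illustrative assumptions (the article does not say which precision or quantization format Gallery uses), the helper name `weight_footprint_gib` is hypothetical, and the estimate ignores activations, the KV cache, and memory used by the operating system.

```python
def weight_footprint_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 2**30


PARAMS = 12e9  # from the article: 12 billion parameters
RAM_GIB = 16   # from the article: 16GB of RAM (treated as GiB for simplicity)

if __name__ == "__main__":
    # Assumed bit widths; none of these are confirmed for Gemma 4 12B.
    for bits in (16, 8, 4):
        size = weight_footprint_gib(PARAMS, bits)
        verdict = "below" if size < RAM_GIB else "above"
        print(f"{bits}-bit weights: {size:.1f} GiB ({verdict} {RAM_GIB} GiB, weights only)")
```

On this arithmetic, 16-bit weights alone would exceed 16 GiB, so running on a 16GB machine would presumably require lower-precision weights. That is an inference from the parameter count, not something the source states.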

The model is multimodal, processing text, vision, and audio inputs. Google states it includes "good coding capabilities" for extracting insights from data on-device. No benchmark scores or pricing information have been disclosed, as the model is designed for local deployment rather than API access.

Available models

Google AI Edge Gallery for Mac currently offers access to five models, all instruction-tuned variants:

Gemma-4-12B-it
Gemma-4-E2B-it
Gemma-4-E4B-it
Gemma-3n-E2B-it
Gemma-3n-E4B-it

Unlike platforms such as Ollama and LM Studio, which support thousands of open models from various providers, Google AI Edge Gallery is limited to Google's Gemma family.

Google AI Edge Eloquent

Google also launched AI Edge Eloquent for macOS, a dictation app that transcribes speech while editing for clarity and removing disfluencies. The app processes audio on-device and supports custom vocabulary for names and technical terms. Eloquent launched on iOS earlier in 2026.

What this means

Google is positioning itself as a competitor to established local inference platforms while keeping a walled-garden approach: only Google models are supported. At 12 billion parameters, Gemma 4 12B is notably larger than most consumer-focused local models, which typically range from 2 billion to 9 billion parameters. However, without published benchmarks or independent testing, Google's claim of performance parity with its 26-billion-parameter model remains unverified. The 16GB RAM requirement makes Gemma 4 12B accessible to users of recent MacBook Air and MacBook Pro models but excludes older hardware. Google's simultaneous release of a practical application (Eloquent) alongside developer tools (AI Edge Gallery) suggests a dual strategy targeting both general consumers and technical users.

Source: 9to5mac.com
