model releaseApple

Apple releases AFM 3 lineup: 20B-parameter on-device model and cloud AI running on Google's Nvidia infrastructure

TL;DR

Apple announced five third-generation foundation models at WWDC26, headlined by AFM 3 Core Advanced—a 20-billion-parameter sparse model that runs on-device by activating only 1-4 billion parameters at a time. For the first time, Apple extended Private Cloud Compute to third-party infrastructure, with AFM 3 Cloud Pro running on Nvidia GPUs in Google Cloud.

3 min read
0

Apple releases AFM 3 lineup: 20B-parameter on-device model and cloud AI running on Google's Nvidia infrastructure

Apple announced five third-generation Apple Foundation Models (AFM) at WWDC26, including the first Apple AI model to run on third-party cloud infrastructure. The lineup spans on-device models and server-based systems, with AFM 3 Cloud Pro running on Nvidia GPUs hosted in Google Cloud.

The five models

According to Apple, the AFM 3 lineup comprises:

  • AFM 3 Core: 3-billion-parameter dense model for on-device processing
  • AFM 3 Core Advanced: 20-billion-parameter sparse model that activates 1-4 billion parameters per request, runs on-device on capable Apple silicon
  • AFM 3 Cloud: Server-based model optimized for speed and efficiency on Apple silicon servers
  • ADM 3 Cloud (Image): Diffusion-based image generation model running on Apple silicon
  • AFM 3 Cloud Pro: Most capable server model for complex reasoning and agentic tool use, runs on Nvidia GPUs in Google Cloud

Sparse architecture enables 20B parameters on-device

AFM 3 Core Advanced uses a sparse architecture based on Apple's "Instruction-Following Pruning" research from 2025. Unlike dense models that activate all parameters for every request, the sparse design selectively activates 1-4 billion of its 20 billion parameters depending on the prompt. Apple claims this approach differs from standard Mixture of Experts architectures.

The model is natively multimodal, handling audio and images alongside text. Apple restricts it to "our most capable Apple silicon systems," though specific device requirements were not disclosed.

Private Cloud Compute expands to Google infrastructure

AFM 3 Cloud Pro marks the first time Apple extended Private Cloud Compute beyond its own data centers. The model runs on Nvidia GPUs in Google Cloud while maintaining what Apple describes as "powerful security and privacy protections."

According to Apple's Security blog, the implementation includes:

  • Cryptographically verifiable, append-only ledger of all Google Cloud hardware in the Private Cloud Compute fleet
  • Software attestation rooted in at least two independent vendor roots of trust
  • Dedicated process isolation for initial network data parsing
  • Short time-to-live duration for shared inference software
  • Separate confidential VM for attested keys

Apple states it does not rely solely on confidential computing but treats "every component—from firmware through the host and guest OS stacks to application code—to be part of our trusted computing base."

Training and evaluation

Apple trained all five models starting from a common foundation before specializing for respective use cases. Training data included publicly available information, licensed third-party data, open-sourced data, dedicated studies, and synthetic data. The company states no user data or interactions were used in training, and web publishers can opt out.

Apple conducted human evaluations comparing AFM 3 models against previous generations across instruction following, truthfulness, presentation, and image understanding. The company published preference rates across different locale groups but did not release standard benchmark scores like MMLU or HumanEval.

What this means

Apple's deployment of a flagship model on third-party cloud infrastructure represents a significant shift from its historically closed approach, driven by the computational demands of frontier AI. The 20-billion-parameter on-device model with sparse activation is among the largest models designed for consumer devices, though performance benchmarks against competing on-device models remain undisclosed. The expansion of Private Cloud Compute to Google Cloud sets a precedent for privacy-preserving AI deployment across multiple cloud providers, though independent verification of these security guarantees will be critical.

Related Articles

product update

Apple's AFM Cloud Pro Model Runs on Nvidia GPUs in Google Cloud, Execs Confirm

Apple executives revealed at WWDC that its most advanced AI model, AFM Cloud Pro, runs on Nvidia GPUs deployed in Google's cloud infrastructure while maintaining Apple's privacy guarantees. The company disclosed it uses Google's Gemini frontier models to refine its own custom models built for Apple Silicon.

product update

Apple deploys 1.2T-parameter Gemini model on Nvidia Blackwell GPUs for rebuilt Siri

Apple announced at WWDC 2026 that the rebuilt Siri runs on a custom 1.2-trillion-parameter model based on Google's Gemini technology, hosted on Google Cloud servers powered by Nvidia Blackwell B200 GPUs. The company unveiled a three-tier privacy architecture and five new Apple Foundation Models to handle queries across device, private cloud, and Google Cloud infrastructure.

model release

Google releases DiffusionGemma 26B, open-weight model generates 500+ tokens/second

Google has released DiffusionGemma 26B, an open-weight text generation model under Apache 2 license. The model generates over 500 tokens/second according to testing on NVIDIA's free NIM API, where it produced 2,409 tokens in 4.4 seconds.

product update

Apple integrates Google Gemini into Xcode 27, expanding native agentic coding options

Apple's Xcode 27 adds native support for Google Gemini, joining existing integrations with Anthropic's Claude and OpenAI's Codex. The update also introduces improved interfaces, interactive planning, and multiturn Q&A capabilities for AI-assisted development.

Comments

Loading...