Apple gains full Gemini access, uses distillation to build lightweight on-device models
Apple has secured full access to Google's Gemini models within its data centers and is using knowledge distillation to generate training data for smaller, on-device AI models. The approach allows Apple to create lightweight versions that approximate Gemini's reasoning patterns while running directly on Apple devices with significantly less processing power.
Apple has secured broad access rights to Google's Gemini models, according to reporting from The Information. The company now has full access to Gemini within its own data centers and, critically, permission to use knowledge distillation—a technique for extracting capabilities from larger models into smaller ones.
How the distillation approach works
Apple is using Gemini to generate high-quality training data, extracting both answers and reasoning chains from the larger model. This output serves as training data for smaller models that Apple builds internally. The result: lightweight models that closely match Gemini's answers and reasoning paths while consuming far less computational resources.
These distilled versions can run directly on Apple devices without requiring cloud connectivity, a key advantage for privacy and latency-sensitive applications.
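The pipeline described above can be sketched in a few lines of Python. This is a minimal, illustrative sketch of sequence-level distillation data generation, not Apple's actual pipeline: every name here (`query_teacher`, `build_distillation_dataset`, the `Q:`/`Thought:`/`A:` format) is an assumption, and the teacher call is stubbed out so the example runs without any external service.

```python
# Sketch: turning a large "teacher" model's answers and reasoning chains
# into supervised training examples for a smaller "student" model.
# All function names and the record format are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class DistillationExample:
    prompt: str
    reasoning: str  # teacher's reasoning chain
    answer: str     # teacher's final answer


def query_teacher(prompt: str) -> tuple[str, str]:
    """Stand-in for a call to the large teacher model.

    In a real pipeline this would be an API call returning the model's
    reasoning chain and final answer; here it is a hard-coded stub.
    """
    reasoning = f"Step 1: parse the question '{prompt}'. Step 2: recall relevant facts."
    answer = f"Answer to: {prompt}"
    return reasoning, answer


def build_distillation_dataset(prompts: list[str]) -> list[DistillationExample]:
    """Collect teacher outputs for each prompt into training examples."""
    dataset = []
    for p in prompts:
        reasoning, answer = query_teacher(p)
        dataset.append(DistillationExample(prompt=p, reasoning=reasoning, answer=answer))
    return dataset


def to_training_text(ex: DistillationExample) -> str:
    """Format one example as a fine-tuning target: the student learns to
    emit the teacher's reasoning followed by the teacher's answer."""
    return f"Q: {ex.prompt}\nThought: {ex.reasoning}\nA: {ex.answer}"


if __name__ == "__main__":
    data = build_distillation_dataset(["What is 2 + 2?"])
    print(to_training_text(data[0]))
```

Fine-tuning a small model on records like these teaches it to reproduce the teacher's answer style and reasoning structure directly, which is what lets the distilled model run on-device without consulting the larger one at inference time.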
The strategic play
This approach mirrors tactics allegedly used by Chinese AI companies, but with a critical difference—Apple has paid for legitimate access rights to Gemini's outputs. The arrangement reflects a pragmatic strategy: rather than building reasoning capabilities from scratch, Apple taps Google's foundation models to train its own smaller variants optimized for device-side execution.
According to The Information, Gemini's design around chatbot and enterprise use cases doesn't perfectly align with Apple's Siri integration goals. This mismatch has motivated Apple to continue building its own models in parallel through its Apple Foundation Models team.
Timeline and expectations
Apple is expected to announce new AI features during its Worldwide Developers Conference in June 2026. The distillation work appears designed to power these announcements with practical, on-device capabilities.
The full scope of Apple's Gemini licensing agreement—including pricing, usage restrictions, and exclusivity terms—remains undisclosed.
What this means
Apple is adopting a hybrid approach: leveraging frontier models from an established leader (Google) while investing in proprietary on-device optimization. This reduces Apple's need to develop world-class reasoning capabilities independently while maintaining control over the user-facing models deployed on its devices. The strategy positions Apple to ship differentiated AI features by WWDC without bearing the full R&D cost of training large foundation models from scratch. For Google, the deal provides both revenue and a high-profile distribution channel for Gemini.
Related Articles
Gemini now imports chats and memory from ChatGPT, Claude, and other AI apps
Google is rolling out chat and memory import functionality to Gemini, allowing users to transfer conversation history from ChatGPT, Claude, and other AI apps. The feature supports zip file uploads up to 5 GB, with users able to upload up to 5 files per day. A companion memory import tool lets users generate context summaries from other chatbots to paste into Gemini.
Google launches Search Live globally, powered by Gemini 3.1 Flash Live
Google is rolling out Search Live globally, its conversational search feature powered by Gemini 3.1 Flash Live, which supports over 90 languages. Simultaneously, Google Translate's live headphones translation mode is launching on iOS after its Android debut, supporting over 70 languages across seven new countries.
Google's Gemini app now creates 3-minute songs with Lyria 3 Pro
Google announced Lyria 3 Pro, expanding the Gemini app's music generation capability from 30-second tracks to full 3-minute songs. The model improves structural understanding of musical composition, allowing users to prompt for specific elements like intros, verses, choruses, and bridges. Available now for Gemini subscribers with tier-based daily limits (10-50 tracks/day) and in Vertex AI, Google AI Studio, and the Gemini API for developers.
Google expands Search Live to 200+ countries with multilingual Gemini 3.1 Flash Live
Google is expanding Search Live, its voice and camera-based AI search assistant, to more than 200 countries and territories with support for dozens of languages. The expansion is powered by Gemini 3.1 Flash Live, a new audio-focused model that Google claims offers faster response times and more natural conversations.