on-device-ai
13 articles tagged with on-device-ai
WSJ's Joanna Stern Tests iOS 27's Rebuilt Siri for One Week, Reports Major Improvements in Personal Context Understanding
Joanna Stern, former Wall Street Journal tech columnist, tested Apple's rebuilt Siri in iOS 27 for one week and reports substantial improvements. The assistant now pulls context from Messages, Calendar, and voicemail to deliver personalized responses, though limitations remain in the current beta.
Apple ships 20-billion-parameter model that runs from iPhone flash storage using expert pruning
Apple detailed its third-generation Foundation Models family: five models including AFM 3 Core Advanced, a 20-billion-parameter on-device model that keeps most parameters in flash storage and loads only 1-4 billion at a time into memory. The models were custom-built with Google and trained on Google's TPUs.
Perplexity Computer adds hybrid inference to split tasks between local and cloud models
Perplexity announced that its Computer agentic system will gain hybrid inference in July 2026, automatically splitting tasks between local models for sensitive data and cloud-based frontier models for complex operations. The feature aims to balance privacy with computational power without requiring manual model selection.
Apple to upgrade on-device image models in iOS 27, add third-party AI image generation support
Apple plans to significantly improve the visual quality of its on-device image generation models for Genmoji and Image Playground in iOS 27, according to Bloomberg's Mark Gurman. The update will also add support for third-party AI image generation models beyond OpenAI's ChatGPT.
Chrome installs 4GB Gemini Nano model file for on-device AI features without clear user notice
Google Chrome is automatically downloading a 4GB model file for its Gemini Nano-powered AI features, causing unexpected storage usage on user devices. The weights.bin file enables on-device AI capabilities like scam detection and writing assistance, but users report receiving no clear notification about the storage requirements.
Google quietly releases COSMO experimental AI assistant app with local Gemini Nano
Google published COSMO, an experimental AI assistant application for Android, on the Play Store. The 1.13 GB app runs Gemini Nano locally and includes 14 automated skills ranging from calendar event scheduling to document writing and deep research.
Google releases Gemma 4, open-source on-device AI with agentic tool use for phones
Google released Gemma 4, an open-source multimodal model that runs entirely on smartphones without sending data to the cloud. The E2B and E4B variants require just 6GB and 8GB of RAM respectively and can autonomously use tools like Wikipedia, maps, and QR code generators through built-in agent skills. The model is available free via the Google AI Edge Gallery app for Android and iOS.
Google releases AI Edge Eloquent, offline voice dictation app with no subscriptions
Google has released Google AI Edge Eloquent, a new iOS app that converts speech into polished text entirely on-device. The app offers unlimited usage with no subscription, real-time transcription, and optional Gemini integration for enhanced text refinement.
Google DeepMind releases Gemma 4, open multimodal models with 256K context and reasoning
Google DeepMind has released Gemma 4, a family of open-weights multimodal models ranging from 2.3B to 31B parameters with support for text, images, video, and audio. The models feature context windows up to 256K tokens, built-in reasoning modes, and native function calling for agentic workflows.
Google previews Gemini Nano 4 for Android, arriving on flagship devices this year
Google has previewed Gemini Nano 4, a new on-device language model for Android, available now in early access via AICore Developer Preview. The model comes in two versions: Gemini Nano 4 Fast (3x faster than previous models, 60% less battery) and Gemini Nano 4 Full (higher reasoning capability). The models will launch on new flagship Android devices later this year.
NVIDIA Optimizes Google Gemma 4 for Local Agentic AI on RTX and Spark
NVIDIA has optimized Google's Gemma 4 models for local deployment on RTX and Spark platforms, targeting the emerging wave of on-device agentic AI. The optimization enables small, efficient models to access real-time local context for autonomous decision-making without cloud dependency.
Apple gains full Gemini access, uses distillation to build lightweight on-device models
Apple has secured full access to Google's Gemini models within its data centers and is using knowledge distillation to generate training data for smaller, on-device AI models. The approach allows Apple to create lightweight versions that replicate Gemini's reasoning patterns while running directly on Apple devices, requiring significantly less processing power.
Stability AI releases Stable Audio Open Small for on-device audio generation with Arm
Stability AI, in partnership with Arm, has open-sourced Stable Audio Open Small, a smaller and faster variant of its text-to-audio model designed for on-device deployment. The model maintains output quality and prompt adherence while reducing computational requirements for edge deployment on devices powered by Arm's technology, which runs on 99% of smartphones globally.