Google preps Gemini agent for macOS to control computers and organize files, challenging Claude Cowork
Google is developing a Gemini agent for macOS that will control computers, organize files, and integrate with Google Workspace apps. Code analysis reveals features including file conversion to Google Sheets, folder organization, batch file renaming, and meeting follow-up automation.
Google is developing a Gemini agent for macOS with computer control capabilities to rival Anthropic's Claude Cowork, according to code analysis by 9to5Google's APK Insight team.
The agent will use macOS Screen Access and Accessibility features to control mouse and keyboard inputs, similar to Claude Cowork's computer use functionality released in recent months.
Four Core Capabilities Revealed
Code strings reveal four example prompts that define the agent's initial scope:
- Convert files to spreadsheets: Scan local folders for invoices or reports, extract data, and structure it into Google Sheets
- Organize folders: Find unorganized files in Desktop or Downloads, group by type or context, and archive
- Standardize files: Read file metadata and batch-rename files into organized subfolders
- Meeting follow-up: Extract Meet transcripts or Docs notes from recent meetings and draft follow-up emails with highlights and action items
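To make the "organize folders" prompt concrete, here is a minimal sketch of what grouping files by type could look like. It is an illustration only, not Google's implementation: the article does not describe how Gemini classifies files, so the `CATEGORIES` map and the `plan_moves` and `apply_moves` helpers are hypothetical names invented for this example. The sketch builds a move plan first and applies it in a separate step, so the plan can be reviewed before anything on disk changes.

```python
from pathlib import Path

# Hypothetical category map; the article does not say how Gemini groups files.
CATEGORIES = {
    ".pdf": "Documents",
    ".docx": "Documents",
    ".csv": "Spreadsheets",
    ".xlsx": "Spreadsheets",
    ".png": "Images",
    ".jpg": "Images",
}


def plan_moves(folder: Path) -> list[tuple[Path, Path]]:
    """Return (source, destination) pairs without touching the disk."""
    moves = []
    for path in sorted(folder.iterdir()):
        if not path.is_file() or path.name.startswith("."):
            continue  # skip subfolders and hidden files such as .DS_Store
        category = CATEGORIES.get(path.suffix.lower(), "Other")
        moves.append((path, folder / category / path.name))
    return moves


def apply_moves(moves: list[tuple[Path, Path]]) -> None:
    """Carry out a plan, refusing to overwrite existing files."""
    for src, dst in moves:
        dst.parent.mkdir(parents=True, exist_ok=True)
        if dst.exists():
            raise FileExistsError(f"refusing to overwrite {dst}")
        src.rename(dst)
```

Calling `plan_moves(Path.home() / "Downloads")` would return the proposed moves for inspection, and passing that list to `apply_moves` would carry them out. Grouping "by context", as the prompt also mentions, would need content analysis that this extension-based sketch does not attempt.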
The first three capabilities center on local file management, with the first also writing its output to Google Sheets. The fourth works entirely within Google's productivity suite, drawing on Meet, Docs, and Gmail.
Current Gemini macOS App Limited
The existing Gemini macOS app offers two features: a native chat interface and an alt-space shortcut that allows sharing the current window for visual context. The upcoming agent capabilities would represent a significant expansion.
Broader Than Android Implementation
The macOS agent appears positioned for broader capabilities than Gemini's Android implementation. On Android, Gemini can currently automate only simple in-app tasks such as food ordering, and only on select devices like the Galaxy S26 series.
Google previously demonstrated computer control capabilities in the Gemini 2.5 Computer Use preview released last year, but has not yet shipped production features.
What This Means
Google is playing catch-up to Anthropic's Claude Cowork in the computer control agent space, focusing initially on file management and Google Workspace integration rather than general-purpose automation. The emphasis on Google Workspace suggests the company plans to leverage its productivity suite as a competitive advantage, potentially making the agent more valuable for organizations already committed to Google's ecosystem. No release timeline has been announced.