ai-agents
50 articles tagged with ai-agents
Perplexity opens Personal Computer local AI agent to all Mac users after month-long waitlist
Perplexity has opened access to Personal Computer, its local AI agent software for Mac, to all users after a month-long limited release to paid subscribers. The software runs agents locally on Mac devices with access to files, native apps, and over 400 connectors, positioning itself as a safer alternative to OpenClaw.
Perplexity launches native Mac app for Personal Computer AI agent, available to Pro and Max subscribers
Perplexity AI has released a native macOS application for its Personal Computer AI agent feature. The app is now available to all Pro and Max subscribers and replaces the company's previous Mac software.
Anthropic adds 'dreaming' feature to Claude Managed Agents for automated memory refinement
Anthropic has updated Claude Managed Agents with a feature called 'dreaming' that allows agents to automatically review past interactions and refine their memories. The feature, available in research preview, can either automatically update agent memories or let developers approve changes manually.
Google tests Remy AI agent internally, designed to act autonomously across Gemini services
Google is testing Remy, an AI personal agent for Gemini that can take actions on users' behalf across Google services, according to Business Insider. The tool is currently in employee-only testing with no confirmed public release date.
Augment Code launches Cosmos, an operating system for multi-agent software development workflows
Augment Code has released Cosmos into public preview, positioning it as an operating system for agentic software development. The platform coordinates AI agents across the full software development lifecycle with shared memory, multi-model routing through its Prism system (which the company claims saves 20-30% of tokens), and what the company calls specialized agents that learn from team feedback.
Perplexity's Mac-Native 'Personal Computer' Platform Claims $2.8B in Labor-Equivalent Work
Perplexity CEO Aravind Srinivas revealed that the company's Mac-native Personal Computer platform has performed more than $2.8B in labor-equivalent work for Pro, Max, and Enterprise subscribers since launch. The announcement follows Apple CFO Kevan Parekh citing Perplexity as an example of developers building enterprise-grade AI assistants on Mac during Apple's Q2 2026 earnings call.
Meta building personal and business AI agents on top of Muse Spark model
Meta is developing AI agents for personal and business use that will run continuously to help users achieve goals, CEO Mark Zuckerberg said during the company's Q1 2026 earnings call. The agents will build on Meta's newly-released Muse Spark model from Meta Superintelligence Labs.
Amazon launches Quick desktop app with persistent context tracking across Google Workspace, Microsoft 365, Zoom, and Sal
Amazon has released a desktop version of its Quick AI assistant that integrates with Google Workspace, Microsoft 365, Zoom, and Salesforce, storing persistent context about user activities to automate tasks. The company also split Amazon Connect into four vertical-specific products: Connect Decisions, Connect Talent, Connect Health, and Connect Customer AI.
Lovable launches mobile vibe-coding app on iOS and Android after Apple's App Store restrictions
Lovable has launched its vibe-coding app on iOS and Android app stores, allowing users to build web apps through voice or text prompts on mobile devices. The launch comes after Apple blocked updates to competitors like Replit and Vibecode, forcing vibe-coding apps to preview generated code in web browsers rather than within the app itself.
Microsoft makes AI 'Agent Mode' default in Word, Excel, PowerPoint for all Copilot subscribers
Microsoft is rolling out Agent Mode as the default Copilot experience in Word, Excel, and PowerPoint this week for all Microsoft 365 Copilot and Premium subscribers. The feature, previously called 'vibe working,' allows the AI to execute multi-step edits directly in documents with real-time visibility into each action.
Google launches Gemini-powered browser automation for Chrome Enterprise users
Google announced auto browse capabilities for Chrome Enterprise at Google Cloud Next, enabling Gemini to automate web-based tasks like data entry, vendor comparisons, and meeting scheduling. The feature requires manual user confirmation before executing actions and will initially be available to U.S. Workspace users.
Google rebrands Vertex AI as Gemini Enterprise Agent Platform with governance tools for managing agent fleets
Google has rebranded its Vertex AI developer platform as the Gemini Enterprise Agent Platform, introducing tools for building, deploying, governing, and monitoring large-scale AI agent deployments. The platform includes Agent Studio for low-code agent creation, Agent Gateway for security enforcement, and cryptographic identity management for each agent.
Google launches Android CLI for AI agents, claims 70% token reduction and 3x faster tasks
Google has released a preview of Android CLI, a command-line tool designed specifically for AI agents to build Android applications. Google claims the tool reduces token usage by 70 percent and cuts task completion time to one-third compared to traditional methods.
Microsoft developing local AI agent to compete with open-source OpenClaw
Microsoft is testing OpenClaw-like features for Microsoft 365 Copilot aimed at enterprise customers, the company confirmed to The Information. The agent would run continuously to complete multi-step tasks over extended periods, distinguishing it from Microsoft's existing cloud-based agents like Copilot Cowork and Copilot Tasks.
AI agent skills fail in real-world conditions, researchers find testing 34,000 skills
A large-scale study testing 34,198 real-world skills reveals that AI agent performance drops drastically when moving from curated benchmarks to realistic conditions. Claude Opus 4.6 saw pass rates fall from 55.4% with hand-selected skills to 38.4% in truly realistic scenarios, while weaker models like Kimi K2.5 actually performed below their no-skill baseline.
Anthropic exits Claude Cowork research preview with enterprise features, launches Claude Managed Agents beta
Anthropic has promoted Claude Cowork from research preview to general availability, adding six enterprise features including role-based access controls, group spend limits, and usage analytics. The company simultaneously launched Claude Managed Agents in public beta—a composable API suite for building and deploying cloud-hosted agents without custom infrastructure work.
Miro adds AI agents directly to collaborative whiteboards with context awareness
Miro has launched AI Workflows, a system of AI agents that operate directly on collaborative canvases using full visual and spatial context. The feature includes Sidekicks (conversational agents) and Flows (multi-step automated workflows), accessible through the Business + AI Workflows tier at $20 per member per month with 50 AI credits included.
GitHub enables Dependabot to assign security alerts directly to AI coding agents
GitHub has extended Dependabot to allow direct assignment of security alerts to AI coding agents including Copilot, Claude, and Codex. The feature targets vulnerabilities requiring code changes beyond simple version bumps, automating remediation workflows across entire projects.
Cursor 3 rebuilds IDE around parallel AI agent fleets, moves away from classic editor layout
Cursor released version 3 of its AI coding tool with a complete interface redesign built around running multiple AI agents in parallel rather than individual code editing. The new "agent-first" interface allows developers to launch agents from desktop, mobile, web, Slack, GitHub, and Linear, with seamless switching between cloud and local environments.
Claude Code source leak reveals Anthropic working on 'Proactive' mode and autonomous payments
Anthropic's Claude Code version 2.1.88 release accidentally included a source map exposing over 512,000 lines of code and 2,000 TypeScript files. Analysis of the leaked codebase by security researchers reveals evidence of a planned 'Proactive' mode that would execute coding tasks without explicit user prompts, plus potential crypto-based autonomous payment systems.
Anthropic's Claude Code leak exposes Tamagotchi pet and always-on agent features
A source code leak in Anthropic's Claude Code 2.1.88 update exposed more than 512,000 lines of TypeScript, revealing unreleased features including a Tamagotchi-like pet interface and a KAIROS feature for background agent automation. Anthropic confirmed the leak was caused by a packaging error, not a security breach, and has since fixed the issue.
Amazon Bedrock AgentCore Evaluations now generally available for testing AI agents
Amazon Bedrock AgentCore Evaluations, a fully managed service for assessing AI agent performance, is now generally available following its public preview debut at AWS re:Invent 2025. The service addresses the core challenge that LLMs are non-deterministic—the same user query can produce different tool selections and outputs across runs—making traditional single-pass testing inadequate for reliable agent deployment.
GitHub's Copilot team uses AI agents to automate development work
GitHub's Applied Science team deployed coding agents to automate parts of their own development workflow, testing how AI agents can handle increasingly complex programming tasks. The experiment reveals practical insights into agent-driven development patterns and limitations.
Microsoft expands Copilot Cowork with AI model critique feature and cross-model comparison
Microsoft is expanding Copilot Cowork availability and introducing a Critique function that enables one AI model to review another's output. The update also includes a new Researcher agent, which Microsoft claims delivers best-in-class deep research performance and outperforms Perplexity by 7 points, and a Model Council feature for direct model comparison.
Anthropic's unreleased Mythos model enables autonomous large-scale cyberattacks, officials warn
Anthropic is privately warning top government officials that its unreleased model "Mythos" makes large-scale cyberattacks significantly more likely in 2026. The model enables AI agents to operate autonomously with high sophistication to penetrate corporate, government and municipal systems. One official told Axios a large-scale attack could occur this year as employees unknowingly create security vulnerabilities through unsupervised agentic AI use.
OpenAI completes pretraining of 'Spud' model, Altman promises 'very strong' release in weeks
OpenAI has completed pretraining on a new model codenamed 'Spud,' according to an internal memo from CEO Sam Altman reported by The Information. Altman claims the company expects a 'very strong model' within weeks that can 'really accelerate the economy.' To free compute resources, OpenAI will shut down its Sora video generation app.
Anthropic's Claude Code gets auto-execution mode with built-in safety checks
Anthropic has released auto mode for Claude Code in research preview, enabling the AI to execute actions it deems safe without waiting for user approval. The feature uses built-in safeguards to block risky actions and prompt injection attacks, while automatically proceeding with safe operations.
Anthropic's Claude gains computer control in Code and Cowork tools
Anthropic has expanded Claude's autonomous capabilities to its Code and Cowork AI tools, allowing the model to control your Mac's mouse, keyboard, and display to complete tasks without manual intervention. The research preview is available now for Claude Pro and Max subscribers on macOS only, with support for other operating systems coming later.
Anthropic releases Claude computer use feature to compete with OpenClaw
Anthropic announced Monday that Claude can now complete tasks on users' computers, including opening apps, navigating browsers, and filling spreadsheets, after receiving prompts from a smartphone. The feature positions Anthropic directly against OpenClaw, the viral AI agent that went mainstream this year. The capability comes with safeguards requiring Claude to request permission before accessing new applications.
Anthropic enables Claude to control macOS desktop as research preview feature
Anthropic has introduced desktop control capabilities for Claude, allowing the AI to operate macOS, open applications, navigate browsers, and interact with spreadsheets. The feature launches as a research preview in Claude Cowork and Claude Code, currently limited to macOS, and prioritizes existing app integrations before defaulting to direct desktop control.
Anthropic enables Claude to control your Mac as research preview
Anthropic is rolling out computer control capabilities for Claude on macOS, allowing the AI to autonomously handle tasks like file navigation, clicking, and software interactions. The feature launches as a research preview for Claude Pro and Claude Max subscribers, with control available from iPhone via a new Dispatch tool.
Anthropic adds always-on channels to Claude Code, enabling async AI agent capabilities
Anthropic has added "channels" to Claude Code, enabling Claude to respond to incoming messages, webhooks, and notifications asynchronously without user intervention. The research preview supports Telegram and Discord with custom channel support, running through MCP servers with two-way communication.
Google Gemini task automation now works on phones, taking 9 minutes to order dinner
Google has launched task automation for Gemini on Pixel 10 Pro and Galaxy S26 Ultra, allowing the AI to autonomously use apps for food delivery and rideshare services. The feature works but is slow—taking approximately nine minutes to complete an order—and remains limited to a small beta subset of apps. Despite performance limitations, it represents the first practical demonstration of an AI assistant actually controlling a phone outside of controlled demos.
Xiaomi launches MiMo-V2-Pro with 1T parameters, matches Claude Opus on coding at 80% lower cost
Xiaomi shipped three AI models simultaneously designed to form a complete agent platform. MiMo-V2-Pro, a 1-trillion-parameter Mixture-of-Experts model with 42 billion active parameters per request, scores 78% on SWE-bench Verified and 81 points on ClawEval—nearly matching Claude Opus 4.6 while costing $1 per million input tokens versus $5 for Opus.
Google expands Universal Commerce Protocol with cart, catalog, and loyalty features for AI agents
Google has expanded the Universal Commerce Protocol (UCP) with shopping cart, catalog, and identity features designed for AI agents. The new capabilities enable agents to add multiple items to carts, access real-time product data including prices and availability, and preserve shopper loyalty benefits across retailers.
Okta launches agent management platform with discovery, governance, and kill-switch controls
Okta has released Okta for AI Agents, a management platform that addresses three core requirements: discovering deployed agents, monitoring their activities and access permissions, and terminating agent access when needed. The platform integrates with Salesforce, ServiceNow, Google, and AWS to import agents and their metadata, while providing continuous background scanning for unmanaged agents.
Meta's Manus launches desktop app enabling AI agents to access local files and applications
Meta's recently acquired AI startup Manus launched a desktop application enabling its AI agent to directly access local files, tools, and applications on personal computers through a 'My Computer' feature. Previously cloud-only, the move positions Manus to compete with OpenClaw, the open-source AI agent that sparked recent industry momentum. Unlike OpenClaw's free, MIT-licensed offering, Manus operates as a paid subscription service.
Perplexity launches Computer for Enterprise, claims $1.6M labor savings in internal test
Perplexity made Computer for Enterprise generally available to enterprise customers on March 12, claiming an internal study of 16,000+ queries showed $1.6 million in labor cost savings and 3.2 years of equivalent work completed in four weeks. The service integrates with Gmail, Outlook, GitHub, Linear, Slack, Notion, Snowflake, Databricks, and Salesforce, orchestrating tasks across 20 frontier models with agentic internet access.
Meta acquires Moltbook, hires AI agent platform founders for Superintelligence Labs
Meta has acquired Moltbook, a social network designed exclusively for AI agents, and hired its founders Matt Schlicht and Ben Parr to work in Meta's Superintelligence Labs run by former Scale AI CEO Alexandr Wang. The acquisition gives Meta access to Moltbook's technology for verifying agent identities and coordinating complex tasks between AI bots. The move signals Meta's intent to integrate agentic AI capabilities into its platforms, though specific plans remain undisclosed.
Perplexity launches Personal Computer: Mac mini-based AI agent with local app integration
Perplexity announced Personal Computer today, a cloud-based AI agent that runs on a continuously operating Mac mini to merge local applications with Perplexity's AI capabilities. The system operates 24/7, accessible from any device, and maintains integration with users' files and applications through a secure local connection.
Meta acquires Moltbook, social network for AI agents, hires founders into Superintelligence Labs
Meta has acquired Moltbook, a social network designed for AI agents, bringing founders Matt Schlicht and Ben Parr into Meta Superintelligence Labs under former Scale AI CEO Alexandr Wang. The move positions Meta alongside OpenAI's OpenClaw in acquiring AI agent platforms.
Gemini task automation launches in beta on Galaxy S26 Ultra
Google's task automation feature for Gemini is now live in beta on Samsung's Galaxy S26 Ultra, enabling the AI assistant to autonomously complete actions in food delivery and rideshare apps. The system can order rides, food items, and make decisions like warming pastries—stopping before final confirmation for user review.
Perplexity launches Personal Computer AI agent at $200/month for autonomous task handling
Perplexity AI has launched Personal Computer, a paid AI agent service priced at $200 per month that operates autonomously to handle emails, presentations, and application control. The service aims to provide continuous AI assistance for routine digital tasks without human intervention.
Perplexity launches Personal Computer to run AI agent on your Mac 24/7
Perplexity launched Personal Computer, an AI agent tool that runs continuously on a local Mac with full access to your files and applications. The system operates as a "digital proxy" controlled remotely from any device, expanding on Perplexity's earlier Computer product announced last month.
AI agent compromised McKinsey's internal platform in 2 hours using SQL injection
An AI agent deployed by security firm Codewall gained full read and write access to McKinsey's internal AI platform Lilli within two hours without credentials or insider knowledge. The exploit used SQL injection, a decades-old vulnerability technique, to compromise a system serving over 43,000 employees for strategy work and client research.
GitHub Copilot SDK shifts AI from text prompts to executable agent workflows
GitHub has released the Copilot SDK, positioning executable agent workflows as the successor to prompt-based AI interactions. The SDK enables developers to integrate agentic AI capabilities directly into applications rather than relying on text-based prompt-response patterns.
Google deploys Gemini AI agents to US Department of Defense for unclassified work
Google is deploying specialized AI agents built on Gemini to the Department of Defense for use on unclassified projects and operations. The expansion represents a deepening of Google's existing defense partnership and signals potential future deployment to classified government work.
Meta acquires Moltbook, a Reddit-like platform for AI agents to post and interact
Meta has acquired Moltbook, a Reddit-like platform where AI agents autonomously make and comment on posts. The team behind Moltbook, which Matt Schlicht and Ben Parr founded earlier this year, will join Meta Superintelligence Labs. The platform runs on OpenClaw, an open-source AI assistant formerly called Moltbot.
Meta acquires Moltbook, Reddit-style platform for AI agent collaboration
Meta has acquired Moltbook, a platform built as a Reddit-style community space specifically for AI agents. The acquisition signals Meta's expanding focus on infrastructure for agent-to-agent interaction and collaboration.
Nvidia planning open-source AI agent platform ahead of developer conference
Nvidia is preparing to launch an open-source AI agent platform, according to reports ahead of the company's annual developer conference. The move mirrors approaches by competitors like OpenAI in building agent-based AI systems.