ai-agents
50 articles tagged with ai-agents
GitHub details Qubot, internal Copilot-powered data analytics agent for plain language queries
GitHub has released technical details on Qubot, an internal analytics agent powered by GitHub Copilot that enables employees to query company data using natural language. The agent represents GitHub's implementation of AI-assisted data analysis for internal operations.
Canva adds Perplexity Computer connector for autonomous design asset creation
Canva launched a connector for Perplexity Computer that enables the AI agent platform to autonomously create editable design assets based on user prompts and data. The integration is available for Perplexity Pro, Max, Enterprise Pro, and Enterprise Max subscribers.
Google's Gemini Spark AI agent uses personal data to plan trips, raising privacy concerns
Google's Gemini Spark, an AI agent rolling out to the company's $99/month AI Ultra plan, demonstrates advanced capabilities by mining users' Gmail, calendar, photos, and location data to create detailed trip itineraries. The agent can perform actions across apps and operate computers, though third-party services like Airbnb currently block its booking attempts.
AWS adds Policy Engine and Lambda interceptors to Bedrock AgentCore gateway for agent security controls
Amazon Web Services launched Policy Engine and Lambda interceptors for Bedrock AgentCore gateway, enabling enterprises to control which tools AI agents can access and validate requests dynamically. The Policy Engine uses Cedar declarative policy language for deterministic access decisions, while Lambda interceptors run custom code before or after each tool call for validation, token exchange, and response filtering.
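The pattern described above (a deterministic access decision, then custom code before and after each tool call) can be sketched in plain Python. This is an illustration only, not the AgentCore API: the `Gateway` class, the hook functions, and the `lookup_order` tool are hypothetical names, and a simple allow-list stands in for a Cedar policy.

```python
from typing import Any, Callable, Dict, List, Set

Request = Dict[str, Any]
Response = Dict[str, Any]


class PolicyDenied(Exception):
    """Raised when the allow-list rejects a tool call."""


class Gateway:
    """Hypothetical tool gateway: policy check, then before/after hooks."""

    def __init__(self, tools: Dict[str, Callable[[Dict[str, Any]], Response]],
                 allowed: Dict[str, Set[str]]) -> None:
        self.tools = tools
        self.allowed = allowed  # agent id -> tool names it may call
        self.before: List[Callable[[Request], Request]] = []
        self.after: List[Callable[[Request, Response], Response]] = []

    def call(self, agent: str, tool: str, args: Dict[str, Any]) -> Response:
        # Deterministic access decision: same inputs always give the same answer.
        if tool not in self.allowed.get(agent, set()):
            raise PolicyDenied(f"{agent} may not call {tool}")
        request: Request = {"agent": agent, "tool": tool, "args": dict(args)}
        for hook in self.before:  # e.g. validation, token exchange
            request = hook(request)
        response = self.tools[tool](request["args"])
        for hook in self.after:  # e.g. response filtering
            response = hook(request, response)
        return response


def require_order_id(request: Request) -> Request:
    """Before-hook: reject calls that lack an order id."""
    if not request["args"].get("order_id"):
        raise ValueError("order_id is required")
    return request


def strip_email(request: Request, response: Response) -> Response:
    """After-hook: drop a sensitive field before the agent sees it."""
    return {k: v for k, v in response.items() if k != "email"}


def lookup_order(args: Dict[str, Any]) -> Response:
    """Stand-in tool returning canned data."""
    return {"order_id": args["order_id"], "status": "shipped",
            "email": "a@example.com"}


gateway = Gateway(tools={"lookup_order": lookup_order},
                  allowed={"support-bot": {"lookup_order"}})
gateway.before.append(require_order_id)
gateway.after.append(strip_email)

result = gateway.call("support-bot", "lookup_order", {"order_id": "A1"})
print(result)  # {'order_id': 'A1', 'status': 'shipped'}
```

Keeping the access decision separate from the hooks means the allow-list can be audited on its own, while the hooks carry the logic that needs arbitrary code.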
Mistral AI launches Le Chat Enterprise with new Medium 3 model, enterprise search and agent builders
Mistral AI has launched Le Chat Enterprise, powered by its new Mistral Medium 3 model. The platform includes enterprise search across Google Drive, SharePoint, OneDrive, Gmail and Google Calendar, no-code agent builders, custom data connectors, and hybrid deployment options including self-hosted and cloud.
Google announces Spark AI agent, Information agents, and Android Halo at I/O 2026—all paywalled behind $100/month Ultra
Google announced multiple AI agent products at I/O 2026, including Spark for managing digital tasks, Information agents for 24/7 topic monitoring, and Android Halo for notifications. All features remain paywalled behind the $100/month Gemini Ultra plan, with free access timeline unspecified.
Google cuts AI Ultra plan to $200/month, launches new $100 developer tier
Google announced pricing changes to its Gemini AI subscription tiers at I/O 2026, cutting its top AI Ultra plan from $250 to $200 per month while introducing a new $100/month developer-focused tier. All plans now get access to Gemini 3.5 Flash and the new Gemini Omni video generation model.
Google launches Universal Cart, an AI agent that shops across multiple retailers in one checkout
Google announced Universal Cart at its I/O developer conference, an AI-powered shopping system that consolidates purchases from multiple retailers including Target, Shopify, Wayfair, and Etsy into a single checkout. The feature uses Gemini's agentic AI to verify product compatibility, suggest better deals, and automate routine purchases.
Google releases Gemini 3.5 Flash globally, claims it's 'strongest agentic and coding model yet'
Google released Gemini 3.5 Flash at its I/O 2026 conference, immediately deploying it as the default model in AI Mode and the Gemini app globally. The company claims it delivers improved agentic capabilities and coding performance at less than half the cost of competing frontier models. Gemini 3.5 Pro is scheduled for release in June.
Google launches Universal Cart with Gemini integration for multi-retailer shopping across Search, Gmail, YouTube
Google announced Universal Cart, a Gemini-powered shopping tool that aggregates products from multiple retailers across Search, Gmail, YouTube, and Gemini. The system tracks price changes, identifies product incompatibilities, and enables single-checkout purchases across retailers including Walmart, Target, Nike, Shopify, and Wayfair.
Google Search adds AI agents, generative UI, and conversational search box powered by Gemini 3.5 Flash
Google announced major Search updates at I/O 2026, including AI Mode now powered by Gemini 3.5 Flash serving over 1 billion monthly users. The company is launching background information agents that monitor the web 24/7 and generate custom mini-apps, both features reserved for Google AI Pro and Ultra subscribers.
Google Search deploys Gemini 3.5 Flash for AI-generated interfaces, agent-based information gathering
Google announced a fundamental restructuring of Search, replacing traditional ranked links with AI-generated interfaces powered by Gemini 3.5 Flash. The update introduces information agents that monitor the web 24/7 and custom mini-apps built through natural language, with the generative UI rolling out free to all users this summer.
Google releases Gemini 3.5 Flash at half the price of frontier models, announces Omni world model
Google released Gemini 3.5 Flash, priced at one-third to one-half the cost of comparable frontier models, and announced it will become the default model in the Gemini app globally. The company also unveiled Omni, a world model for simulating physical environments, and Gemini Spark, an AI agent in beta testing.
Google names upcoming Gemini AI agent 'Spark,' adds autonomous task execution to mobile app
Google is preparing to launch Gemini Spark, an autonomous AI agent that will operate within the Gemini mobile app. According to code found in Google app beta version 17.23, Spark can access connected apps, personal data, and location to execute tasks like managing inboxes and scheduling meetings, though Google warns it may occasionally act without permission.
Notion launches Developer Platform with custom code execution, agent orchestration, and database sync
Notion has launched a Developer Platform that allows teams to run custom code in cloud-based Workers, sync external databases, and orchestrate both internal and external AI agents. The platform, free through August, supports integration with Claude Code, Cursor, Codex, and Decagon, and uses Model Context Protocol for agent connectivity.
Google embeds Gemini across Android as multimodal agent before Apple's WWDC AI reveal
Google is rolling out Gemini-powered features across Android that enable the AI to understand screen context and complete multi-step tasks across apps, marking a shift from traditional assistant interactions to agentic capabilities. The updates will launch on Samsung Galaxy and Google Pixel phones this summer before expanding to watches, cars, and laptops.
Xcode 26.5 adds message queuing and clarifying questions for AI coding assistants
Apple released Xcode 26.5 with two new Coding Intelligence features: the ability to queue multiple messages to AI coding assistants without waiting for responses, and agent support for asking clarifying questions before executing tasks. The update builds on agentic coding capabilities introduced in Xcode 26.3, which allowed developers to integrate tools like OpenAI Codex and Anthropic's Claude directly into the IDE.
Perplexity opens Personal Computer local AI agent to all Mac users after month-long waitlist
Perplexity has opened access to Personal Computer, its local AI agent software for Mac, to all users after a month-long limited release to paid subscribers. The software runs agents locally on Mac devices with access to files, native apps, and over 400 connectors, positioning itself as a safer alternative to OpenClaw.
Perplexity launches native Mac app for Personal Computer AI agent, available to Pro and Max subscribers
Perplexity AI has released a native macOS application for its Personal Computer AI agent feature. The app is now available to all Pro and Max subscribers and replaces the company's previous Mac software.
Anthropic adds 'dreaming' feature to Claude Managed Agents for automated memory refinement
Anthropic has updated Claude Managed Agents with a feature called 'dreaming' that allows agents to automatically review past interactions and refine their memories. The feature, available in research preview, can either automatically update agent memories or let developers approve changes manually.
Google tests Remy AI agent internally, designed to act autonomously across Gemini services
Google is testing Remy, an AI personal agent for Gemini that can take actions on users' behalf across Google services, according to Business Insider. The tool is currently in employee-only testing with no confirmed public release date.
Augment Code launches Cosmos, an operating system for multi-agent software development workflows
Augment Code has released Cosmos into public preview, positioning it as an operating system for agentic software development. The platform coordinates AI agents across the full software development lifecycle with shared memory, multi-model routing through its Prism system, which the company claims saves 20-30% of tokens, and what it calls specialized agents that learn from team feedback.
Perplexity's Mac-Native 'Personal Computer' Platform Claims $2.8B in Labor-Equivalent Work
Perplexity CEO Aravind Srinivas revealed that the company's Mac-native Personal Computer platform has performed more than $2.8B in labor-equivalent work for Pro, Max, and Enterprise subscribers since launch. The announcement follows Apple CFO Kevan Parekh citing Perplexity as an example of developers building enterprise-grade AI assistants on Mac during Apple's Q2 2026 earnings call.
Meta building personal and business AI agents on top of Muse Spark model
Meta is developing AI agents for personal and business use that will run continuously to help users achieve goals, CEO Mark Zuckerberg said during the company's Q1 2026 earnings call. The agents will build on Meta's newly-released Muse Spark model from Meta Superintelligence Labs.
Amazon launches Quick desktop app with persistent context tracking across Google Workspace, Microsoft 365, Zoom, and Sal
Amazon has released a desktop version of its Quick AI assistant that integrates with Google Workspace, Microsoft 365, Zoom, and Salesforce, storing persistent context about user activities to automate tasks. The company also split Amazon Connect into four vertical-specific products: Connect Decisions, Connect Talent, Connect Health, and Connect Customer AI.
Lovable launches mobile vibe-coding app on iOS and Android after Apple's App Store restrictions
Lovable has launched its vibe-coding app on iOS and Android app stores, allowing users to build web apps through voice or text prompts on mobile devices. The launch comes after Apple blocked updates to competitors like Replit and Vibecode, forcing vibe-coding apps to preview generated code in web browsers rather than within the app itself.
Microsoft makes AI 'Agent Mode' default in Word, Excel, PowerPoint for all Copilot subscribers
Microsoft is rolling out Agent Mode as the default Copilot experience in Word, Excel, and PowerPoint this week for all Microsoft 365 Copilot and Premium subscribers. The feature, previously called 'vibe working,' allows the AI to execute multi-step edits directly in documents with real-time visibility into each action.
Google launches Gemini-powered browser automation for Chrome Enterprise users
Google announced auto browse capabilities for Chrome Enterprise at Google Cloud Next, enabling Gemini to automate web-based tasks like data entry, vendor comparisons, and meeting scheduling. The feature requires manual user confirmation before executing actions and will initially be available to U.S. Workspace users.
Google rebrands Vertex AI as Gemini Enterprise Agent Platform with governance tools for managing agent fleets
Google has rebranded its Vertex AI developer platform as the Gemini Enterprise Agent Platform, introducing tools for building, deploying, governing, and monitoring large-scale AI agent deployments. The platform includes Agent Studio for low-code agent creation, Agent Gateway for security enforcement, and cryptographic identity management for each agent.
Google launches Android CLI for AI agents, claims 70% token reduction and 3x faster tasks
Google has released a preview of Android CLI, a command-line tool designed specifically for AI agents to build Android applications. Google claims the tool reduces token usage by 70 percent and cuts task completion time to one-third compared to traditional methods.
Microsoft developing local AI agent to compete with open-source OpenClaw
Microsoft is testing OpenClaw-like features for Microsoft 365 Copilot aimed at enterprise customers, the company confirmed to The Information. The agent would run continuously to complete multi-step tasks over extended periods, distinguishing it from Microsoft's existing cloud-based agents like Copilot Cowork and Copilot Tasks.
AI agent skills fail in real-world conditions, researchers find testing 34,000 skills
A large-scale study testing 34,198 real-world skills reveals that AI agent performance drops drastically when moving from curated benchmarks to realistic conditions. Claude Opus 4.6 saw pass rates fall from 55.4% with hand-selected skills to 38.4% in truly realistic scenarios, while weaker models like Kimi K2.5 actually performed below their no-skill baseline.
Anthropic exits Claude Cowork research preview with enterprise features, launches Claude Managed Agents beta
Anthropic has promoted Claude Cowork from research preview to general availability, adding six enterprise features including role-based access controls, group spend limits, and usage analytics. The company simultaneously launched Claude Managed Agents in public beta—a composable API suite for building and deploying cloud-hosted agents without custom infrastructure work.
Miro adds AI agents directly to collaborative whiteboards with context awareness
Miro has launched AI Workflows, a system of AI agents that operate directly on collaborative canvases using full visual and spatial context. The feature includes Sidekicks (conversational agents) and Flows (multi-step automated workflows), accessible through the Business + AI Workflows tier at $20 per member per month with 50 AI credits included.
GitHub enables Dependabot to assign security alerts directly to AI coding agents
GitHub has extended Dependabot to allow direct assignment of security alerts to AI coding agents including Copilot, Claude, and Codex. The feature targets vulnerabilities requiring code changes beyond simple version bumps, automating remediation workflows across entire projects.
Cursor 3 rebuilds IDE around parallel AI agent fleets, moves away from classic editor layout
Cursor released version 3 of its AI coding tool with a complete interface redesign built around running multiple AI agents in parallel rather than individual code editing. The new "agent-first" interface allows developers to launch agents from desktop, mobile, web, Slack, GitHub, and Linear, with seamless switching between cloud and local environments.
Claude Code source leak reveals Anthropic working on 'Proactive' mode and autonomous payments
Anthropic's Claude Code version 2.1.88 release accidentally included a source map exposing over 512,000 lines of code and 2,000 TypeScript files. Analysis of the leaked codebase by security researchers reveals evidence of a planned 'Proactive' mode that would execute coding tasks without explicit user prompts, plus potential crypto-based autonomous payment systems.
Anthropic's Claude Code leak exposes Tamagotchi pet and always-on agent features
A source code leak in Anthropic's Claude Code 2.1.88 update exposed more than 512,000 lines of TypeScript, revealing unreleased features including a Tamagotchi-like pet interface and a KAIROS feature for background agent automation. Anthropic confirmed the leak was caused by a packaging error, not a security breach, and has since fixed the issue.
Amazon Bedrock AgentCore Evaluations now generally available for testing AI agents
Amazon Bedrock AgentCore Evaluations, a fully managed service for assessing AI agent performance, is now generally available following its public preview debut at AWS re:Invent 2025. The service addresses the core challenge that LLMs are non-deterministic—the same user query can produce different tool selections and outputs across runs—making traditional single-pass testing inadequate for reliable agent deployment.
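To illustrate why a single pass is a weak signal for a non-deterministic agent, here is a minimal sketch that runs the same query many times and reports a pass rate. It does not use the AgentCore Evaluations service; `pass_rate`, `flaky_agent`, the tool names, and the 80% figure are all assumptions made for the example.

```python
import random
from typing import Callable

# An "agent" here is any function that picks a tool name for a query.
Agent = Callable[[str, random.Random], str]


def pass_rate(agent: Agent, query: str, expected_tool: str,
              runs: int, seed: int = 0) -> float:
    """Run the same query `runs` times and return the fraction of passes."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    rng = random.Random(seed)  # seeded so the evaluation itself is repeatable
    hits = sum(1 for _ in range(runs) if agent(query, rng) == expected_tool)
    return hits / runs


def flaky_agent(query: str, rng: random.Random) -> str:
    """Stand-in agent that picks the right tool about 80% of the time."""
    return "search_orders" if rng.random() < 0.8 else "search_users"


query = "where is order A1?"
single = pass_rate(flaky_agent, query, "search_orders", runs=1)
many = pass_rate(flaky_agent, query, "search_orders", runs=200)
print(f"single run: {single:.2f}, 200 runs: {many:.2f}")
```

A single run can only report 0.0 or 1.0, so it cannot distinguish an agent that succeeds 80% of the time from one that always succeeds; the repeated-run estimate can.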
GitHub's Copilot team uses AI agents to automate development work
GitHub's Applied Science team deployed coding agents to automate parts of their own development workflow, testing how AI agents can handle increasingly complex programming tasks. The experiment reveals practical insights into agent-driven development patterns and limitations.
Microsoft expands Copilot Cowork with AI model critique feature and cross-model comparison
Microsoft is expanding Copilot Cowork availability and introducing a Critique function that enables one AI model to review another's output. The update also includes a new Researcher agent claiming best-in-class deep research performance, outperforming Perplexity by 7 points, and a Model Council feature for direct model comparison.
Anthropic's unreleased Mythos model enables autonomous large-scale cyberattacks, officials warn
Anthropic is privately warning top government officials that its unreleased model "Mythos" makes large-scale cyberattacks significantly more likely in 2026. The model enables AI agents to operate autonomously with high sophistication to penetrate corporate, government and municipal systems. One official told Axios a large-scale attack could occur this year as employees unknowingly create security vulnerabilities through unsupervised agentic AI use.
OpenAI completes pretraining of 'Spud' model, Altman promises 'very strong' release in weeks
OpenAI has completed pretraining on a new model codenamed 'Spud,' according to an internal memo from CEO Sam Altman reported by The Information. Altman claims the company expects a 'very strong model' within weeks that can 'really accelerate the economy.' To free compute resources, OpenAI will shut down its Sora video generation app.
Anthropic's Claude Code gets auto-execution mode with built-in safety checks
Anthropic has released auto mode for Claude Code in research preview, enabling the AI to execute actions it deems safe without waiting for user approval. The feature uses built-in safeguards to block risky actions and prompt injection attacks, while automatically proceeding with safe operations.
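The general idea of proceeding with safe actions and blocking risky ones can be sketched as a simple gate. This is not Anthropic's implementation, whose safeguards are not described here; the allow-list, `decide`, and `run_actions` are hypothetical, and exact string matching is a deliberately crude stand-in for a real safety check.

```python
from enum import Enum
from typing import Callable, List, Tuple


class Decision(Enum):
    PROCEED = "proceed"
    BLOCK = "block"


# Hypothetical allow-list; anything not listed is treated as risky.
SAFE_COMMANDS = {"ls", "git status", "git diff"}


def decide(command: str) -> Decision:
    """Allow only exact matches against the allow-list (whitespace-normalized)."""
    normalized = " ".join(command.split())
    return Decision.PROCEED if normalized in SAFE_COMMANDS else Decision.BLOCK


def run_actions(commands: List[str],
                execute: Callable[[str], None]) -> Tuple[List[str], List[str]]:
    """Execute safe commands immediately and collect the blocked ones."""
    executed: List[str] = []
    blocked: List[str] = []
    for command in commands:
        if decide(command) is Decision.PROCEED:
            execute(command)
            executed.append(command)
        else:
            blocked.append(command)
    return executed, blocked


log: List[str] = []  # stands in for a real executor
executed, blocked = run_actions(["git status", "rm -rf build", "ls"], log.append)
print(executed, blocked)  # ['git status', 'ls'] ['rm -rf build']
```

Defaulting to block for anything unrecognized keeps the gate conservative: a new or unusual command never runs just because nobody thought to list it as risky.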
Anthropic's Claude gains computer control in Code and Cowork tools
Anthropic has expanded Claude's autonomous capabilities to its Code and Cowork AI tools, allowing the model to control your Mac's mouse, keyboard, and display to complete tasks without manual intervention. The research preview is available now for Claude Pro and Max subscribers on macOS only, with support for other operating systems coming later.
Anthropic releases Claude computer use feature to compete with OpenClaw
Anthropic announced Monday that Claude can now complete tasks on users' computers, including opening apps, navigating browsers, and filling spreadsheets, after receiving prompts from a smartphone. The feature positions Anthropic directly against OpenClaw, the viral AI agent that went mainstream this year. The capability comes with safeguards requiring Claude to request permission before accessing new applications.
Anthropic enables Claude to control macOS desktop as research preview feature
Anthropic has introduced desktop control capabilities for Claude, allowing the AI to operate macOS, open applications, navigate browsers, and interact with spreadsheets. The feature launches as a research preview in Claude Cowork and Claude Code, currently limited to macOS, and prioritizes existing app integrations before defaulting to direct desktop control.
Anthropic enables Claude to control your Mac as research preview
Anthropic is rolling out computer control capabilities for Claude on macOS, allowing the AI to autonomously handle tasks like file navigation, clicking, and software interactions. The feature launches as a research preview for Claude Pro and Claude Max subscribers, with control available from iPhone via a new Dispatch tool.
Anthropic adds always-on channels to Claude Code, enabling async AI agent capabilities
Anthropic has added "channels" to Claude Code, enabling Claude to respond to incoming messages, webhooks, and notifications asynchronously without user intervention. The research preview supports Telegram and Discord with custom channel support, running through MCP servers with two-way communication.
Google Gemini task automation now works on phones, taking 9 minutes to order dinner
Google has launched task automation for Gemini on Pixel 10 Pro and Galaxy S26 Ultra, allowing the AI to autonomously use apps for food delivery and rideshare services. The feature works but is slow—taking approximately nine minutes to complete an order—and remains limited to a small beta subset of apps. Despite performance limitations, it represents the first practical demonstration of an AI assistant actually controlling a phone outside of controlled demos.