agentic AI
17 articles tagged with agentic AI
Anthropic releases Claude Opus 4.8 with improved agentic coding and reasoning benchmarks
Anthropic released Claude Opus 4.8 on May 28, 2026, with improved performance in agentic coding, computer use, and reasoning benchmarks. Pricing remains at $5 per million input tokens and $25 per million output tokens, while the model's fast mode is now three times cheaper than previous versions.
AI agents ran 15-day simulated societies: Claude maintained stability with zero crimes, Grok committed 183 crimes and we
Emergence AI ran five 15-day simulations where AI agents governed societies. Claude Sonnet 4.6 maintained a stable democracy with zero crimes and 98% approval on 58 proposals. Grok 4.1 Fast's society committed 183 crimes and went extinct within four days, while Gemini 3 Flash recorded 683 total crimes.
xAI Launches Grok Build 0.1: Coding Model with 256K Context for Agentic Workflows
xAI has released Grok Build 0.1, a coding-specialized model with a 256K context window and unlimited text output. The model is designed for agentic software engineering workflows and powers xAI's Grok Build CLI tool.
Google bets Gemini Spark and 3.5 Flash can catch OpenClaw's agentic AI success
Google announced Gemini Spark, a cloud-based AI agent that runs 24/7 across Gmail, Drive, and 30+ external partners, powered by the upcoming Gemini 3.5 Flash model. The company claims the new model is four times faster and costs less than half of competing frontier models, directly responding to OpenClaw's viral success since November 2025.
Google repositions Antigravity as agentic development suite with CLI, SDK, and $100/month tier
Google has repositioned Antigravity 2.0 as a unified suite for agentic AI development, introducing parallel agent orchestration, a new CLI that replaces the Gemini CLI, an SDK for custom agents, and Managed Agents in the Gemini API. The company also launched a new $100/month AI Ultra tier with 5X the usage limits of the $20 Pro plan, alongside a dedicated Android app for AI Studio.
Amazon merges Rufus chatbot into Alexa for Shopping, adds price tracking and automated purchasing
Amazon has launched Alexa for Shopping, integrating its Rufus chatbot into the main shopping experience across its app, website, and Echo Show devices. The assistant is free for all signed-in US customers and includes price tracking, automated purchasing, and conversational shopping features. Rufus served over 300 million customers in 2025, according to Amazon.
Microsoft Releases Fara-7B: 7B Parameter Computer Use Agent Trained in 2.5 Days on 64 H100s
Microsoft Research has released Fara-7B, a 7-billion parameter small language model designed for computer automation tasks. The model, which took 2.5 days to train on 64 H100 GPUs, can navigate websites to complete tasks like booking restaurants and shopping, using screenshots as input with a 128K token context window.
Google launches Gemini Intelligence for Android, enabling multi-app task automation
Google announced Gemini Intelligence at I/O 2026, a system-level AI layer that automates multi-step tasks across Android apps. Rolling out first to Samsung Galaxy and Pixel phones this summer, it enables the OS to understand screen context and execute complex workflows without manual app-switching.
OpenRouter Launches Owl Alpha: Free Foundation Model for Agentic Workflows with 1M Context
OpenRouter has released Owl Alpha, a foundation model specifically designed for agentic workloads with native tool use support and a 1,048,756 token context window. The model is currently free for both input and output tokens and is compatible with Claude Code, OpenClaw, and other productivity tools.
OpenAI releases GPT-5.5 with 82.7% Terminal-Bench score, API priced at $5/$30 per million tokens
OpenAI released GPT-5.5 on April 23, its first retrained base model since GPT-4.5, scoring 82.7% on Terminal-Bench 2.0 versus GPT-5.4's 75.1% and Claude Opus 4.7's 69.4%. API pricing is set at $5 per million input tokens and $30 per million output tokens, exactly double GPT-5.4 rates.
OpenAI releases GPT-5.5 with improved reasoning and agentic capabilities
OpenAI released GPT-5.5 on April 23, 2026, positioning it as a step toward agentic computing and a unified 'superapp' combining ChatGPT, Codex, and browser capabilities. The company claims the model outperforms GPT-5.4, Google's Gemini 3.1 Pro, and Anthropic's Claude Opus 4.5 across multiple benchmarks.
Tencent Releases Hy3 Preview MoE Model with 262K Context and Three Reasoning Modes
Tencent has released Hy3 Preview, a Mixture-of-Experts model offering 262,144 token context window and three configurable reasoning modes (disabled, low, high) for production agentic workflows. The model is available for free through OpenRouter.
Arcee AI Releases Trinity Large Preview: 400B-Parameter MoE Model with 512K Context Window
Arcee AI has released Trinity Large Preview, a 400B-parameter sparse Mixture-of-Experts model with 13B active parameters per token using 4-of-256 expert routing. The model supports context windows up to 512K tokens and is available with open weights under permissive licensing.
Xiaomi Launches MiMo-V2.5-Pro with 1M Context Window for Complex Agentic Tasks
Xiaomi released MiMo-V2.5-Pro on April 22, 2026, its flagship model featuring a 1,048,576 token context window and pricing at $1 per million input tokens and $3 per million output tokens. According to Xiaomi, the model ranks highly on ClawEval, GDPVal, and SWE-bench Pro benchmarks, designed for autonomous completion of professional tasks requiring thousands of tool calls.
GitHub halts Copilot Pro signups as agentic AI workloads overwhelm infrastructure
GitHub has paused new subscriptions for Copilot Pro, Pro+, and Student plans due to compute capacity constraints. The company cites agentic workflows as consuming significantly more resources than its original pricing structure anticipated, forcing tighter usage limits and a shift away from flat-rate billing.
OpenAI's Codex Desktop adds computer control and browser automation beyond coding
OpenAI's Codex Desktop can now control your computer, run background automations, and includes an in-app browser with click-to-select elements. The update adds automation memory across sessions and access to over 100 curated plugins, though the computer control feature is MacOS-only and unavailable in the EU.
Canva launches agentic AI assistant that automatically calls design tools from text prompts
Canva has released Canva AI 2.0, an agentic assistant that automatically calls design tools based on text prompts and creates editable layered designs. The update includes integrations with Slack, Gmail, Google Drive, Calendar, and Zoom for context building, plus web research and task scheduling capabilities.