Anthropic's Claude Code gets auto-execution mode with built-in safety checks
Anthropic has released auto mode for Claude Code in research preview, enabling the AI to execute actions it deems safe without waiting for user approval. The feature uses built-in safeguards to block risky actions and prompt injection attacks, while automatically proceeding with safe operations.
Anthropic has introduced auto mode for Claude Code, a research preview feature that shifts decision-making authority from users to the AI itself—but with safety guardrails built in.
The feature addresses a friction point in current "vibe coding" workflows: developers must either babysit every action Claude takes or disable all oversight entirely. Auto mode attempts to find middle ground by letting Claude automatically execute actions it determines are safe, while flagging and blocking risky operations.
How It Works
Auto mode uses AI-powered safety checks before executing each action. The system screens for two primary threats:
- Risky behavior — actions the user didn't explicitly request
- Prompt injection attacks — malicious instructions hidden in content that could cause unintended AI behavior
Actions that pass these checks proceed automatically; those flagged as risky are blocked. The feature builds on Claude Code's existing "--dangerously-skip-permissions" flag, which bypasses permission prompts entirely, by adding a safety layer on top.
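To make the flow concrete, here is a minimal, hypothetical sketch of where such a pre-execution gate sits in the loop. Anthropic has not published how its safety layer works and describes the checks as AI-powered, so the function names, keyword heuristics, and thresholds below are illustrative assumptions rather than the actual implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"   # action proceeds without asking the user
    BLOCK = "block"   # action is stopped and surfaced to the user


@dataclass
class ProposedAction:
    tool: str          # e.g. "bash" or "edit_file"
    arguments: str     # raw arguments the agent wants to run
    user_request: str  # the task the user originally asked for


def looks_like_unrequested_behavior(action: ProposedAction) -> bool:
    """Hypothetical check: flag actions that stray from what the user asked for."""
    risky_markers = ("rm -rf", "curl | sh", "DROP TABLE")
    return any(marker in action.arguments for marker in risky_markers)


def looks_like_prompt_injection(fetched_content: str) -> bool:
    """Hypothetical check: flag instructions hidden in external content."""
    suspicious = ("ignore previous instructions", "send your api key")
    return any(phrase in fetched_content.lower() for phrase in suspicious)


def auto_mode_gate(action: ProposedAction, fetched_content: str = "") -> Verdict:
    """Screen each action before execution; allow safe ones, block the rest."""
    if looks_like_unrequested_behavior(action):
        return Verdict.BLOCK
    if fetched_content and looks_like_prompt_injection(fetched_content):
        return Verdict.BLOCK
    return Verdict.ALLOW
```

In the real feature the screening is done by AI models rather than keyword lists; the sketch only illustrates the decision point that replaces the manual approval prompt.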
Availability and Limitations
Auto mode is rolling out to Enterprise and API users in the coming days. Anthropic currently limits the feature to Claude Sonnet 4.6 and Opus 4.6. The company strongly recommends using it only in "isolated environments"—sandboxed setups kept separate from production systems to minimize potential damage if safety checks fail.
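What counts as an "isolated environment" is left to the user. One possible interpretation, sketched below, is a disposable Docker container that exposes only the project directory to the agent. The image name and the "claude" command are placeholders, not an Anthropic-provided setup.

```python
import subprocess
from pathlib import Path


def run_sandboxed_session(project_dir: str, image: str = "agent-sandbox") -> None:
    """Launch an agent session in a throwaway container.

    Only the project directory is mounted, so a failed safety check cannot
    touch the rest of the host filesystem. The image name and the "claude"
    command are placeholders; a real setup also needs network egress to the
    model API (for example via an allow-listed proxy), which is omitted here.
    """
    workdir = Path(project_dir).resolve()
    subprocess.run(
        [
            "docker", "run", "--rm", "-it",
            "-v", f"{workdir}:/workspace",  # expose only the project, nothing else
            "-w", "/workspace",
            image,                          # placeholder image with the CLI preinstalled
            "claude",                       # placeholder command to start the agent
        ],
        check=True,
    )
```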
Anthropic has not disclosed the specific criteria its safety layer uses to distinguish safe actions from risky ones, a detail developers will likely want clarified before adopting the feature widely.
Broader Context
Auto mode reflects an industry-wide shift toward agentic AI tools that execute tasks without constant human intervention. Competitors including GitHub and OpenAI have launched autonomous coding tools with similar capabilities. Anthropic's distinguishing element is delegating the permission decision itself to the AI, rather than requiring human approval gates.
The launch follows Anthropic's recent releases of Claude Code Review (automatic bug detection) and Dispatch for Cowork (task delegation to AI agents).
What This Means
Auto mode represents Anthropic's bet that AI systems can safely self-govern when properly constrained. The feature trades some user control for developer velocity, a calculation that works only if the safety checks are genuinely reliable. The research preview designation and the recommended sandbox-only use suggest Anthropic expects iterative refinement. For enterprise users seeking faster coding workflows, auto mode reduces friction; for those prioritizing maximum oversight, it remains optional. The undefined safety criteria, however, leave a significant transparency gap that could slow adoption until Anthropic provides more technical detail.
Related Articles
Anthropic's Mythos model finds thousands of high-severity bugs in Firefox, including 15-year-old vulnerabilities
Mozilla's Firefox team reports that Anthropic's Mythos model has discovered thousands of high-severity security vulnerabilities, including bugs that had remained undetected for more than 15 years. In April 2026, Firefox shipped 423 bug fixes compared to just 31 in April 2025, marking a 13x increase attributed to AI-assisted vulnerability detection.
Anthropic adds dreaming, outcomes, and multiagent orchestration to Claude Managed Agents
Anthropic has released three new capabilities for Claude Managed Agents: dreaming (research preview) for pattern recognition and self-improvement, outcomes for defining success criteria with automated evaluation, and multiagent orchestration for delegating tasks to specialist agents.
Anthropic Doubles Claude Code Rate Limits, Secures 300+ MW Compute from SpaceX's Colossus 1
Anthropic has secured access to all compute capacity at SpaceX's Colossus 1 data center, adding more than 300 megawatts of new capacity within the month. As a result, the company is doubling five-hour rate limits for paid Claude Code users and removing peak hour restrictions for Pro and Max tiers.
Anthropic doubles Claude Code usage limits for paid users, increases API capacity by up to 1500%
Anthropic has doubled Claude Code's five-hour usage limits for Pro, Max, Team, and Enterprise users while removing peak hour restrictions for Pro and Max plans. The company also increased API limits by up to 1500% for input tokens per minute through a compute capacity deal with SpaceX's Colossus 1 data center.