product update

Anthropic launches Claude Code 'auto mode' with AI-powered permission classifier

TL;DR

Anthropic has released 'auto mode' for Claude Code, a permissions system that sits between conservative defaults and fully disabled safeguards. The feature uses a classifier to automatically approve safe actions like file writes and bash commands while blocking potentially destructive operations.



Anthropic rolled out "auto mode" for Claude Code on March 24, 2026, introducing a new permissions framework that balances developer convenience against safety risks.

The feature addresses a usability problem: Claude Code's default configuration requires explicit user approval before executing each file write or bash command. Developers seeking faster execution have historically disabled all permissions using the --dangerously-skip-permissions flag, creating significant security exposure.

Auto mode introduces a middle path using a machine learning classifier that pre-screens each tool invocation before execution. The classifier identifies potentially destructive actions—including mass file deletion, sensitive data exfiltration, and malicious code patterns—and blocks them automatically. Actions deemed safe proceed without user interruption. If Claude repeatedly attempts blocked actions, the system escalates to a user permission prompt.
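The flow described above — classify each tool call, execute safe ones, block destructive ones, and escalate to the user on repeated blocked attempts — can be sketched as a simple gating loop. This is an illustrative toy, not Anthropic's implementation: the function names, the keyword-based stand-in classifier, and the escalation threshold are all hypothetical.

```python
from enum import Enum

class Verdict(Enum):
    SAFE = "safe"
    DESTRUCTIVE = "destructive"

# Hypothetical threshold: after this many blocked retries of the same
# action, escalate to a manual user prompt instead of silently blocking.
BLOCK_ESCALATION_THRESHOLD = 3

def classify_action(tool_name: str, tool_input: str) -> Verdict:
    """Toy stand-in for the ML classifier: flag obviously destructive commands."""
    destructive_markers = ("rm -rf", "mkfs", "> /dev/sda", "DROP TABLE")
    if tool_name == "bash" and any(m in tool_input for m in destructive_markers):
        return Verdict.DESTRUCTIVE
    return Verdict.SAFE

def gate_tool_call(tool_name, tool_input, blocked_counts, ask_user):
    """Decide whether a tool invocation runs, is blocked, or goes to the user."""
    if classify_action(tool_name, tool_input) is Verdict.SAFE:
        return "execute"  # safe actions proceed without interrupting the user
    key = (tool_name, tool_input)
    blocked_counts[key] = blocked_counts.get(key, 0) + 1
    if blocked_counts[key] >= BLOCK_ESCALATION_THRESHOLD:
        # Repeated attempts at a blocked action escalate to an explicit
        # user decision rather than another silent block.
        return "execute" if ask_user(tool_name, tool_input) else "block"
    return "block"
```

In this sketch the hard barrier is the classifier verdict, while the escalation path ultimately defers to the human — which mirrors the trade-off the article discusses below.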

Anthropic explicitly notes that auto mode reduces risk compared to fully disabled permissions but does not eliminate it entirely. The company recommends using auto mode exclusively in isolated development environments.

Rollout Timeline

Claude Teams users gained access to auto mode as a research preview on March 24. Enterprise and API customers will receive access within days, according to Anthropic's announcement.

This update follows Anthropic's unveiling of a separate research preview feature that enables Claude to control macOS directly—another capability gated behind safety controls.

What This Means

Auto mode addresses a genuine friction point in AI-assisted development: the trade-off between safety guardrails and operational efficiency. By delegating routine safety checks to an ML classifier, Anthropic reduces manual approval overhead while retaining the ability to catch genuinely dangerous operations. However, the classifier introduces new attack surface of its own: adversarial prompts could probe and exploit its classification boundaries. The escalation to manual permission prompts when Claude repeatedly attempts blocked actions suggests the system relies partly on shaping Claude's behavior rather than on hard technical barriers, which may be bypassable. Enterprise adoption will likely depend on how well the classifier generalizes to production codebases with domain-specific patterns.

Related Articles

product update

Anthropic adds always-on channels to Claude Code, enabling async AI agent capabilities

Anthropic has added "channels" to Claude Code, enabling Claude to respond to incoming messages, webhooks, and notifications asynchronously without user intervention. The research preview supports Telegram and Discord with custom channel support, running through MCP servers with two-way communication.

product update

Anthropic's Claude Code gets auto-execution mode with built-in safety checks

Anthropic has released auto mode for Claude Code in research preview, enabling the AI to execute actions it deems safe without waiting for user approval. The feature uses built-in safeguards to block risky actions and prompt injection attacks, while automatically proceeding with safe operations.

product update

OpenAI releases open-source teen safety prompts for developers

OpenAI is releasing a set of open-source prompts developers can use to make their applications safer for teens. The policies, designed to work with OpenAI's gpt-oss-safeguard model, address graphic violence, sexual content, harmful body ideals, dangerous activities, and age-restricted goods.

product update

Anthropic's Claude gains computer control in Code and Cowork tools

Anthropic has expanded Claude's autonomous capabilities to its Code and Cowork AI tools, allowing the model to control your Mac's mouse, keyboard, and display to complete tasks without manual intervention. The research preview is available now for Claude Pro and Max subscribers on macOS only, with support for other operating systems coming later.
