Anthropic's Claude Code Auto Mode enables automatic execution of safe commands while blocking risky actions
Anthropic has released Auto Mode for Claude Code, a middle-ground safety feature that automatically executes safe local operations while blocking risky actions like external deployments and mass deletions. A Claude Sonnet 4.6 classifier evaluates each command based on conversation context, and the system reverts to manual approval after three consecutive blocks or twenty total blocks. The feature is available as a research preview for Team plan users, with Enterprise and API access expected shortly.
Anthropic's Claude Code Auto Mode Balances Developer Workflow Against Safety Risks
Anthropic has introduced Auto Mode for Claude Code, a new safety feature designed to address a longstanding friction point: developers have had to choose between approving every action manually or disabling all safety checks entirely.
Claude Code executes shell commands, deletes files, creates directories, and pushes commits to GitHub. By default, it requires manual approval before potentially risky actions, which protects against damage but severely disrupts workflow. Many developers resort to the "--dangerously-skip-permissions" flag, which removes all safety checks and can lead to "dangerous and destructive outcomes," according to Anthropic.
How Auto Mode Works
Auto Mode introduces a classifier running on Claude Sonnet 4.6 that evaluates every command before execution. The classifier distinguishes between safe and risky operations based on conversation context:
Automatically executed:
- Local file operations within the working directory
- Installing pre-declared dependencies
- Read-only HTTP requests
Blocked by default:
- Downloading and executing external scripts
- Sending sensitive data to external endpoints
- Production deployments
- Mass deletions on cloud storage
- Force pushes to repositories
When the classifier blocks an action, Claude attempts to find an alternative approach. If blocking occurs three times consecutively or twenty times total during a session, the system switches back to manual approval mode.
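The fallback thresholds can be sketched as a simple counter. This is a hypothetical illustration of the logic Anthropic describes, including the assumption that an allowed action resets the consecutive-block streak:

```python
class ApprovalTracker:
    """Sketch of Auto Mode's fallback: revert to manual approval after
    3 consecutive blocks or 20 total blocks in a session. The class and
    its reset-on-success behavior are assumptions, not Anthropic's code."""

    def __init__(self, max_consecutive: int = 3, max_total: int = 20):
        self.max_consecutive = max_consecutive
        self.max_total = max_total
        self.consecutive = 0
        self.total = 0
        self.manual_mode = False

    def record(self, blocked: bool) -> None:
        if blocked:
            self.consecutive += 1
            self.total += 1
        else:
            # An allowed action breaks the streak of consecutive blocks.
            self.consecutive = 0
        if (self.consecutive >= self.max_consecutive
                or self.total >= self.max_total):
            self.manual_mode = True
```

Three blocks in a row trip the switch immediately; scattered blocks accumulate toward the session-wide limit of twenty.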
Anthropic deliberately designed the classifier to not see tool results from executed commands. This prevents malicious content in files or web pages from manipulating the classifier's decision-making.
Acknowledging Residual Risk
Anthropic emphasizes that Auto Mode reduces risk but does not eliminate it. The classifier can incorrectly allow risky actions when context is ambiguous, or unnecessarily block harmless operations. The company continues to recommend running Claude Code in sandboxed environments for additional protection.
Availability and Rollout
Auto Mode is currently available as a research preview for Claude Code Team plan users, compatible with both Sonnet 4.6 and Opus 4.6 models. Enterprise and API access are expected to follow in the coming days.
What this means
Auto Mode addresses a genuine usability problem in AI-assisted development tools: the binary choice between friction and risk. By introducing context-aware automation with safety guardrails and fallback mechanisms, Anthropic offers developers a more practical workflow without sacrificing oversight entirely. However, the residual risk and requirement for continued sandboxing indicate this is not a complete solution—it's an incremental improvement that shifts the security model rather than fundamentally solving the underlying tension between autonomy and safety.
Related Articles
Anthropic launches 'safer' auto mode for Claude Code to prevent unintended autonomous actions
Anthropic has launched an auto mode for Claude Code that blocks potentially dangerous autonomous actions before execution. The feature, now available as a research preview for Team plan users, acts as a middle ground between constant user oversight and unrestricted agent autonomy.
Anthropic's Claude Code gets auto-execution mode with built-in safety checks
Anthropic has released auto mode for Claude Code in research preview, enabling the AI to execute actions it deems safe without waiting for user approval. The feature uses built-in safeguards to block risky actions and prompt injection attacks, while automatically proceeding with safe operations.
Anthropic launches Claude Code 'auto mode' with AI-powered permission classifier
Anthropic has released 'auto mode' for Claude Code, a permissions system that sits between conservative defaults and fully disabled safeguards. The feature uses a classifier to automatically approve safe actions like file writes and bash commands while blocking potentially destructive operations.
Anthropic's Claude gains computer control in Code and Cowork tools
Anthropic has expanded Claude's autonomous capabilities to its Code and Cowork AI tools, allowing the model to control your Mac's mouse, keyboard, and display to complete tasks without manual intervention. The research preview is available now for Claude Pro and Max subscribers on macOS only, with support for other operating systems coming later.