safety-classifier

1 article tagged with safety-classifier

March 25, 2026

Product update · Anthropic

Anthropic's Claude Code Auto Mode enables automatic execution of safe commands while blocking risky actions

Anthropic has released Auto Mode for Claude Code, a middle-ground safety feature that automatically executes safe local operations while blocking risky actions like external deployments and mass deletions. A Claude Sonnet 4.6 classifier evaluates each command based on conversation context, and the system reverts to manual approval after three consecutive blocks or twenty total blocks. The feature is available as a research preview for Team plan users, with Enterprise and API access expected shortly.
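The fallback rule in the summary (three consecutive blocks or twenty total blocks) can be illustrated with a small counter. This is only a sketch under stated assumptions, not Anthropic's implementation: the `BlockTracker` class and its method names are hypothetical, and the article does not say whether an allowed command resets the consecutive count or whether the fallback is permanent for the session. The sketch assumes an allowed command resets the streak.

```python
from dataclasses import dataclass


@dataclass
class BlockTracker:
    """Hypothetical counter for the fallback rule described in the summary."""

    max_consecutive: int = 3  # from the article: three consecutive blocks
    max_total: int = 20       # from the article: twenty total blocks
    consecutive: int = 0
    total: int = 0

    def record(self, blocked: bool) -> None:
        """Record one classifier decision (True means the command was blocked)."""
        if blocked:
            self.consecutive += 1
            self.total += 1
        else:
            # Assumption: an allowed command ends the current streak of blocks.
            self.consecutive = 0

    @property
    def manual_approval_required(self) -> bool:
        """True once either threshold from the article has been reached."""
        return (
            self.consecutive >= self.max_consecutive
            or self.total >= self.max_total
        )


tracker = BlockTracker()
for blocked in (True, True, True):
    tracker.record(blocked)
print(tracker.manual_approval_required)
```

The two thresholds are independent: a session with occasional blocks spread over many commands never hits the consecutive limit but can still reach the total limit.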

March 25, 2026 · 10:05 AM
