safety-classifier
1 article tagged with safety-classifier
March 25, 2026
product update | Anthropic
Anthropic's Claude Code Auto Mode enables automatic execution of safe commands while blocking risky actions
Anthropic has released Auto Mode for Claude Code, a middle-ground safety feature that automatically executes safe local operations while blocking risky actions like external deployments and mass deletions. A Claude Sonnet 4.6 classifier evaluates each command based on conversation context, and the system reverts to manual approval after three consecutive blocks or twenty total blocks. The feature is available as a research preview for Team plan users, with Enterprise and API access expected shortly.
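The fallback rule in the summary (return to manual approval after three consecutive blocks or twenty total blocks) can be illustrated with a small counter. This is only a sketch of the rule as described, not Anthropic's implementation: the `BlockTracker` class, its method names, and the constant names are hypothetical, and it assumes that any allowed command resets the consecutive count while the total keeps accumulating for the session.

```python
from dataclasses import dataclass

# Thresholds taken from the summary above; the names are hypothetical.
MAX_CONSECUTIVE_BLOCKS = 3
MAX_TOTAL_BLOCKS = 20


@dataclass
class BlockTracker:
    """Tracks classifier decisions for one session (illustrative only)."""

    consecutive: int = 0  # blocks in a row since the last allowed command
    total: int = 0        # all blocks seen so far in the session

    def record(self, blocked: bool) -> bool:
        """Record one decision; return True if manual approval should take over."""
        if blocked:
            self.consecutive += 1
            self.total += 1
        else:
            # Assumption: an allowed command ends the consecutive streak.
            self.consecutive = 0
        return self.needs_manual_approval()

    def needs_manual_approval(self) -> bool:
        return (
            self.consecutive >= MAX_CONSECUTIVE_BLOCKS
            or self.total >= MAX_TOTAL_BLOCKS
        )
```

For example, two blocks followed by an allowed command and then two more blocks would not trigger the fallback under this sketch, but a third block in a row would.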