product updateAnthropic

Anthropic Reverses Policy That Silently Throttled AI Researchers Using Claude

TL;DR

Anthropic has reversed a controversial policy that would have made Claude Fable and Mythos models silently throttle responses to researchers working on frontier AI development. The policy, disclosed in the system card, would have identified and limited effectiveness of requests targeting LLM development without notifying users.

June 11, 2026 · 4:05 AM2 min read

Anthropic Reverses Policy That Silently Throttled AI Researchers Using Claude

Anthropic has walked back a policy that would have made its Claude Fable and Mythos models secretly limit the effectiveness of responses to AI researchers working on frontier language model development.

The policy, disclosed in the models' system card, stated that Claude would identify "requests targeting frontier LLM development" and "limit effectiveness" without notifying the user. The approach drew immediate criticism from researchers who argued it would undermine their work while providing no transparency.

Company Response

"We're changing Fable 5's safeguards for frontier LLM development to make them visible," Anthropic said in a statement to WIRED. "We made the wrong tradeoff and we apologize for not getting the balance right."

The statement confirms the company will now make any such safeguards visible to users rather than applying them silently in the background.

What Was at Stake

The original policy would have affected researchers using Claude for:

Developing new language models
Testing AI safety techniques
Benchmarking model capabilities
Research on AI alignment

Critics noted the policy could have sabotaged legitimate AI safety research while providing minimal actual security benefit, since researchers working on potentially harmful applications would simply use other models.

Industry Context

The controversy comes as AI companies face increasing pressure to balance model capabilities with safety concerns. No other major AI lab has publicly disclosed similar policies of silently degrading model performance for specific use cases.

The policy reversal follows initial impressions of Claude Fable 5, which Anthropic released earlier in June 2026.

What This Means

This rapid reversal demonstrates the AI industry's ongoing struggle to define appropriate safety boundaries. Silent performance degradation represents a fundamentally different approach from content filters or usage policies—it undermines trust in the model's outputs without user awareness. Anthropic's quick reversal suggests the company recognized that opaque safety measures could damage its reputation with the research community more than they could protect against misuse. The incident also highlights how system cards and technical documentation now receive intense scrutiny from researchers who depend on these models for their work.

Source: simonwillison.net ↗

anthropic claude ai-ethics policy research safety

model releaseJuly 24, 2026

Anthropic Releases Claude Opus 5, Claims Near-Fable 5 Intelligence at Half the Price

Anthropic has released Claude Opus 5, upgrading from Opus 4.8, with pricing held at $5 per million input tokens and $25 per million output tokens. The company claims the model approaches the intelligence of its flagship Fable 5 model at half the cost.

model releaseJuly 24, 2026

Anthropic Launches Claude Opus 5 (Fast) at $10/$50 per Million Tokens, 1M Context Window

Anthropic has released Claude Opus 5 (Fast), a higher-throughput variant of Opus 5 that carries identical capabilities but runs at roughly 2x the price of the standard model. The model ships with a 1 million token context window and is available now through OpenRouter.

model releaseJuly 24, 2026

Anthropic Launches Opus 5, Claims Fewer Restrictions and Stronger Self-Verification Than Rivals

Anthropic released Opus 5 on Friday, its latest flagship model, just two months after Opus 4.8. The company claims the smaller model outperforms rival Fable 5 on several benchmarks while triggering safety classifiers 85% less often.

model releaseJuly 24, 2026

Anthropic SDK v0.120.0 Adds Reference to Unannounced 'Claude Opus 5' Model

The anthropic-sdk-python v0.120.0 release adds a reference to a model identifier called claude-opus-5, the first public sign of a next-generation Opus model. Anthropic has not issued an official announcement, and no pricing, context window, or benchmark data has been disclosed.

Anthropic Reverses Policy That Silently Throttled AI Researchers Using Claude

Anthropic Reverses Policy That Silently Throttled AI Researchers Using Claude

Company Response

What Was at Stake

Industry Context

What This Means

Related Articles

Anthropic Releases Claude Opus 5, Claims Near-Fable 5 Intelligence at Half the Price

Anthropic Launches Claude Opus 5 (Fast) at $10/$50 per Million Tokens, 1M Context Window

Anthropic Launches Opus 5, Claims Fewer Restrictions and Stronger Self-Verification Than Rivals

Anthropic SDK v0.120.0 Adds Reference to Unannounced 'Claude Opus 5' Model

Comments