model-safety

4 articles tagged with model-safety

July 1, 2026

Anthropic Restores Claude Fable 5 After Government Takedown, With Stricter Cybersecurity Blocks

Anthropic is redeploying Claude Fable 5 after a month-long government-mandated takedown triggered by Amazon researchers discovering a method to bypass the model's cybersecurity safeguards. The returning version includes enhanced safety classifiers that automatically block cybersecurity tasks and revert to Opus 4.8, with restricted availability through usage credits only.

July 1, 2026 · 5:05 PM

June 19, 2026

model releaseAnthropic

US government forces Anthropic to pull Fable 5 and Mythos 5 models over guardrail bypass concerns

The US government forced Anthropic to withdraw its Fable 5 and Mythos 5 models, citing national security concerns after Amazon researchers allegedly discovered a method to bypass Fable 5's safety guardrails. Cybersecurity researchers have signed an open letter opposing the ban, with Anthropic noting similar vulnerabilities exist in competing models.

June 19, 2026 · 4:20 PM

May 28, 2026

model releaseAnthropic

Anthropic's Opus 4.8 matches Claude Mythos Preview in alignment, cuts thinking mode costs by 67%

Anthropic released Claude Opus 4.8 on May 28, 2026, replacing Opus 4.7 at unchanged pricing. The company claims the model's misalignment rates match those of Claude Mythos Preview, the experimental model deemed too dangerous for public release in April 2026. Opus 4.8 delivers faster thinking modes at one-third the cost of version 4.7.

May 28, 2026 · 9:21 PM

April 16, 2026

model releaseAnthropic

Anthropic releases Claude Opus 4.7 with reduced cyber capabilities ahead of Mythos Preview general release

Anthropic has released Claude Opus 4.7, its most powerful generally available model, though it scores lower than the company's Mythos Preview model on every evaluation. The company intentionally reduced Opus 4.7's cybersecurity capabilities during training as it tests safety measures before releasing more powerful models.

April 16, 2026 · 4:05 PM

← Back to all news