
Anthropic withholds Claude Mythos after finding thousands of OS vulnerabilities

TL;DR

Anthropic has announced Project Glasswing, restricting its new frontier model Claude Mythos Preview to defensive cybersecurity purposes through a coalition of 11 partners including AWS, Apple, Google, and Microsoft. The model has autonomously discovered thousands of high-severity vulnerabilities in major operating systems and web browsers, including a 27-year-old bug in OpenBSD and a 16-year-old vulnerability in FFmpeg, and it scores 83.1% on the CyberGym benchmark for reproducing known vulnerabilities.

April 8, 2026 · 1:20 PM · 3 min read

Claude Mythos Preview — Quick Specs

Input: $25/1M tokens

Output: $125/1M tokens



Anthropic has restricted its new frontier model, Claude Mythos Preview, exclusively to defensive cybersecurity use, citing thousands of high-severity vulnerabilities the model has autonomously discovered in operating systems and web browsers. It is the industry's first major withholding decision since OpenAI's GPT-2 announcement in 2019, and this time the decision is backed by concrete evidence.

Project Glasswing: A Controlled Deployment

The initiative, called Project Glasswing, deploys Claude Mythos Preview through a carefully curated coalition of 11 organizations: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks.

Anthropic is committing up to $100 million in usage credits and donating $4 million directly to open-source security organizations: $2.5 million to Alpha-Omega and OpenSSF through the Linux Foundation, and $1.5 million to the Apache Software Foundation. Over 40 additional organizations will receive access to scan critical software infrastructure.

After the initial credits expire, Mythos Preview will be available to authorized partners at $25 per million input tokens and $125 per million output tokens.
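As a rough illustration of what those rates mean in practice, the sketch below estimates the charge for a single workload. Only the two per-million-token rates come from the article; the function name `estimate_cost_usd` and the token counts in the example are hypothetical, and real billing may differ (for example through rounding or discounts not described here).

```python
from decimal import Decimal

# Rates quoted in the article, in US dollars per one million tokens.
INPUT_USD_PER_MTOK = Decimal("25")
OUTPUT_USD_PER_MTOK = Decimal("125")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the charge in USD for one workload (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_USD_PER_MTOK * input_tokens / TOKENS_PER_MTOK
    output_cost = OUTPUT_USD_PER_MTOK * output_tokens / TOKENS_PER_MTOK
    return input_cost + output_cost


# Made-up workload: 2M tokens of source code read, 200k tokens of findings written.
# 2.0 * $25 + 0.2 * $125 = $50 + $25 = $75
print(estimate_cost_usd(2_000_000, 200_000))
```

Because output tokens cost five times as much as input tokens, a scan that reads a lot of code but writes short reports is dominated by the input side, while long generated write-ups shift the balance quickly.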

Proof: Decades-Old Vulnerabilities Found Autonomously

Unlike OpenAI's 2019 GPT-2 decision—which critics dismissed as a PR stunt—Anthropic is backing its restriction with concrete vulnerability discoveries:

OpenBSD TCP SACK Bug: Mythos Preview discovered a 27-year-old vulnerability in the TCP SACK implementation that allowed a remote attacker to crash any OpenBSD machine over a network connection. The bug arose from a subtle combination of missing validation and integer overflow.

FFmpeg H.264 Vulnerability: A 16-year-old vulnerability in the codec remained undetected despite automated testing hitting the affected code line five million times.

FreeBSD NFS Server (CVE-2026-4747): A 17-year-old vulnerability that Mythos Preview not only discovered but independently exploited.
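The OpenBSD item above attributes the bug to "missing validation and integer overflow" without further detail. The sketch below is a generic illustration of how that class of bug can arise with 32-bit TCP sequence numbers; it is not a reconstruction of the actual OpenBSD code or of the vulnerability Mythos Preview found. All function names and values are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # TCP sequence numbers are 32-bit and wrap around


def seq_lt(a: int, b: int) -> bool:
    """Modular 'a comes before b' test on 32-bit sequence numbers.

    Emulates the common idiom of taking (a - b) as a signed 32-bit
    integer and checking whether it is negative.
    """
    diff = (a - b) & MASK32
    return diff >= 0x80000000  # top bit set: signed difference is negative


def accepts_validated(start: int, end: int, snd_una: int, snd_max: int) -> bool:
    """Accept a block only if snd_una <= start < end <= snd_max."""
    return (seq_lt(start, end)
            and not seq_lt(start, snd_una)
            and not seq_lt(snd_max, end))


def accepts_unvalidated(start: int, end: int, snd_max: int) -> bool:
    """Same check with the lower bound on `start` left out (the bug pattern)."""
    return seq_lt(start, end) and not seq_lt(snd_max, end)


# Hypothetical connection state and a hostile block far behind the window.
snd_una, snd_max = 1000, 5000
start = (snd_una - 0x70000000) & MASK32
end = 4000

print(accepts_unvalidated(start, end, snd_max))         # accepted
print(accepts_validated(start, end, snd_una, snd_max))  # rejected
print(hex((end - start) & MASK32))                      # implied block length
```

The unvalidated check accepts a block whose implied length is roughly 1.9 billion bytes, which later code would likely not expect. The comparison itself also stops being meaningful once two values are more than 2^31 apart, which is where the overflow half of such a bug typically comes from.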

Exploit Capability: The Real Concern

Mythos Preview's distinguishing feature isn't just vulnerability discovery—it's reliable exploitation. On a Firefox vulnerability benchmark with 147 known bugs, Mythos Preview achieved working exploits 181 times, compared to just 2 times for its predecessor Claude Opus 4.6.

On the CyberGym benchmark measuring reliable vulnerability reproduction in open-source software, Mythos Preview scored 83.1% versus 66.6% for Opus 4.6. In internal tests against 1,000 open-source projects, Mythos Preview achieved full control-flow hijack on ten fully patched targets—Opus 4.6 succeeded exactly once.

Broader Capability Improvements

Beyond cybersecurity, Mythos Preview shows significant advances:

SWE-bench Verified (software engineering): 93.9% vs. Opus 4.6's 80.8%
GPQA Diamond (graduate-level science): 94.6% vs. 91.3%
USAMO 2026 (US Mathematical Olympiad): 97.6% vs. 42.3%
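Raw percentage-point gaps can understate progress on benchmarks that were already near saturation. As one possible way to read the figures above, the sketch below computes both the point gain and the share of remaining headroom (100 minus the older score) that the newer model closed. The scores are copied from the list above; the helper names and the headroom metric are this sketch's own choices, not something the article or Anthropic reports.

```python
# Scores in percent, copied from the list above: (Mythos Preview, Opus 4.6).
SCORES = {
    "SWE-bench Verified": (93.9, 80.8),
    "GPQA Diamond": (94.6, 91.3),
    "USAMO 2026": (97.6, 42.3),
}


def point_gain(new: float, old: float) -> float:
    """Absolute improvement in percentage points."""
    return round(new - old, 1)


def headroom_closed(new: float, old: float) -> float:
    """Percent of the gap between the old score and 100 that was closed."""
    return round(100 * (new - old) / (100 - old), 1)


for name, (mythos, opus) in SCORES.items():
    print(f"{name}: +{point_gain(mythos, opus)} points, "
          f"{headroom_closed(mythos, opus)}% of headroom closed")
```

On this reading, the 3.3-point GPQA Diamond gain closes more than a third of what was left, while the USAMO jump closes nearly all of it.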

Staged Deployment Strategy

Anthropic plans a deliberate rollout: it will first introduce the necessary safeguards through Claude Opus 4.6, a less risky model, and refine them there before making Mythos-class models broadly available. Security professionals affected by the restrictions can apply to the "Cyber Verification Program."

This approach echoes Jack Clark's 2019 testimony before Congress about staged releases—but inverted. Clark, who managed GPT-2's release and later co-founded Anthropic, outlined the principle that the research community could collectively manage risks. This time, Anthropic is betting that concentrating the most dangerous capabilities within a security-vetted coalition is the responsible approach.

What This Means

For the first time, an AI company is pointing to demonstrated capability to justify a restriction, rather than restricting a model as a precaution against hypothetical risk. Anthropic has concrete evidence that its model poses genuine cybersecurity risks at scale. The coalition structure and financial commitments suggest this is an operational necessity, not performative safety theater.

This sets a new precedent: future frontier models with autonomous exploitation capabilities may face similar restrictions. The question for the industry is whether this model of controlled access around high-risk capabilities becomes standard practice or remains a one-off exception.

Source: the-decoder.com ↗


