
Anthropic's Mythos AI generates working zero-day exploits 72.4% of the time, won't release publicly

TL;DR

Anthropic has developed Mythos, an AI model capable of generating working zero-day exploits with a 72.4% success rate, compared to Claude Opus 4.6's near-zero capability. The company declined public release due to security risks and instead created Project Glasswing, a limited-access program for 40+ organizations including AWS, Apple, Google, and Microsoft to find vulnerabilities in their own systems.



Anthropic has developed an AI model called Mythos that can autonomously identify and exploit zero-day vulnerabilities across major operating systems and web browsers, with exploit development success rates that far exceed those of previous AI models.

The Numbers

Mythos Preview achieves a 72.4% success rate in generating working exploits, a stark contrast to Claude Opus 4.6, which Anthropic said had "just over zero percent" exploit development success. The model can construct complex, multi-stage exploits: in one instance, it chained four separate vulnerabilities to create a web browser exploit with a JIT heap spray that escaped both renderer and OS sandboxes. It has also autonomously developed local privilege escalation exploits on Linux using subtle race conditions and KASLR bypasses, and created remote code execution exploits on FreeBSD's NFS server with ROP chains spanning multiple packets.

During internal testing, Mythos identified "thousands of additional high- and critical-severity vulnerabilities" across systems. The vulnerabilities span decades, including a now-patched 27-year-old bug in OpenBSD, and operators needed no formal security training to find them.

Why It's Not Public

Anthropic explicitly chose not to release Mythos publicly, with researchers stating that doing so "would break the internet – in a bad way." The company released a research post authored by 22 Anthropic researchers on Tuesday detailing the model's capabilities, but access remains restricted.

Instead, Anthropic created Project Glasswing, a limited-access initiative allowing approximately 40 organizations to use Mythos Preview for defensive purposes. The core group includes Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. The program provides up to $100 million in Mythos Preview usage credits for participants and $4 million in direct funding for open-source security projects.

What Mythos Can Do

According to Anthropic's research, the model exhibits broad capability across software vulnerabilities. One case study describes an engineer with no formal security training who asked Mythos to find remote code execution vulnerabilities overnight and found complete, working exploits by morning. The exploits go beyond basic stack-smashing techniques to sophisticated attacks that chain multiple vulnerabilities, escape sandbox protections, and bypass modern security mitigations.

The company is conducting responsible disclosure of the vulnerabilities Mythos identified. This process remains ongoing.

What This Means

Mythos marks an inflection point in AI-assisted security research. The gap between its 72.4% exploit-generation success rate and the near-zero rate of prior AI models suggests AI has crossed a threshold where it can autonomously perform tasks that previously required specialized human expertise. Anthropic's decision to limit access rather than release the model is a pragmatic security stance, similar to how quantum computing capabilities are discussed openly while actual systems remain restricted. The 40-organization preview gives well-resourced companies an asymmetric advantage in patching vulnerabilities before adversaries can use similar models, but it raises the question of whether that advantage can persist as AI capabilities diffuse. The $100 million in usage credits for defensive use underscores Anthropic's acknowledgment that releasing this technology could destabilize critical infrastructure.

