Anthropic's Claude Mythos can find zero-day exploits faster than defenders can patch them
Anthropic announced Claude Mythos Preview, a new frontier model with advanced reasoning capabilities that can identify and chain together multiple vulnerabilities into novel attacks—abilities the company says outpace current defensive capabilities. The model has already discovered thousands of high-severity vulnerabilities including a 27-year-old OpenBSD flaw and exploits for multiple operating systems. To manage the risk, Anthropic launched Project Glasswing, granting early access to 40+ companies including Apple, Google, Microsoft, and Cisco, providing $100M in usage credits for defensive security work.
Anthropic announced Claude Mythos Preview on Tuesday, two weeks after accidentally leaking its existence. The company explicitly states that the frontier AI model poses serious new cybersecurity risks that outpace current defensive capabilities.
The model represents a significant capability leap. According to Anthropic's announcement, Mythos achieves 93.9% on SWE-bench Verified, a 13-percentage-point improvement over Claude Opus 4.6's 80.8% score. This performance gain comes directly from improvements in reasoning—the same general capability improvements every AI lab is pursuing—rather than specialized cyber training.
What Mythos Can Already Do
Mythos has identified thousands of high-severity vulnerabilities across major operating systems and web browsers, including:
- A vulnerability in OpenBSD that evaded detection for 27 years
- A flaw in the FFmpeg video encoder that survived 5 million automated tests
- Multiple Linux kernel vulnerabilities that could enable complete machine compromise
Crucially, the model can chain together separate vulnerabilities into novel attacks—a capability current models lack. Combined with AI systems' growing ability to operate without human supervision for extended periods, researchers say this represents an inflection point in cybersecurity risk.
Project Glasswing: Controlled Deployment
Rather than releasing Mythos broadly, Anthropic launched Project Glasswing, a coalition of 40+ companies including Apple, Google, Microsoft, Cisco, and Broadcom. Participants receive $100 million in model usage credits to scan and patch vulnerabilities in their own systems and critical open-source infrastructure. Anthropic is also donating $4 million to open-source security initiatives.
Alex Stamos, chief product officer at Corridor and former security lead at Facebook and Yahoo, called Glasswing "a big deal, and really necessary." He warned that "open-weight models will catch up to foundation models in bug finding within six months, at which point every ransomware actor will be able to find and weaponize bugs without leaving traces."
Cisco's chief security officer Anthony Grieco stated: "AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure from cyber threats, and there is no going back."
The Uncomfortable Premise
Glasswing rests on a deeply uncomfortable foundation: the only way to defend against dangerous AI capabilities is for a safety-focused lab to build them first. This centralizes both power and risk. Anthropic now possesses zero-day exploits for virtually all major software systems—a capability whose theft would represent a severe security threat.
The timing is awkward. The Trump administration is resisting AI regulation, and the US government previously attempted to declare Anthropic a supply chain risk after the company refused to include mass domestic surveillance and autonomous weapons in Pentagon contracts. Anthropic briefed senior government officials, including CISA, before the announcement, but it remains unclear how interested the government is in collaborating.
Timeline Uncertainty
Stamos offered two scenarios: an optimistic timeline where superhuman capabilities find a finite, patchable set of flaws, and a pessimistic one where each model release discovers new classes of vulnerabilities we "never even imagined." The actual outcome remains unknowable.
What This Means
Mythos validates Anthropic's founding thesis—that a safety-focused lab building frontier models could discover dangerous capabilities first and lead mitigation efforts. However, it also demonstrates that unregulated AI development can create genuine national security risks when capability gains in general reasoning translate directly into cybersecurity threats. The race between Anthropic's responsible disclosure through Glasswing and the inevitable proliferation of these capabilities to open-weight models may determine whether critical infrastructure survives the next two years of AI progress intact.