Anthropic withholds Mythos Preview model due to advanced hacking capabilities
Anthropic is rolling out its Mythos Preview model only to a handpicked group of about 40 tech and cybersecurity companies, withholding a public release because of the model's ability to find tens of thousands of vulnerabilities and autonomously write working exploits. In testing, the model found bugs in every major operating system and web browser, including decades-old vulnerabilities that human security researchers had never detected.
Anthropic is limiting Mythos Preview access to approximately 40 organizations over concerns about its ability to find and exploit security vulnerabilities at scale, the company announced Tuesday.
The Threat
Mythos Preview is "extremely autonomous" with sophisticated reasoning capabilities that match advanced security researchers, according to Logan Graham, head of Anthropic's frontier red team.
The model can discover "tens of thousands of vulnerabilities" — substantially exceeding Opus 4.6, Anthropic's last public model, which found approximately 500 zero-days in open-source software. Critically, Mythos Preview not only identifies vulnerabilities but also autonomously writes working exploits.
In testing, Mythos Preview successfully reproduced vulnerabilities and created proof-of-concept exploits on the first attempt in 83.1% of cases. The model discovered bugs in every major operating system and web browser, including vulnerabilities believed to be decades old that evaded repeated human-conducted security tests.
Critical Findings
Specific discoveries demonstrate the severity:
- Linux kernel: Mythos Preview identified multiple flaws and autonomously chained them together, enabling complete compromise of machines running Linux.
- OpenBSD: Mythos Preview found a 27-year-old vulnerability enabling remote denial-of-service attacks against OpenBSD, an operating system widely considered one of the most security-hardened open-source projects and deployed in firewalls, routers, and high-security servers.
Controlled Distribution Strategy
Instead of a public release, Anthropic is deploying Mythos Preview to the roughly 40 selected organizations for defensive security purposes. Participants in Project Glasswing, a new initiative focused on scanning and securing infrastructure, include Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks.
Anthropic is providing up to $100 million in usage credits to participating companies and $4 million to open-source security organizations including OpenSSF, Alpha-Omega, and the Apache Software Foundation.
Competitive Timeline
Graham stated that competing AI companies will likely release models with similar capabilities within 6 to 18 months. OpenAI and other major technology companies are reportedly already developing comparable systems.
"More powerful models are going to come from us and from others, and so we do need a plan to respond to this," Anthropic CEO Dario Amodei said in an accompanying video.
Government Engagement
Anthropic has briefed the Cybersecurity and Infrastructure Security Agency (CISA), the Commerce Department, and other government agencies on Mythos Preview's risks and potential benefits. The company declined to confirm whether it has briefed the Pentagon, with which it has been in a dispute for months.
Future Deployment
Anthropic's stated goal is to eventually enable safe deployment of Mythos-class models at scale, including for general use cases beyond cybersecurity. The company plans to develop new safeguards on less-powerful Opus models to "improve and refine them with a model that does not pose the same level of risk as Mythos Preview."
What This Means
Anthropic's decision reflects a genuine tension between advancing AI capabilities and containing catastrophic security risks. By restricting Mythos Preview while actively briefing government agencies, the company is attempting to shift the security industry's defensive posture before similar capabilities become widely available. The projected 6-to-18-month window before competitors release comparable models, however, gives the industry little time to adapt. The initiative suggests that frontier AI capabilities may require controlled distribution and government coordination rather than open release, setting a potential precedent for future high-risk model deployments.
Related Articles
Anthropic unveils Claude Mythos model, finds thousands of OS vulnerabilities via Project Glasswing
Anthropic has unveiled Claude Mythos, a new AI model designed for cybersecurity that has already discovered thousands of high-severity vulnerabilities in every major operating system and web browser. The model is being distributed as a preview to over 40 organizations and major technology partners including Apple, Google, Microsoft, and Amazon Web Services through Project Glasswing, a coordinated cybersecurity initiative.
Anthropic previews Mythos, claims it found thousands of zero-day vulnerabilities in cybersecurity initiative
Anthropic unveiled a preview of Mythos, a frontier model it claims is the most powerful in its Claude lineup, for use in Project Glasswing—a cybersecurity initiative with 40+ partner organizations. According to Anthropic, Mythos identified thousands of zero-day vulnerabilities, many critical and up to two decades old, during early testing. The model will not be made generally available and is restricted to defensive security work by vetted partners.
Anthropic Python SDK v0.90.0 adds Claude Mythos preview support
Anthropic released version 0.90.0 of its Python SDK on April 7, 2026, adding support for the claude-mythos-preview model. The update also includes a bug fix for query parameter merging in the client.
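As a hedged sketch of what that SDK support implies, a request targeting the new model identifier might be assembled as below. The prompt text, token limit, and overall usage pattern are illustrative assumptions, not details from the release notes; only the model name claude-mythos-preview comes from the announcement.

```python
# Hypothetical sketch: passing the claude-mythos-preview identifier to the
# Anthropic Python SDK's Messages API (v0.90.0+). The prompt and max_tokens
# values are assumptions for illustration. A live call additionally requires
# `pip install anthropic` and an ANTHROPIC_API_KEY in the environment.
request = {
    "model": "claude-mythos-preview",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Audit this function for memory-safety bugs."}
    ],
}

# With the SDK installed, the call would look like:
#   import anthropic
#   client = anthropic.Anthropic()
#   response = client.messages.create(**request)
```

Keeping the request as a plain dict separates the (assumed) parameters from the network call, which is useful when the model is gated behind partner access.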
GLM-5.1 achieves 58.4% on SWE-Bench Pro with sustained agentic reasoning over hundreds of iterations
Zhipu AI has released GLM-5.1, a 754-billion parameter model designed for agentic engineering with significantly improved coding capabilities over its predecessor. The model achieves 58.4% on SWE-Bench Pro and demonstrates sustained performance improvement over hundreds of tool calls and iterations, unlike earlier models that plateau quickly.