product updateAnthropic

Anthropic's Claude Mythos cybersecurity model accessed by unauthorized users for two weeks

TL;DR

Anthropic's Claude Mythos Preview, a cybersecurity AI model restricted to select companies including Nvidia, Google, and Microsoft, was accessed by unauthorized users starting April 7, 2025. The group obtained access through a third-party contractor and internet sleuthing techniques, according to Bloomberg.

2 min read
0

Anthropic's Claude Mythos cybersecurity model accessed by unauthorized users for two weeks

Anthropic's Claude Mythos Preview, a restricted AI model designed to identify and exploit security vulnerabilities, has been accessed by unauthorized users for approximately two weeks, according to Bloomberg. The company is investigating the breach, which occurred through a third-party vendor environment.

How the breach occurred

The unauthorized access began on April 7, 2025—the same day Anthropic announced Mythos would be released to a limited number of companies for testing. Members of a private Discord forum obtained access through a combination of tactics, including leveraging a third-party contractor's credentials and using publicly available information.

The group used data from a recent Mercor breach to make "an educated guess" about the model's online location based on knowledge of Anthropic's other model formats. Bloomberg reports that members provided screenshots and a live demonstration of the working model.

About Claude Mythos Preview

Claude Mythos Preview is described by Anthropic as a general-purpose model capable of identifying and exploiting vulnerabilities "in every major operating system and every major web browser when directed by a user to do so." Official access is limited to select companies through the Project Glasswing initiative, including Nvidia, Google, Amazon Web Services, Apple, and Microsoft. Multiple governments are also evaluating the technology.

Anthropic has stated it has no plans to release the model publicly due to concerns about weaponization.

Company response

"We're investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments," an Anthropic spokesperson told Bloomberg. The company claims it currently has no evidence that the breach extends beyond the third-party vendor's environment or is impacting Anthropic's own systems.

According to Bloomberg, the unauthorized users have been using Mythos regularly since gaining access, though reportedly avoiding cybersecurity-related queries to evade detection. The group has also accessed other unreleased Anthropic models.

What this means

This breach highlights the persistent challenge of restricting access to powerful AI models, even when companies implement strict access controls. The incident occurred through a third-party contractor—a common vulnerability in enterprise security—and demonstrates that determined actors can exploit indirect access points to restricted systems. The fact that the group avoided using the model's core cybersecurity capabilities suggests they understood detection risks, but their ability to maintain access for two weeks raises questions about monitoring and access controls at AI companies deploying high-risk models. This incident may influence how Anthropic and other AI labs structure access to sensitive models going forward.

Comments

Loading...