TL;DR

The UK's AI Safety Institute reports Claude Mythos Preview achieved a 73% success rate on expert-level capture-the-flag cybersecurity challenges and became the first AI model to complete a full 32-step simulated corporate network takeover, succeeding in 3 out of 10 attempts. The testing occurred in environments without active security monitoring or defenders.

Claude Mythos achieves 73% success rate on expert-level hacking challenges, completes full network takeover in 3 of 10 attempts

The UK's AI Safety Institute (AISI) reports that Anthropic's Claude Mythos Preview achieved a 73% success rate on expert-level cybersecurity challenges and became the first AI model to autonomously complete a full simulated corporate network attack from start to finish.

Benchmark performance

In capture-the-flag (CTF) evaluations run with a budget of 50 million tokens per task, Mythos Preview scored:

  • 85% on apprentice-level tasks
  • 95% on beginner-level tasks
  • 93% on practitioner-level tasks
  • 73% on expert-level challenges

According to AISI, no AI model could solve expert-level CTF tasks before April 2025. The institute places Mythos Preview in the top tier alongside GPT-5.4, Codex 5.3, and Claude Opus 4.6 for beginner-level performance.

Full network takeover simulation

AISI developed "The Last Ones" (TLO), a 32-step simulated attack on a corporate network that the institute estimates would take human experts about 20 hours to complete. With a budget of 100 million tokens, Mythos Preview:

  • Completed all 32 steps in 3 out of 10 attempts
  • Averaged 22 of 32 steps across all attempts
  • Outperformed Claude Opus 4.6, which averaged 16 steps

The model is the first to complete this end-to-end attack simulation, according to AISI.
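
Ten attempts is a small sample, so the 3-in-10 completion rate carries wide uncertainty. As a rough illustration only (this calculation is not part of AISI's report), the sketch below turns the reported TLO figures into proportions and computes a 95% Wilson score interval for the completion rate. The function and variable names are hypothetical, and the interval assumes the ten attempts were independent.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half


# Figures as reported in the article; names are illustrative assumptions.
TOTAL_STEPS = 32
full_runs, attempts = 3, 10

completion_rate = full_runs / attempts            # share of attempts finishing all steps
low, high = wilson_interval(full_runs, attempts)  # uncertainty on that share
mythos_share = 22 / TOTAL_STEPS                   # average progress, Mythos Preview
opus_share = 16 / TOTAL_STEPS                     # average progress, Claude Opus 4.6

print(f"Full completion: {completion_rate:.0%} (95% interval {low:.0%} to {high:.0%})")
print(f"Average progress: Mythos Preview {mythos_share:.0%}, Opus 4.6 {opus_share:.0%}")
```

Under these assumptions the interval runs from roughly 11% to 60%, which is why the headline figure is best read as "sometimes succeeds" rather than as a precise success rate.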

Significant limitations in testing

AISI notes critical caveats: the test environments contained no active defenders, no security monitoring tools, and no consequences for actions that would trigger alarms on real networks. "There's no way to tell whether Mythos Preview could successfully breach a well-defended system" based on these results alone, the institute states.

The model also failed to complete a separate AISI simulation targeting industrial control systems, stalling in the IT network during earlier stages.

AISI concludes the model can "autonomously attack small, weakly defended and vulnerable enterprise systems where access to a network has been gained." Future evaluations will include hardened environments with active monitoring and incident response.

Limited availability and controversy

Anthropic launched Claude Mythos in early April but currently limits access to approximately 50 companies, reportedly due to cybersecurity concerns. Critics compare this to OpenAI's 2019 decision to restrict GPT-2, arguing the performance gains don't justify such limited access. Some speculate the restrictions are primarily for marketing purposes or due to compute capacity constraints.

What this means

This is the first documented case of an AI model autonomously completing a full multi-stage network attack simulation. The 73% expert-level CTF score and the ability to chain an average of 22 attack steps show a measurable advance in AI cyber capabilities over 2025 models.

However, the absence of active defenses in testing leaves the practical threat level unclear. Real enterprise networks deploy endpoint detection, security monitoring, and incident response, none of which were present in AISI's simulations. The results indicate that AI models can now exploit basic security weaknesses without human guidance, reinforcing the importance of fundamental security practices such as regular patching and strong access controls.

AISI and the UK's National Cyber Security Centre note these capabilities are dual-use: the same techniques that enable offensive operations could strengthen defensive cybersecurity systems.
