model releaseAnthropic

Anthropic restricts Claude Mythos to security researchers under Project Glasswing

TL;DR

Anthropic has not publicly released Claude Mythos, instead restricting access to a vetted set of partners through Project Glasswing. The company claims the model's cybersecurity research abilities—including finding thousands of high-severity vulnerabilities in major operating systems and browsers—warrant controlled deployment until industry safeguards mature.

2 min read
0

Anthropic Restricts Claude Mythos to Security Researchers Under Project Glasswing

Anthropic has withheld public release of Claude Mythos, instead limiting access through Project Glasswing, a controlled preview program for vetted security partners. The company cites the model's exceptional cybersecurity research capabilities as the reason for restricted deployment.

The Model and Its Capabilities

Claude Mythos is described as a general-purpose model comparable to Claude Opus 4.6. According to Anthropic's system card, the model has already discovered thousands of high-severity vulnerabilities, including flaws in every major operating system and web browser. Notably, the model can chain multiple vulnerabilities together—combining three, four, or sometimes five separate issues to create sophisticated multi-stage exploits that wouldn't be possible with individual vulnerabilities alone.

Real-World Vulnerability Discoveries

Nicholas Carlini, an Anthropic researcher, demonstrated the model's capabilities in Project Glasswing documentation. He reported finding "more bugs in the last couple of weeks than I found in the rest of my life combined." Specific discoveries include:

  • OpenBSD: A 27-year-old TCP SACK validation vulnerability that could crash the kernel. The flaw was patched on March 25, 2026 (CVE reference: OpenBSD 7.8 errata 025).
  • Linux: Multiple privilege escalation vulnerabilities allowing unprivileged users to gain administrator access.

These findings align with recent warnings from prominent security figures. Greg Kroah-Hartman of the Linux kernel noted that AI-generated security reports shifted from low-quality "AI slop" to credible, actionable vulnerabilities within the past month. Daniel Stenberg of curl reported spending hours daily handling AI-discovered security issues—"less slop but lots of reports. Many of them really good."

Project Glasswing Structure

The program provides $100M in usage credits plus $4M in direct donations to open-source security organizations. Partner organizations—including AWS, Apple, Microsoft, Google, and the Linux Foundation—receive early access to find and remediate vulnerabilities in foundational systems before broader vulnerability proliferation occurs.

Anthropic explicitly states: "We do not plan to make Claude Mythos Preview generally available," indicating this is a permanent access restriction, not a temporary embargo. The company plans to eventually enable safe large-scale deployment through safeguards designed to "detect and block the model's most dangerous outputs," with new protections expected in an upcoming Claude Opus model.

What This Means

This represents a significant shift in how frontier AI labs approach model release. Anthropic has essentially declared certain capabilities too risky for unrestricted distribution, betting that an industry preparation period—funded and coordinated through Project Glasswing—will reduce overall security risk more effectively than immediate public release would.

The question is whether this approach will hold. Other frontier labs like OpenAI (with GPT-5.4 already showing strong security research capabilities) have not adopted similar restrictions. A fragmented approach where some labs restrict access while others don't could create perverse incentives—security researchers and malicious actors alike may migrate to unrestricted alternatives, potentially accelerating rather than delaying vulnerability proliferation. Anthropic's success depends partly on industry adoption of their safeguards and on whether competitors follow suit.

Related Articles

changelog

Anthropic Python SDK v0.104.0 adds thinking token count estimates for streaming responses

Anthropic released version 0.104.0 of its Python SDK on May 21, 2026. The update adds support for a thinking-token-count beta feature that provides estimated token counts in thinking block deltas when streaming responses from reasoning models.

product update

Anthropic adds MCP tunnels and self-hosted sandboxes to Claude Managed Agents for enterprise security

Anthropic has added two enterprise security features to Claude Managed Agents: MCP tunnels, which route agent services through private networks without public internet exposure, and self-hosted sandboxes, which keep sensitive tool execution within customer infrastructure while Anthropic handles orchestration.

model release

NVIDIA releases Nemotron-Labs-Diffusion-14B with tri-mode decoding achieving 3.3x speed-up on GB200

NVIDIA released Nemotron-Labs-Diffusion-14B, a 14-billion parameter language model that supports three decoding modes by switching attention patterns during inference. The model achieves 850 tokens per second on GB200 hardware at concurrency 1, representing a 3.3x speed-up over standard autoregressive decoding and outperforming Qwen3-8B-Eagle3 by 2.2x in self-speculation mode.

model release

Google releases Gemini 3.5 Flash and autonomous agent Gemini Spark at I/O 2026

Google announced Gemini 3.5 Flash and Gemini Spark at I/O 2026. Gemini 3.5 Flash now powers Google's AI Mode search, while Spark is a cloud-based autonomous agent that can monitor credit card statements, track emails, and interact with third-party services like OpenTable and Instacart.

Comments

Loading...