model release · Anthropic

Anthropic confirms leaked model represents major reasoning advance after security breach

TL;DR

A data breach at Anthropic exposed internal documents detailing an unreleased AI model the company describes as its most powerful to date. Anthropic confirmed it is already testing the model with select customers, claiming significant advances in reasoning, coding, and cybersecurity. The breach resulted from a misconfiguration in Anthropic's content management system that automatically made ~3,000 uploaded files publicly accessible.

2 min read


A security misconfiguration at Anthropic has exposed internal documents revealing details of an unreleased AI model that internal teams describe as the company's most capable to date. After Fortune reported the breach, Anthropic confirmed it is already testing the model with select customers, characterizing it as marking a "step change" in reasoning, coding, and cybersecurity capabilities.

How the Breach Occurred

The exposure stemmed from a misconfiguration in Anthropic's content management system: a default setting made every uploaded file public, leaving approximately 3,000 internal documents accessible on the open internet without any authentication or authorization controls.

Specific technical details about the unreleased model remain scarce: Anthropic has not officially announced the system, nor has it provided performance metrics, parameter counts, or an availability timeline.

Broader Context: OpenAI Also Preparing Major Release

News of Anthropic's unreleased model comes as OpenAI reportedly prepares its own major capability jump. OpenAI is developing a model internally codenamed "Spud," which has completed pretraining. CEO Sam Altman has told staff the model can "really accelerate the economy," though OpenAI has not officially disclosed the system's architecture, capabilities, or release date.

Strategic Timing and IPO Implications

Both companies may be timing their strongest model releases to align with planned IPO activity later in 2026. This suggests major announcements could arrive within the coming months as both organizations position themselves for public markets.

Neither Anthropic nor OpenAI has provided confirmed details about pricing, context window size, benchmark scores, or other technical specifications for their respective unreleased models.

What This Means

The breach underscores persistent security challenges in AI infrastructure, even at well-resourced organizations. More significantly, both Anthropic and OpenAI are signaling that substantial capability improvements are imminent—though neither company has yet demonstrated these advances publicly. The gap between internal testing and public release typically involves safety evaluation, red-teaming, and regulatory assessment, meaning months may pass before customers can access either system. The coincidence of parallel releases from both organizations suggests an intensifying arms race in AI capability development heading into 2026.

Related Articles

model release

OpenAI offers EU preview access to GPT-5.5-Cyber model while Anthropic withholds Mythos

OpenAI announced GPT-5.5-Cyber is rolling out in limited preview to vetted cybersecurity teams and is in discussions with the European Commission about preview access. Anthropic released its Mythos model a month ago but has yet to grant EU access for security review.

model release

OpenAI releases GPT-5.5-Cyber for vetted security teams with relaxed safeguards

OpenAI released GPT-5.5-Cyber in limited preview on Thursday, a variant of its GPT-5.5 model with relaxed safeguards for vetted cybersecurity teams. The model is trained to be more permissive on security-related tasks including vulnerability identification, patch validation, and malware analysis.

model release

OpenAI Opens GPT-5.5-Cyber to Vetted Defenders After Model Matches Anthropic's Mythos in Security Testing

OpenAI is providing a less-restricted version of GPT-5.5 to vetted cybersecurity defenders through its Trusted Access for Cyber program. The model, dubbed GPT-5.5-Cyber, completed a 32-step simulated corporate cyberattack in 2 out of 10 test runs according to the U.K. AI Security Institute, narrowly trailing Anthropic's Mythos which succeeded in 3 out of 10 attempts.

model release

Google releases Gemini 3.1 Flash Lite with 1M context at $0.25 per million input tokens

Google has released Gemini 3.1 Flash Lite, a high-efficiency multimodal model with a 1,048,576 token context window priced at $0.25 per million input tokens and $1.50 per million output tokens. The model supports text, image, video, audio, and PDF inputs with four thinking levels for cost-performance optimization.
