ai-security

14 articles tagged with ai-security

May 7, 2026
analysis

Mozilla finds 423 Firefox security bugs in one month using Claude Mythos preview

Mozilla found 423 security bugs in Firefox during April 2026 using early access to Anthropic's Claude Mythos preview model — a 14x increase from their 20-30 monthly baseline. The company credits both improved model capabilities and refined techniques for filtering AI-generated findings.

April 14, 2026
analysis · Anthropic

UK AI Safety Institute confirms Claude Mythos finds more exploits as token spend increases

The UK's AI Safety Institute published an independent evaluation confirming Anthropic's Claude Mythos is highly effective at finding security vulnerabilities. The evaluation revealed a linear relationship: more tokens spent equals more exploits discovered, transforming security into an economic arms race.

April 12, 2026
model release · Anthropic

Anthropic launches Mythos AI model claiming zero-day vulnerability discovery capabilities

Anthropic has launched Mythos, an AI model it claims can identify and exploit zero-day vulnerabilities with significant success. The model has not been released publicly, with Anthropic citing security concerns. The announcement raises questions about how much of this reflects actual capability versus pre-IPO positioning.

April 10, 2026
model release · Anthropic +1

White House officials questioned tech CEOs on AI security ahead of Anthropic's Mythos release

Vice President JD Vance and Treasury Secretary Scott Bessent held a call with leading tech CEOs including Anthropic's Dario Amodei, OpenAI's Sam Altman, and Google's Sundar Pichai to discuss AI model security and cyber attack response. The meeting occurred one week before Anthropic released its Mythos model, which has major cybersecurity implications and raised concerns at the Federal Reserve and among top U.S. banks.

April 8, 2026
research · Anthropic

Anthropic's Mythos AI generates working zero-day exploits 72.4% of the time, won't release publicly

Anthropic has developed Mythos, an AI model capable of generating working zero-day exploits with a 72.4% success rate, compared to Claude Opus 4.6's near-zero capability. The company declined public release due to security risks and instead created Project Glasswing, a limited-access program for 40+ organizations including AWS, Apple, Google, and Microsoft to find vulnerabilities in their own systems.

April 7, 2026
product update · Anthropic

Anthropic launches Project Glasswing to defend critical software against AI-powered attacks

Anthropic has announced Project Glasswing, a new initiative to secure critical software infrastructure against AI-powered attacks. The project brings together 11 major partners, including Amazon, Apple, Google, Microsoft, and NVIDIA, and will use Claude Mythos Preview, an unreleased general-purpose Anthropic model that the company says has found thousands of exploitable vulnerabilities across major operating systems and web browsers.

model release · Anthropic

Anthropic restricts Claude Mythos to security researchers under Project Glasswing

Anthropic has not publicly released Claude Mythos, instead restricting access to a vetted set of partners through Project Glasswing. The company claims the model's cybersecurity research abilities—including finding thousands of high-severity vulnerabilities in major operating systems and browsers—warrant controlled deployment until industry safeguards mature.

product update · Apple

Apple, Google, Microsoft join Anthropic's Project Glasswing to find critical software vulnerabilities

Twelve major technology companies—including Apple, Google, Microsoft, Amazon, and NVIDIA—have launched Project Glasswing, a coordinated effort to identify and patch critical software vulnerabilities using Anthropic's unreleased Mythos Preview model. The initiative has discovered thousands of zero-day vulnerabilities in mission-critical software, including a 27-year-old bug in OpenBSD and a 16-year-old vulnerability in widely used video software that automated testing tools had missed.

April 1, 2026

Google DeepMind identifies six attack categories that can hijack autonomous AI agents

A Google DeepMind paper introduces the first systematic framework for 'AI agent traps'—attacks that exploit the exposure autonomous agents gain from external tools and internet access. The researchers identify six attack categories targeting perception, reasoning, memory, actions, multi-agent networks, and human supervisors, with proof-of-concept demonstrations for each.

March 14, 2026
product update · OpenAI

OpenAI launches Codex Security research preview for AI-powered vulnerability detection

OpenAI has released Codex Security, an AI application security agent designed to detect and patch complex code vulnerabilities, as a research preview. The tool analyzes project context to reduce noise and increase confidence in vulnerability detection.

March 12, 2026
product update · OpenAI

OpenAI acquires Promptfoo, an AI security and testing platform

OpenAI is acquiring Promptfoo, an AI security platform that helps enterprises identify and remediate vulnerabilities in AI systems during development. Terms of the acquisition were not disclosed.

March 9, 2026
product update · OpenAI

OpenAI acquires Promptfoo, integrates security testing into Frontier platform

OpenAI is acquiring Promptfoo, an AI security platform, to integrate automated vulnerability testing directly into its Frontier enterprise offering. The acquisition adds jailbreak detection, prompt injection testing, and data leak identification capabilities to OpenAI's enterprise product.

March 7, 2026
research · Anthropic

Claude discovers 100+ Firefox vulnerabilities in security audit

Anthropic's Claude AI has identified over 100 security vulnerabilities in Firefox, including previously undetected bugs that traditional testing methods had missed for decades. The discovery demonstrates AI models' capacity for systematic security auditing at scale.

February 21, 2026
product update · Anthropic

Anthropic launches Claude Code Security tool; cybersecurity stocks fall

Anthropic has released Claude Code Security, an AI tool designed to identify code vulnerabilities that traditional security scanners overlook. The announcement prompted an immediate decline in cybersecurity stock valuations.