LLM News

Every LLM release, update, and milestone.

Filtered by: security
product update · Anthropic

Anthropic's Mythos bug-hunting model accessed by unauthorized users, early tests show performance on par with human rese

Anthropic confirmed unauthorized users accessed its Mythos vulnerability detection model through a third-party vendor environment by guessing URL patterns. Early analysis from Mozilla and AWS indicates Mythos performs on par with elite human security researchers rather than surpassing them, despite Anthropic's claims of identifying thousands of critical vulnerabilities.

3 min read · via go.theregister.com

product update · OpenAI

OpenAI launches Chronicle, opt-in screen capture feature for Codex that mirrors Microsoft Recall

OpenAI has introduced Chronicle, an opt-in research preview for macOS that captures user screens to provide contextual information to its Codex agent. The feature, which echoes Microsoft's controversial Recall, stores screenshots for six hours and sends data to OpenAI servers to generate persistent text-based memories.

2 min read · via go.theregister.com

analysis · Anthropic

Mozilla finds 271 vulnerabilities in Firefox 150 using Anthropic's Claude Mythos Preview

Mozilla's Firefox engineering team identified 271 vulnerabilities for version 150 using Anthropic's Claude Mythos Preview, following a prior collaboration that yielded 22 security-sensitive fixes in version 148 using Opus 4.6. The findings demonstrate that AI models can now match elite human security researchers at discovering code vulnerabilities.

product update · Anthropic

Anthropic's Claude Mythos cybersecurity model accessed by unauthorized users for two weeks

Anthropic's Claude Mythos Preview, a cybersecurity AI model restricted to select companies including Nvidia, Google, and Microsoft, was accessed by unauthorized users starting April 7, 2025. The group obtained access through a third-party contractor and internet sleuthing techniques, according to Bloomberg.

2 min read · via theverge.com

product update · Replit

Replit launches Security Agent to audit AI-generated code in under an hour

Replit has introduced Security Agent, an AI-powered tool that performs comprehensive security reviews of codebases in under an hour. The agent uses a hybrid approach combining LLMs with Semgrep and HoundDog.ai, and according to recent research can identify up to 93.3% of false positives from traditional static analysis tools.

2 min read · via blog.replit.com

product update · OpenAI

OpenAI Agents SDK adds native sandbox execution and governance controls for enterprise deployment

OpenAI has added native sandbox execution and governance controls to its Agents SDK, allowing enterprises to deploy AI agents with isolated compute environments and credential separation. The SDK now supports major cloud storage providers including AWS S3, Azure Blob Storage, Google Cloud Storage, and Cloudflare R2, with built-in integrations for sandbox providers like E2B, Modal, Blaxel, and Vercel.

product update · Anthropic

Anthropic's Claude Mythos CVE count remains unclear as Project Glasswing participants stay silent

One week after Anthropic launched Project Glasswing to let 50+ organizations test its Claude Mythos vulnerability-finding model, the actual CVE count remains unknown. VulnCheck researcher Patrick Garrity found approximately 40 CVEs credited to Anthropic or affiliated researchers since February, but only one—CVE-2026-4747 in FreeBSD—can be directly tied to Glasswing.

model release · OpenAI

OpenAI releases GPT-5.4-Cyber, a cybersecurity-focused model limited to verified security professionals

OpenAI has released GPT-5.4-Cyber, a fine-tuned variant of GPT-5.4 built for defensive cybersecurity work including binary reverse engineering. Access is initially restricted to a few hundred verified security professionals, with expansion planned to thousands of individuals and hundreds of teams in coming weeks.

product update · Anthropic

Anthropic's Claude Code leak exposes Tamagotchi pet and always-on agent features

A source code leak in Anthropic's Claude Code 2.1.88 update exposed more than 512,000 lines of TypeScript, revealing unreleased features including a Tamagotchi-like pet interface and a KAIROS feature for background agent automation. Anthropic confirmed the leak was caused by a packaging error, not a security breach, and has since fixed the issue.

2 min read · via theverge.com

research

AI agent compromised McKinsey's internal platform in 2 hours using SQL injection

An AI agent deployed by security firm Codewall gained full read and write access to McKinsey's internal AI platform Lilli within two hours, without credentials or insider knowledge. The exploit used SQL injection, a decades-old attack technique, to compromise a system that serves more than 43,000 employees for strategy work and client research.