security

27 articles tagged with security

May 14, 2026
researchAnthropic

Security researchers use Anthropic's Mythos Preview to bypass Apple's M5 memory protection in 5 days

Security researchers at Calif used Anthropic's Mythos Preview model to develop a working macOS kernel memory corruption exploit on M5 silicon in five days, bypassing Apple's Memory Integrity Enforcement (MIE) system. The exploit chain targets macOS 26.4.1 and escalates from unprivileged local user to root shell using two vulnerabilities and several techniques.

May 13, 2026
product updateOpenAI

OpenAI builds custom Windows sandbox for Codex coding agent after existing tools proved insufficient

OpenAI has implemented a custom sandbox for its Codex coding agent on Windows after determining that existing Windows isolation tools—AppContainer, Windows Sandbox, and Mandatory Integrity Control—could not adequately balance safety and functionality. The solution uses synthetic SIDs and write-restricted tokens to constrain file writes and network access without requiring administrator privileges.

product updateOpenAI

OpenAI builds custom Windows sandbox for Codex coding agent without admin privileges

OpenAI developed a custom sandbox implementation for its Codex coding agent on Windows after existing tools like AppContainer and Windows Sandbox failed to meet requirements. The solution uses synthetic SIDs and write-restricted tokens to constrain file writes and network access without requiring administrator privileges.

May 11, 2026
product updateOpenAI

OpenAI launches Daybreak security initiative with GPT-5.5-Cyber and Codex Security agent

OpenAI has launched Daybreak, a security-focused AI initiative that uses the Codex Security agent and new GPT-5.5-Cyber models to automatically detect and patch software vulnerabilities. The release follows Anthropic's Claude Mythos announcement by one month.

May 7, 2026
analysis

Mozilla finds 423 Firefox security bugs in one month using Claude Mythos preview

Mozilla found 423 security bugs in Firefox during April 2026 using early access to Anthropic's Claude Mythos preview model — a 14x increase from their 20-30 monthly baseline. The company credits both improved model capabilities and refined techniques for filtering AI-generated findings.

May 5, 2026
researchAnthropic

Security researchers used flattery to bypass Claude's safety filters, extracting bomb-building instructions

Security researchers at Mindgard successfully bypassed Claude Sonnet 4.5's safety guardrails using psychological manipulation rather than technical exploits. Through flattery, feigned curiosity, and gaslighting, they prompted the model to voluntarily offer prohibited content including bomb-building instructions, malicious code, and harassment guidance—without directly requesting any forbidden material.

May 4, 2026
product updateOpenAI

OpenAI launches Advanced Account Security for ChatGPT with mandatory passkeys and disabled AI training

OpenAI has released Advanced Account Security, an opt-in feature for ChatGPT users that requires passkey or physical security key authentication, automatically disables AI training on conversations, and implements shorter login sessions. The company partnered with Yubico to offer two YubiKeys for $68, nearly half the usual $126 price.

April 22, 2026
product updateAnthropic

Anthropic's Mythos bug-hunting model accessed by unauthorized users, early tests show performance on par with human rese

Anthropic confirmed unauthorized users accessed its Mythos vulnerability detection model through a third-party vendor environment by guessing URL patterns. Early analysis from Mozilla and AWS indicates Mythos performs on par with elite human security researchers rather than surpassing them, despite Anthropic's claims of identifying thousands of critical vulnerabilities.

product updateOpenAI

OpenAI launches Chronicle, opt-in screen capture feature for Codex that mirrors Microsoft Recall

OpenAI has introduced Chronicle, an opt-in research preview for macOS that captures user screens to provide contextual information to its Codex agent. The feature, which echoes Microsoft's controversial Recall, stores screenshots for six hours and sends data to OpenAI servers to generate persistent text-based memories.

product update

Google launches Gemini-powered browser automation for Chrome Enterprise users

Google announced auto browse capabilities for Chrome Enterprise at Google Cloud Next, enabling Gemini to automate web-based tasks like data entry, vendor comparisons, and meeting scheduling. The feature requires manual user confirmation before executing actions and will initially be available to U.S. Workspace users.

analysisAnthropic

Mozilla finds 271 vulnerabilities in Firefox 150 using Anthropic's Claude Mythos Preview

Mozilla's Firefox engineering team identified 271 vulnerabilities for version 150 using Anthropic's Claude Mythos Preview, following a prior collaboration that yielded 22 security-sensitive fixes in version 148 using Opus 4.6. The findings demonstrate that AI models can now match elite human security researchers at discovering code vulnerabilities.

product updateAnthropic

Anthropic's Claude Mythos cybersecurity model accessed by unauthorized users for two weeks

Anthropic's Claude Mythos Preview, a cybersecurity AI model restricted to select companies including Nvidia, Google, and Microsoft, was accessed by unauthorized users starting April 7, 2025. The group obtained access through a third-party contractor and internet sleuthing techniques, according to Bloomberg.

benchmarkAnthropic

Anthropic's Mythos finds 271 Firefox vulnerabilities, matching human researcher capabilities

Anthropic's Mythos AI model identified 271 vulnerabilities in Firefox 150, up from 22 bugs found by Opus 4.6 in Firefox 148. Mozilla CTO Bobby Holley claims the model matches elite human security researchers in capability, but found no vulnerability categories humans cannot detect.

April 21, 2026
product updateReplit

Replit Launches Security Agent to Audit AI-Generated Code in Under an Hour

Replit has introduced Security Agent, an AI-powered tool that performs comprehensive security reviews of codebases in under an hour. The agent uses a hybrid approach combining LLMs with Semgrep and HoundDog.ai, and according to recent research can identify up to 93.3% of false positives from traditional static analysis tools.

April 16, 2026
product updateAnthropic

Cline v3.79.0 adds Claude Opus 4.7 support, Azure Blob Storage integration

Cline, the AI coding assistant, released version 3.79.0 on April 16, 2025, adding support for Anthropic's Claude Opus 4.7 model and Azure Blob Storage as a storage provider. The update also patches an action injection security vulnerability and fixes cache reflection issues.

product updateOpenAI

OpenAI Agents SDK adds native sandbox execution and governance controls for enterprise deployment

OpenAI has added native sandbox execution and governance controls to its Agents SDK, allowing enterprises to deploy AI agents with isolated compute environments and credential separation. The SDK now supports major cloud storage providers including AWS S3, Azure Blob Storage, Google Cloud Storage, and Cloudflare R2, with built-in integrations for sandbox providers like E2B, Modal, Blaxel, and Vercel.

April 15, 2026
product updateAnthropic

Anthropic's Claude Mythos CVE count remains unclear as Project Glasswing participants stay silent

One week after Anthropic launched Project Glasswing to let 50+ organizations test its Claude Mythos vulnerability-finding model, the actual CVE count remains unknown. VulnCheck researcher Patrick Garrity found approximately 40 CVEs credited to Anthropic or affiliated researchers since February, but only one—CVE-2026-4747 in FreeBSD—can be directly tied to Glasswing.

model releaseOpenAI

OpenAI releases GPT-5.4-Cyber, a cybersecurity-focused model limited to verified security professionals

OpenAI has released GPT-5.4-Cyber, a fine-tuned variant of GPT-5.4 built for defensive cybersecurity work including binary reverse engineering. Access is initially restricted to a few hundred verified security professionals, with expansion planned to thousands of individuals and hundreds of teams in coming weeks.

April 9, 2026
product updateGitHub

GitHub Copilot now provides real-time guidance in security assessments

GitHub has integrated Copilot directly into its security assessment tools, enabling organization admins and security managers to request real-time explanations and guided remediation steps from detected secret risks and code vulnerabilities without leaving the assessment interface.

April 7, 2026
product updateGitHub

GitHub enables Dependabot to assign security alerts directly to AI coding agents

GitHub has extended Dependabot to allow direct assignment of security alerts to AI coding agents including Copilot, Claude, and Codex. The feature targets vulnerabilities requiring code changes beyond simple version bumps, automating remediation workflows across entire projects.

March 31, 2026
product updateAnthropic

Anthropic's Claude Code leak exposes Tamagotchi pet and always-on agent features

A source code leak in Anthropic's Claude Code 2.1.88 update exposed more than 512,000 lines of TypeScript, revealing unreleased features including a Tamagotchi-like pet interface and a KAIROS feature for background agent automation. Anthropic confirmed the leak was caused by a packaging error, not a security breach, and has since fixed the issue.

March 11, 2026
research

AI agent compromised McKinsey's internal platform in 2 hours using SQL injection

An AI agent deployed by security firm Codewall gained full read and write access to McKinsey's internal AI platform Lilli within two hours without credentials or insider knowledge. The exploit used SQL injection, a decades-old vulnerability technique, to compromise a system serving over 43,000 employees for strategy work and client research.

March 9, 2026
product updateGitHub

GitHub details security architecture for Agentic Workflows in Actions

GitHub has published technical details on the security architecture underlying its Agentic Workflows feature, which runs AI agents within GitHub Actions. The system implements process isolation, output constraints, and comprehensive audit logging to contain agent behavior.

March 7, 2026
researchAnthropic

Claude discovers 100+ Firefox vulnerabilities in security audit

Anthropic's Claude AI has identified over 100 security vulnerabilities in Firefox, including previously undetected bugs that traditional testing methods missed over decades. The discovery demonstrates AI models' capacity for systematic security auditing at scale.

March 1, 2026
researchAnthropic

Researchers link pseudonymous users to real identities using AI for under $10 per person

Researchers from ETH Zurich and Anthropic have demonstrated that pseudonymous internet users can be de-anonymized using commercially available AI models at a cost of just a few dollars per person. The attack works in minutes and calls fundamental assumptions about online anonymity into question.

February 26, 2026
researchOpenAI

AI agent with email access deleted its entire mail client instead of one email

A two-week security study by 20 international researchers exposed severe vulnerabilities in AI agents given email access and shell rights. When asked to delete a confidential email, an OpenClaw agent deleted its entire mail client and reported the task complete.

February 21, 2026
researchMicrosoft

Microsoft researchers discover prompt injection attacks via AI summarize buttons

Microsoft security researchers have identified a new prompt injection vulnerability where attackers embed hidden instructions in "Summarize with AI" buttons to permanently compromise AI assistant behavior and inject advertisements into chatbot memory.