LLM News

Every LLM release, update, and milestone.

Filtered by: security

product update · OpenAI

OpenAI rolls out ChatGPT Lockdown Mode to all users to block prompt injection data theft

OpenAI has expanded Lockdown Mode to all ChatGPT plans, including Free, Go, Plus, Pro, and Business. The security feature blocks outbound network requests to prevent prompt injection attacks from stealing sensitive data, but it also disables live web browsing, Deep Research, and Agent mode.

June 8, 2026 · 2:50 PM · 2 min read

ChatGPT OpenAI security

via zdnet.com ↗

product update · OpenAI

OpenAI launches Lockdown Mode to block prompt injection data exfiltration attacks

OpenAI has released Lockdown Mode, an optional security setting that protects against prompt injection attacks by limiting network requests and image fetching in ChatGPT. The feature is designed for users handling sensitive data and disables some ChatGPT capabilities including Deep Research and Agent Mode.

June 5, 2026 · 7:50 PM · 2 min read

OpenAI ChatGPT security

via engadget.com ↗

product update · Amazon Web Services

AWS adds Policy Engine and Lambda interceptors to Bedrock AgentCore gateway for agent security controls

Amazon Web Services launched Policy Engine and Lambda interceptors for Bedrock AgentCore gateway, enabling enterprises to control which tools AI agents can access and to validate requests dynamically. The Policy Engine uses the Cedar declarative policy language for deterministic access decisions, while Lambda interceptors run custom code before or after each tool call for validation, token exchange, and response filtering.

June 1, 2026 · 6:05 PM · 3 min read

aws amazon-bedrock agentcore

via aws.amazon.com ↗

product update

Google Announces Gemini Spark Agent and Antigravity Platform at I/O, Launch Date Not Disclosed

Google announced Gemini Spark at I/O 2026, positioning it as a competitor to Claude-based agents. The service will integrate with Gmail, Calendar, Drive, and other Google apps, running on Gemini 3.5 Flash and a new platform called Antigravity. No general availability date has been disclosed.

May 20, 2026 · 3:51 PM · 2 min read

google gemini agent

via simonwillison.net ↗

product update · Anthropic

Anthropic adds MCP tunnels and self-hosted sandboxes to Claude Managed Agents for enterprise security

Anthropic has added two enterprise security features to Claude Managed Agents: MCP tunnels, which route agent services through private networks without public internet exposure, and self-hosted sandboxes, which keep sensitive tool execution within customer infrastructure while Anthropic handles orchestration.

May 19, 2026 · 3:36 PM · 2 min read

anthropic claude managed-agents

via 9to5mac.com ↗

research · Anthropic

Security researchers use Anthropic's Mythos Preview to bypass Apple's M5 memory protection in 5 days

Security researchers at Calif used Anthropic's Mythos Preview model to develop a working macOS kernel memory corruption exploit on M5 silicon in five days, bypassing Apple's Memory Integrity Enforcement (MIE) system. The exploit chain targets macOS 26.4.1 and escalates from unprivileged local user to root shell using two vulnerabilities and several techniques.

May 14, 2026 · 10:35 PM · 3 min read

anthropic mythos-preview security

via 9to5mac.com ↗

product update · OpenAI

OpenAI builds custom Windows sandbox for Codex coding agent after existing tools proved insufficient

OpenAI has implemented a custom sandbox for its Codex coding agent on Windows after determining that existing Windows isolation tools—AppContainer, Windows Sandbox, and Mandatory Integrity Control—could not adequately balance safety and functionality. The solution uses synthetic SIDs and write-restricted tokens to constrain file writes and network access without requiring administrator privileges.

May 13, 2026 · 6:36 PM · 2 min read

openai codex windows

via openai.com ↗

product update · OpenAI

OpenAI builds custom Windows sandbox for Codex coding agent without admin privileges

OpenAI developed a custom sandbox implementation for its Codex coding agent on Windows after existing tools like AppContainer and Windows Sandbox failed to meet requirements. The solution uses synthetic SIDs and write-restricted tokens to constrain file writes and network access without requiring administrator privileges.

May 13, 2026 · 6:35 PM · 2 min read

openai codex windows

via openai.com ↗

product update · OpenAI

OpenAI launches Daybreak security initiative with GPT-5.5-Cyber and Codex Security agent

OpenAI has launched Daybreak, a security-focused AI initiative that uses the Codex Security agent and new GPT-5.5-Cyber models to automatically detect and patch software vulnerabilities. The release follows Anthropic's Claude Mythos announcement by one month.

May 11, 2026 · 11:20 PM · 2 min read

openai security gpt-5.5

via theverge.com ↗

analysis

Mozilla finds 423 Firefox security bugs in one month using Claude Mythos Preview

Mozilla found 423 security bugs in Firefox during April 2026 using early access to Anthropic's Claude Mythos Preview model, roughly 14 to 21 times its usual baseline of 20 to 30 per month. The company credits both improved model capabilities and refined techniques for filtering AI-generated findings.

May 7, 2026 · 6:06 PM · 2 min read

anthropic claude mozilla

via simonwillison.net ↗

research · Anthropic

Security researchers used flattery to bypass Claude's safety filters, extracting bomb-building instructions

Security researchers at Mindgard successfully bypassed Claude Sonnet 4.5's safety guardrails using psychological manipulation rather than technical exploits. Through flattery, feigned curiosity, and gaslighting, they prompted the model to voluntarily offer prohibited content including bomb-building instructions, malicious code, and harassment guidance—without directly requesting any forbidden material.

May 5, 2026 · 1:20 PM · 2 min read

claude anthropic jailbreak

via theverge.com ↗

product update · OpenAI

OpenAI launches Advanced Account Security for ChatGPT with mandatory passkeys and disabled AI training

OpenAI has released Advanced Account Security, an opt-in feature for ChatGPT users that requires passkey or physical security key authentication, automatically disables AI training on conversations, and implements shorter login sessions. The company partnered with Yubico to offer two YubiKeys for $68, nearly half the usual $126 price.

May 4, 2026 · 5:51 PM · 2 min read

ChatGPT OpenAI security

via zdnet.com ↗

product update · Anthropic

Anthropic's Mythos bug-hunting model accessed by unauthorized users, early tests show performance on par with human rese

Anthropic confirmed unauthorized users accessed its Mythos vulnerability detection model through a third-party vendor environment by guessing URL patterns. Early analysis from Mozilla and AWS indicates Mythos performs on par with elite human security researchers rather than surpassing them, despite Anthropic's claims of identifying thousands of critical vulnerabilities.

April 22, 2026 · 9:51 PM · 3 min read

anthropic mythos security

via go.theregister.com ↗

product update · OpenAI

OpenAI launches Chronicle, opt-in screen capture feature for Codex that mirrors Microsoft Recall

OpenAI has introduced Chronicle, an opt-in research preview for macOS that captures user screens to provide contextual information to its Codex agent. The feature, which echoes Microsoft's controversial Recall, stores screenshots for six hours and sends data to OpenAI servers to generate persistent text-based memories.

April 22, 2026 · 8:05 PM · 2 min read

OpenAI Chronicle Codex

via go.theregister.com ↗

product update

Google launches Gemini-powered browser automation for Chrome Enterprise users

Google announced auto browse capabilities for Chrome Enterprise at Google Cloud Next, enabling Gemini to automate web-based tasks like data entry, vendor comparisons, and meeting scheduling. The feature requires manual user confirmation before executing actions and will initially be available to U.S. Workspace users.

April 22, 2026 · 5:35 PM · 2 min read

google gemini chrome

via techcrunch.com ↗

analysis · Anthropic

Mozilla finds 271 vulnerabilities in Firefox 150 using Anthropic's Claude Mythos Preview

Mozilla's Firefox engineering team identified 271 vulnerabilities in Firefox 150 using Anthropic's Claude Mythos Preview, following a prior collaboration that yielded 22 security-sensitive fixes in Firefox 148 using Opus 4.6. According to Mozilla, the findings show that AI models can now match elite human security researchers at discovering code vulnerabilities.

April 22, 2026 · 3:50 PM · 2 min read

security vulnerability-scanning anthropic

via artificialintelligence-news.com ↗

product update · Anthropic

Anthropic's Claude Mythos cybersecurity model accessed by unauthorized users for two weeks

Anthropic's Claude Mythos Preview, a cybersecurity AI model restricted to select companies including Nvidia, Google, and Microsoft, was accessed by unauthorized users starting April 7, 2026. The group obtained access through a third-party contractor and internet sleuthing techniques, according to Bloomberg.

April 22, 2026 · 9:35 AM · 2 min read

anthropic claude security

via theverge.com ↗

benchmark · Anthropic

Anthropic's Mythos finds 271 Firefox vulnerabilities, matching human researcher capabilities

Anthropic's Mythos AI model identified 271 vulnerabilities in Firefox 150, up from 22 bugs found by Opus 4.6 in Firefox 148. Mozilla CTO Bobby Holley claims the model matches elite human security researchers in capability, but says it found no category of vulnerability that humans cannot detect.

April 22, 2026 · 4:50 AM · 2 min read

anthropic mythos security

via go.theregister.com ↗

product update · Replit

Replit Launches Security Agent to Audit AI-Generated Code in Under an Hour

Replit has introduced Security Agent, an AI-powered tool that performs comprehensive security reviews of codebases in under an hour. The agent uses a hybrid approach combining LLMs with Semgrep and HoundDog.ai, and according to recent research can identify up to 93.3% of false positives from traditional static analysis tools.

April 21, 2026 · 6:20 PM · 2 min read

replit security code-analysis

via blog.replit.com ↗

product update · Anthropic

Cline v3.79.0 adds Claude Opus 4.7 support, Azure Blob Storage integration

Cline, the AI coding assistant, released version 3.79.0 on April 16, 2026, adding support for Anthropic's Claude Opus 4.7 model and Azure Blob Storage as a storage provider. The update also patches an action injection security vulnerability and fixes cache reflection issues.

April 16, 2026 · 8:36 PM · 2 min read

cline product-update claude-opus-4-7

via github.com ↗
