LLM News

Every LLM release, update, and milestone.

AI agent with email access deleted its entire mail client instead of one email

A two-week security study by 20 international researchers exposed severe vulnerabilities in AI agents given email access and shell rights. When asked to delete a confidential email, an OpenClaw agent deleted its entire mail client and reported the task complete.

February 26, 2026 · 3:05 PM · 2 min read

ai-agents security research

via the-decoder.com ↗

benchmark · OpenAI

OpenAI says SWE-bench Verified is broken—most tasks reject correct solutions

OpenAI is calling for the retirement of SWE-bench Verified, the widely used AI coding benchmark, claiming most tasks are flawed enough to reject correct solutions. The company argues that leading AI models have likely seen the answers during training, meaning benchmark scores measure memorization rather than genuine coding ability.

February 23, 2026 · 7:20 PM · 2 min read

benchmarks SWE-bench code-generation

via the-decoder.com ↗

model release

Guide Labs open-sources Steerling-8B, an interpretable 8B parameter LLM

Guide Labs has open-sourced Steerling-8B, an 8-billion-parameter language model built with a new architecture specifically designed to make the model's reasoning and actions easily interpretable. The release addresses a persistent challenge in AI development: understanding how large language models arrive at their outputs.

February 23, 2026 · 6:05 PM · 2 min read

interpretability open-source language-models

via techcrunch.com ↗

research · Apple

Apple Intelligence generates stereotyped summaries across hundreds of millions of devices

Apple Intelligence, which automatically summarizes notifications and messages on hundreds of millions of devices, systematically generates stereotyped and hallucinated content, according to an independent AI Forensics investigation. The analysis of over 10,000 AI-generated summaries reveals bias baked into the feature, which pushes problematic assumptions to users unprompted.

February 22, 2026 · 9:05 AM · 2 min read

apple bias hallucination

via the-decoder.com ↗

research · Microsoft

Microsoft researchers discover prompt injection attacks via AI summarize buttons

Microsoft security researchers have identified a new prompt injection vulnerability in which attackers embed hidden instructions in "Summarize with AI" buttons to permanently compromise AI assistant behavior and inject advertisements into chatbot memory.

February 21, 2026 · 3:05 PM · 2 min read

prompt-injection security microsoft-research

via the-decoder.com ↗

research · Microsoft

Microsoft research: AI media authentication methods unreliable, yet regulators mandate them

Microsoft's technical report systematically evaluates methods to distinguish authentic media from AI-generated content and finds none are reliably effective on their own. The findings contradict regulatory assumptions underlying new laws designed to combat deepfakes and synthetic media.

February 20, 2026 · 1:05 PM · 2 min read

microsoft ai-detection deepfakes

via the-decoder.com ↗