research

AI agent compromised McKinsey's internal platform in 2 hours using SQL injection

TL;DR

An AI agent deployed by security firm Codewall gained full read and write access to McKinsey's internal AI platform Lilli within two hours, without credentials or insider knowledge. The exploit used SQL injection, a decades-old attack technique, to compromise a system that over 43,000 employees use for strategy work and client research.

March 11, 2026 · 3:35 PM · 2 min read

AI Agent Compromises McKinsey's Internal AI Platform in 2 Hours

Security firm Codewall demonstrated a critical vulnerability in McKinsey's internal AI platform Lilli by deploying an offensive AI agent that gained full database access in just two hours, without any credentials, insider information, or human intervention.

The Attack

Codewall's AI agent used SQL injection, an attack technique that dates back decades, to get past Lilli's defenses. Although Lilli is a modern AI system, its database layer was open to one of the oldest attack vectors in existence.

The agent achieved complete read and write access to the production database, meaning it could view, modify, or delete any data stored on the platform.

Scale of Exposure

Lilli serves as McKinsey's central AI tool for over 43,000 employees globally. The platform handles:

Strategic business work
Client research and analysis
Sensitive document processing

A compromise of the production database could expose confidential client information, internal strategy documents, and employee data across McKinsey's entire workforce.

Security Implications

The incident highlights a critical gap in AI infrastructure security: enterprise AI platforms designed for handling sensitive information are being built without fundamental database security protections. SQL injection remains effective because developers often:

Trust input validation mechanisms that AI systems can easily bypass
Fail to use parameterized queries consistently
Assume AI agents won't systematically probe for vulnerabilities
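The gap between string-built SQL and parameterized queries can be illustrated with a minimal sketch. The source does not describe Lilli's schema or the actual injection point, so everything here is an assumption for illustration: the `documents` table, the function names, and the payload are hypothetical, and Python's standard `sqlite3` module stands in for whatever database the platform uses.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with two hypothetical documents."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, owner TEXT, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO documents (owner, title) VALUES (?, ?)",
        [("alice", "Q3 strategy draft"), ("bob", "Client research notes")],
    )
    conn.commit()
    return conn


def find_titles_unsafe(conn: sqlite3.Connection, owner: str) -> list[str]:
    # VULNERABLE: the input is spliced into the SQL text, so a quote
    # character in `owner` can change the structure of the query.
    query = f"SELECT title FROM documents WHERE owner = '{owner}' ORDER BY id"
    return [row[0] for row in conn.execute(query)]


def find_titles_safe(conn: sqlite3.Connection, owner: str) -> list[str]:
    # Parameterized: the driver passes `owner` as data, never as SQL.
    query = "SELECT title FROM documents WHERE owner = ? ORDER BY id"
    return [row[0] for row in conn.execute(query, (owner,))]


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "nobody' OR '1'='1"  # classic injection string
    print("unsafe:", find_titles_unsafe(conn, payload))  # every title leaks
    print("safe:  ", find_titles_safe(conn, payload))    # no rows match
```

With the injection payload, the string-built query returns every row because the `OR '1'='1'` clause becomes part of the SQL, while the parameterized version treats the whole payload as a literal owner name and matches nothing.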

The fact that an automated agent, not a human penetration tester, found this vulnerability within hours suggests that similar weaknesses may exist in other enterprise AI platforms that haven't been formally tested.

What This Means

The incident shows that AI-specific security challenges are emerging alongside growth in AI capability. Organizations deploying internal AI platforms must treat them as attackable systems rather than trust-by-default tools. The McKinsey incident isn't about AI being "too dangerous"; it's about applying 25-year-old security practices to new infrastructure. SQL injection defenses are well-established. They weren't used here. That's the real story.

As enterprises increasingly deploy AI agents with tool access and autonomous capabilities, security must evolve from assuming trusted environments to zero-trust architectures where every query is validated, every database connection uses parameterized statements, and AI-initiated actions face the same scrutiny as human access requests.
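One way to picture "the same scrutiny" for AI-initiated actions is a database connection that enforces least privilege no matter what SQL the agent produces. The sketch below is an assumption-laden illustration, not a description of any real deployment: it uses the authorizer hook in Python's standard `sqlite3` module as a stand-in for database roles and grants, and the `notes` table and function names are hypothetical.

```python
import sqlite3

# Only plain reads are permitted; every other action (INSERT, UPDATE,
# DELETE, DROP, transactions, function calls, ...) is denied.
READ_ONLY_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Called by SQLite while it compiles each statement."""
    if action in READ_ONLY_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


def make_agent_connection() -> sqlite3.Connection:
    """Seed a hypothetical table, then lock the connection to reads."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO notes (body) VALUES (?)", ("client summary",))
    conn.commit()
    # From here on, the authorizer is consulted for every statement.
    conn.set_authorizer(read_only_authorizer)
    return conn


def run_agent_query(conn: sqlite3.Connection, sql: str, params: tuple = ()):
    """Run agent-supplied SQL with bound parameters; denied SQL raises."""
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = make_agent_connection()
    print(run_agent_query(conn, "SELECT body FROM notes WHERE id = ?", (1,)))
    try:
        run_agent_query(conn, "DELETE FROM notes")
    except sqlite3.Error as exc:
        print("write rejected:", exc)
```

The design choice is that the restriction lives in the connection rather than in the agent's prompt or in input filtering, so a successful injection still cannot modify or delete data. In a production system the equivalent control would normally be a read-only database role.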

Source: the-decoder.com

security ai-agents vulnerability sql-injection enterprise-ai mckinsey database-security penetration-testing
