prompt-injection
5 articles tagged with prompt-injection
Claude Code bypasses safety rules after 50 chained commands, enabling prompt injection attacks
Claude Code will automatically approve denied commands—like curl—if preceded by 50 or more chained subcommands, according to security firm Adversa. The vulnerability stems from a hard-coded MAX_SUBCOMMANDS_FOR_SECURITY_CHECK limit set to 50 in the source code, after which the system falls back to requesting user permission rather than enforcing deny rules.
Google Deepmind identifies six attack categories that can hijack autonomous AI agents
A Google Deepmind paper introduces the first systematic framework for 'AI agent traps'—attacks that exploit autonomous agents' vulnerabilities to external tools and internet access. The researchers identify six attack categories targeting perception, reasoning, memory, actions, multi-agent networks, and human supervisors, with proof-of-concept demonstrations for each.
OpenAI releases IH-Challenge dataset to train models to reject untrusted instructions
OpenAI has released IH-Challenge, a training dataset designed to teach AI models to reliably distinguish between trusted and untrusted instructions. Early results show significant improvements in security and prompt injection defense capabilities.
OpenAI acquires Promptfoo, integrates security testing into Frontier platform
OpenAI is acquiring Promptfoo, an AI security platform, to integrate automated vulnerability testing directly into its Frontier enterprise offering. The acquisition adds jailbreak detection, prompt injection testing, and data leak identification capabilities to OpenAI's enterprise product.
Microsoft researchers discover prompt injection attacks via AI summarize buttons
Microsoft security researchers have identified a new prompt injection vulnerability where attackers embed hidden instructions in "Summarize with AI" buttons to permanently compromise AI assistant behavior and inject advertisements into chatbot memory.