AI Security Intelligence
Published benchmark scores from peer-reviewed research: 28 results across 3 categories, plus 19 active bug bounty programs.
Model Security Leaderboard
SWE-bench Verified score, the industry standard for autonomous code repair. Models are given real GitHub issues with failing tests; the score is the percentage of issues resolved without human help.
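As a rough illustration of how a score like this is calculated, the sketch below turns a list of per-issue outcomes into a percentage. The function name `resolved_rate` and the sample data are hypothetical, and this is not the official SWE-bench evaluation harness.

```python
def resolved_rate(outcomes: list[bool]) -> float:
    """Return the percentage of issues resolved.

    Each entry is True when the issue's failing tests pass after the
    model's patch (an assumed encoding, for illustration only).
    """
    if not outcomes:
        raise ValueError("need at least one outcome")
    return 100.0 * sum(outcomes) / len(outcomes)


# Hypothetical run: 3 of 4 issues resolved.
sample = [True, False, True, True]
print(f"{resolved_rate(sample):.1f}%")  # 75.0%
```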
Notable AI Security Discoveries
AI agent compromised McKinsey's internal platform in 2 hours using SQL injection
An AI agent deployed by security firm Codewall gained full read and write access to McKinsey's internal AI platform Lilli within two hours without credentials or insider knowledge. The exploit used SQL injection, a decades-old vulnerability technique, to compromise a system serving over 43,000 employees for strategy work and client research.
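For readers unfamiliar with the technique, here is a minimal, generic sketch of SQL injection and the usual fix, using Python's built-in `sqlite3` module. It is not a reconstruction of the Lilli exploit, whose details are not given here; the `docs` table, the function names, and the payload are all made up for illustration.

```python
import sqlite3

# Hypothetical in-memory database standing in for any document store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (owner TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [("alice", "Q3 plan"), ("bob", "Client memo")],
)


def find_docs_unsafe(owner: str) -> list[str]:
    # Vulnerable: user input is concatenated into the SQL text.
    query = f"SELECT title FROM docs WHERE owner = '{owner}' ORDER BY title"
    return [row[0] for row in conn.execute(query)]


def find_docs_safe(owner: str) -> list[str]:
    # Parameterized: the input is bound as data, never parsed as SQL.
    query = "SELECT title FROM docs WHERE owner = ? ORDER BY title"
    return [row[0] for row in conn.execute(query, (owner,))]


payload = "x' OR '1'='1"
print(find_docs_unsafe(payload))  # ['Client memo', 'Q3 plan'] (every row leaks)
print(find_docs_safe(payload))    # []
```

The unsafe version lets the payload rewrite the `WHERE` clause so that it matches every row, while the parameterized version treats the same string as an ordinary, non-matching owner name.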
GitHub details security architecture for Agentic Workflows in Actions
GitHub
GitHub has published technical details on the security architecture underlying its Agentic Workflows feature, which runs AI agents within GitHub Actions. The system implements process isolation, output constraints, and comprehensive audit logging to contain agent behavior.
Claude discovers 100+ Firefox vulnerabilities in security audit
Anthropic
Anthropic's Claude AI has identified over 100 security vulnerabilities in Firefox, including previously undetected bugs that traditional testing methods missed over decades. The discovery demonstrates AI models' capacity for systematic security auditing at scale.
Active Bug Bounty Programs
| Program | Organization | Platform | AI Policy | Max Payout | Scope |
|---|---|---|---|---|---|
| Immunefi | Immunefi (platform) | Immunefi | AI Encouraged | $10M | DeFi protocols, smart contracts, Web3 bridges, DAO treasuries |
| Apple Security Bounty | Apple | Direct | Not Specified | $1M | iCloud, iOS, macOS, Safari, Apple silicon firmware |
| HackerOne Programs | HackerOne (platform) | HackerOne | Case by Case | $1M | 1,000+ programs across tech, finance, government, healthcare |
| Bugcrowd Programs | Bugcrowd (platform) | Bugcrowd | Case by Case | $500K | 1,000+ programs across tech, finance, automotive, healthcare |
| Meta Bug Bounty | Meta | HackerOne | AI Allowed | $300K | Facebook, Instagram, WhatsApp, Threads, Messenger, Meta Quest |
| Coinbase Bug Bounty | Coinbase | HackerOne | AI Allowed | $250K | Coinbase.com, Coinbase Pro, Coinbase Wallet, exchange APIs |
| Microsoft Bug Bounty | Microsoft | Direct | AI Allowed | $250K | Azure, Microsoft 365, Windows, Xbox, Edge, Bing |
| Google DeepMind AI Safety | Google DeepMind | Direct | AI Encouraged | $250K | Gemini models, Google AI APIs, Vertex AI, AI Studio |
| Vulnerability Reward Program | Google | Direct | AI Allowed | $250K | Google Search, Google Cloud, Android, Chrome, YouTube, Gmail |
| GitHub Security Bug Bounty | GitHub (Microsoft) | HackerOne | AI Allowed | $100K | GitHub.com, Actions, Packages, Codespaces, Copilot |
| OpenAI Bug Bounty | OpenAI | Bugcrowd | Case by Case | $100K | ChatGPT, API (GPT-4o, o3, o4), DALL-E, Sora, OpenAI.com |
| Shopify Bug Bounty | Shopify | HackerOne | AI Allowed | $50K | Shopify.com, Admin, Partner API, Storefront API, POS |
| Anthropic Bug Bounty | Anthropic | HackerOne | Case by Case | $50K | Claude.ai, Anthropic API, Claude models |
| xAI Bug Bounty | xAI | Bugcrowd | Case by Case | $50K | Grok models, grok.com, xAI API, X AI integrations |
| PayPal Bug Bounty | PayPal | HackerOne | AI Allowed | $30K | PayPal.com, Venmo, Braintree, PayPal Checkout APIs |
| Mistral AI Bug Bounty | Mistral AI | Direct | AI Encouraged | $25K | Mistral API, Le Chat, open-weight model deployments |
| Hack the Pentagon | US Department of Defense | HackerOne | Case by Case | $25K | DoD public-facing websites, military branches, DISA systems |
| Atlassian Bug Bounty | Atlassian | Bugcrowd | AI Allowed | $25K | Jira, Confluence, Bitbucket, Trello, Atlassian Cloud |
| Tesla Bug Bounty | Tesla | Bugcrowd | Not Specified | $15K | Tesla vehicles (OTA, infotainment), Tesla.com, mobile apps, energy products |
AI tools policy reflects publicly stated program rules where available. Always read individual program scope before submitting. “AI Encouraged” means the program explicitly welcomes AI-assisted research.