agentic-systems
2 articles tagged with agentic-systems
April 12, 2026
research · Anthropic
AI agent skills fail in real-world conditions, researchers find after testing 34,000 skills
A large-scale study of 34,198 real-world skills finds that AI agent performance drops sharply when moving from curated benchmarks to realistic conditions. Claude Opus 4.6's pass rate fell from 55.4% with hand-selected skills to 38.4% in truly realistic scenarios, a drop of 17 percentage points, while weaker models such as Kimi K2.5 performed below their no-skill baseline.
March 9, 2026
product update · GitHub
GitHub details security architecture for Agentic Workflows in Actions
GitHub has published technical details on the security architecture underlying its Agentic Workflows feature, which runs AI agents within GitHub Actions. The system implements process isolation, output constraints, and comprehensive audit logging to contain agent behavior.