agentic-systems

2 articles tagged with agentic-systems

April 12, 2026

AI agent skills fail in real-world conditions, researchers find after testing 34,000 skills

A large-scale study of 34,198 real-world skills finds that AI agent performance drops sharply when moving from curated benchmarks to realistic conditions. Claude Opus 4.6's pass rate fell from 55.4% with hand-selected skills to 38.4% in truly realistic scenarios, while weaker models such as Kimi K2.5 performed below their no-skill baseline.

April 12, 2026 · 10:50 AM

March 9, 2026

product update · GitHub

GitHub details security architecture for Agentic Workflows in Actions

GitHub has published technical details on the security architecture underlying its Agentic Workflows feature, which runs AI agents within GitHub Actions. The system implements process isolation, output constraints, and comprehensive audit logging to contain agent behavior.

March 9, 2026 · 4:05 PM
