cost-analysis

2 articles tagged with cost-analysis

May 18, 2026

benchmark

IBM Research launches Open Agent Leaderboard, showing same models achieve different results based on agent architecture

IBM Research has launched the Open Agent Leaderboard, the first open benchmark that evaluates complete AI agent systems rather than just underlying models. The leaderboard shows that agents built on identical models can differ significantly in success rate and cost depending on system architecture, and that failed runs cost 20-54% more than successful ones.

May 18, 2026 · 2:20 PM

May 15, 2026

benchmark

Augment Code's agent matches Claude Code quality at 33% lower cost on Opus 4.7

Augment Code benchmarked its Auggie agent against Claude Code, both running on Claude Opus 4.7, and reported a 67.4% pass rate for Auggie versus 66.3% for Claude Code at 33% lower cost. The company attributes the savings to a semantic context engine that reduces cache read tokens by 32% and output tokens by 37% compared to Claude Code's keyword-based retrieval.

May 15, 2026 · 8:05 PM
