changelogAnthropic

Anthropic reverts three system changes that degraded Claude Code performance in March and April

TL;DR

Anthropic confirmed three separate system changes in March and April degraded Claude Code, Claude Agent SDK, and Claude Cowork performance. The company reduced default reasoning effort from high to medium on March 4, introduced a caching bug on March 26 that cleared session data with every turn, and added restrictive word limits on April 16 that caused a 3% performance drop.

2 min read
0

Anthropic Reverts Three System Changes That Degraded Claude Code Performance

Anthropic confirmed that Claude Code users who reported declining quality in March and April were correct. The company identified three distinct system changes that degraded performance across Claude Code, Claude Agent SDK, and Claude Cowork. The Claude API was not affected.

Three Separate Issues

Reasoning effort reduction (March 4): Anthropic changed Claude Code's default reasoning effort level from high to medium to reduce latency from extended thinking periods. According to the company, "This was the wrong tradeoff." The change was reverted April 7. The current Claude Code build v2.1.118 now defaults to "xhigh" reasoning effort on Sonnet 4.6.

Caching bug (March 26): Engineers introduced a bug while attempting to clear old output tokens for idle users after one hour. Instead of clearing cached thinking sessions only after idle periods, the bug cleared cached session data with every prompt-response turn. This made Claude "forgetful and repetitive." The issue was fixed April 10 for Sonnet 4.6 and Opus 4.6.

Response length limits (April 16): Anthropic added new system prompt instructions to reduce verbosity: "Length limits: keep text between tool calls to ≤25 words. Keep final responses to ≤100 words unless the task requires more detail." Internal testing before deployment suggested the change was safe, but post-deployment ablation tests revealed a 3% performance drop for both Opus 4.6 and 4.7. The system prompt change was reverted April 20.

Company Response

Anthropic emphasized it did not intentionally degrade its models. The company is implementing additional measures including more internal testing for Claude Code builds, improvements to its Code Review tool, better evaluation of system prompt changes, and a new @ClaudeDevs X account for detailed product communications.

The company reset usage levels for all customers following the issues. Head of growth Amol Avasare separately committed to more direct communication after previously addressing an unannounced A/B test through social media.

What This Means

This incident exposes the fragility of complex AI systems where seemingly minor adjustments cascade into measurable quality degradation. The 3% performance drop from word count restrictions reveals how difficult it is to predict the impact of system prompt changes, even with internal testing. More concerning is that Anthropic's initial testing failed to catch the caching bug and the performance impact of length restrictions—suggesting their pre-deployment evaluation framework needs strengthening. The month-long window between introducing issues and fully reverting them indicates a gap in real-time monitoring systems.

Related Articles

benchmark

Claude Opus 4.8 fails legal reasoning test despite improved honesty scores

Anthropic's Claude Opus 4.8 demonstrated better uncertainty handling than its predecessor in independent testing across coding, medical, and financial scenarios. However, the model exhibited a significant judgment error in a legal reasoning test involving travel insurance claims, according to results published by ZDNET.

model release

Anthropic's Opus 4.8 matches Claude Mythos Preview in alignment, cuts thinking mode costs by 67%

Anthropic released Claude Opus 4.8 on May 28, 2026, replacing Opus 4.7 at unchanged pricing. The company claims the model's misalignment rates match those of Claude Mythos Preview, the experimental model deemed too dangerous for public release in April 2026. Opus 4.8 delivers faster thinking modes at one-third the cost of version 4.7.

model release

Anthropic releases Claude Opus 4.8 with improved agentic coding and reasoning benchmarks

Anthropic released Claude Opus 4.8 on May 28, 2026, with improved performance in agentic coding, computer use, and reasoning benchmarks. Pricing remains at $5 per million input tokens and $25 per million output tokens, while the model's fast mode is now three times cheaper than previous versions.

model release

Anthropic's Claude Opus 4.8 launches on AWS Bedrock in four regions

Anthropic's Claude Opus 4.8 is now available on Amazon Bedrock and Claude Platform on AWS. The model is designed for autonomous multi-stage tasks, agentic coding, and long-running workflows with reduced supervision.

Comments

Loading...