analysisAnthropic

AMD AI director reports Claude Code performance degradation since March update

TL;DR

Stella Laurenzo, director of AI at AMD, filed a GitHub issue documenting significant performance degradation in Claude Code since early March, specifically following the deployment of thinking content redaction in version 2.1.69. Analysis of 6,852 sessions with 234,760 tool calls shows stop-hook violations increased from zero to 10 per day, while code-reading behavior dropped from 6.6 reads to 2 reads per session.

3 min read
0

AMD's Stella Laurenzo, director of the company's AI group, has documented a significant performance decline in Claude Code since an early March update, claiming the model has become "dumber and lazier" in complex engineering tasks.

The Data

Laurenzo filed a GitHub issue on April 4, 2026, backed by analysis of 6,852 Claude Code sessions incorporating 234,760 tool calls and 17,871 thinking blocks. The degradation pattern emerged sharply after March 8, 2026, correlating with the deployment of Claude Code version 2.1.69, which introduced thinking content redaction—a feature that strips thinking content from API responses by default.

Key metrics from AMD's analysis:

  • Stop-hook violations (indicating laziness): Rose from zero prior to March 8 to 10 per day average by month's end
  • Code reads before edits: Dropped from 6.6 reads per session to 2 reads
  • File rewrites vs. targeted edits: Frequency of full-file rewrites increased significantly

"Every senior engineer on my team has reported similar experiences," Laurenzo stated in the issue.

The Thinking Redaction Connection

Laurenzo attributes the degradation to thinking content redaction introduced in v2.1.69. The feature defaults to hiding Claude's reasoning process from users, preventing visibility into what the model is actually doing while processing requests.

"When thinking is shallow, the model defaults to the cheapest action available: edit without reading, stop without finishing, dodge responsibility for failures, take the simplest fix rather than the correct one," Laurenzo wrote. "These are exactly the symptoms observed."

This represents a separate issue from Claude Code's February incident when version 2.1.20 caused the model to truncate explanations of its reading process, reducing specificity to brief file counts.

Anthropic's Credibility Crisis

The report arrives amid mounting pressure on Anthropic:

  • Unexplained surges in token usage have pushed users past billing limits
  • Claude Code's entire source code was exposed publicly
  • The exposed code revealed how extensively Claude Code can access user system information

Laurenzo's requests to Anthropic are direct: transparency about whether thinking tokens are being reduced or capped, exposure of token counts per request, and a premium tier offering guaranteed deep thinking for complex workflows.

"The current subscription model doesn't distinguish between users who need 200 thinking tokens per response and users who need 20,000," she explained. "Users running complex engineering workflows would pay significantly more for guaranteed deep thinking."

Market Implications

While Laurenzo indicated her team has switched to another provider offering "superior quality work," she's leaving the GitHub issue as a public notice to Anthropic. Her warning is stark: "It's still early in the AI coding game and Anthropic is looking at giving up the top spot if its behavior continues."

She declined to specify the alternative tool due to NDA restrictions, but confirmed it's delivering better results for AMD's complex engineering workflows.

What this means

Laurenzo's analysis—backed by specific behavioral metrics from a real high-complexity engineering environment—suggests Claude Code's thinking quality may have genuinely degraded, not due to model capability loss but due to architectural changes that may be suppressing reasoning depth. The March timing and correlation with v2.1.69 provide testable claims Anthropic should address transparently. For enterprise users relying on Claude Code for complex tasks, the core issue is transparency: without visibility into thinking token allocation and depth, there's no way to distinguish between legitimate efficiency improvements and cost-cutting that sacrifices quality. Anthropic faces pressure to either restore visibility, confirm no degradation occurred, or explain the trade-offs being made.

Related Articles

changelog

Anthropic charges Claude Code subscribers extra for OpenClaw usage starting today

Anthropic is enforcing separate billing for Claude Code subscribers using third-party tools like OpenClaw, starting April 4, 2026. Subscribers can no longer use their subscription limits for these integrations and must pay through a new pay-as-you-go model. The decision follows OpenClaw creator Peter Steinberger's move to OpenAI.

product update

Anthropic attributes Claude Code usage drain to peak-hour caps and large context windows

Anthropic has identified two primary causes for Claude Code users hitting usage limits faster than expected: stricter rate limiting during peak hours and sessions with context windows exceeding 1 million tokens. The company also recommends switching to Sonnet 4.6 instead of Opus, which consumes limits roughly twice as fast.

product update

Claude Code bypasses safety rules after 50 chained commands, enabling prompt injection attacks

Claude Code will automatically approve denied commands—like curl—if preceded by 50 or more chained subcommands, according to security firm Adversa. The vulnerability stems from a hard-coded MAX_SUBCOMMANDS_FOR_SECURITY_CHECK limit set to 50 in the source code, after which the system falls back to requesting user permission rather than enforcing deny rules.

product update

Claude Code source leak reveals Anthropic working on 'Proactive' mode and autonomous payments

Anthropic's Claude Code version 2.1.88 release accidentally included a source map exposing over 512,000 lines of code and 2,000 TypeScript files. Analysis of the leaked codebase by security researchers reveals evidence of a planned 'Proactive' mode that would execute coding tasks without explicit user prompts, plus potential crypto-based autonomous payment systems.

Comments

Loading...