
Anthropic fixes Claude Code's performance slump after months of glitches

Months of frustration for Claude Code users end as Anthropic rolls out fixes—and new transparency efforts. Will stricter testing prevent future AI stumbles?

[Image: a drawing of a machine with many pipes and numbers; text at the top and bottom reads "Calculation of a Compute".]

In Brief

  • Following user complaints about declining quality in Claude Code, Anthropic has fixed three separate issues affecting reasoning depth adjustments, caching, and text length limits.
  • To prevent future incidents, the company is tightening internal testing before new updates. As compensation for the disruptions, all subscriber usage limits have been reset.
  • The problems highlight a broader industry-wide shortage of computing power, leading to more frequent outages and forcing AI providers to raise prices for resource-intensive tools.

Details


Users reported a noticeable decline in the performance of Claude Code, Anthropic's coding tool. The company has now identified and resolved three distinct sources of error, pledging stricter quality controls moving forward.

Over the past month, users increasingly complained that Claude Code was delivering significantly worse results. In a detailed post-mortem, Anthropic revealed that three independent changes—affecting Claude Code, the Claude Agent SDK, and Claude Cowork—combined to create a widely perceived drop in quality. The core API itself remained unaffected, according to the company. All three issues were fixed as of April 20 with version 2.1.116.

"We take reports of performance degradation very seriously. We never intentionally degrade our models," the company stated.

Reduced Reasoning Effort, Cache Glitches, and Prompt Constraints

The first issue dates back to March 4, when Anthropic lowered the default reasoning effort from "high" to "medium" after some users experienced excessive latency in high mode. Internal tests had suggested that the medium setting produced only slightly inferior results with significantly lower latency. The change backfired: users quickly reported that Claude Code felt less capable. On April 7, Anthropic reversed the adjustment.

The second problem stemmed from a bug in a caching optimization introduced on March 26. The update was meant to clear older reasoning segments after one hour of inactivity to reduce latency when resuming work. However, a flaw in the implementation caused the reasoning history to be wiped with every subsequent interaction. Claude progressively lost context for its own decisions, leading to forgetfulness, repetitions, and erratic tool selection. The resulting cache misses also depleted usage limits faster than expected. Despite code reviews, unit tests, and internal dogfooding, the bug went undetected until April 10.
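Anthropic has not published the faulty code, but the failure mode it describes, a staleness check that fires on every interaction instead of after an hour of inactivity, can be sketched as follows. All names here are hypothetical illustrations, not Anthropic's implementation:

```python
HOUR = 3600  # intended inactivity window, in seconds

class ReasoningCache:
    """Hypothetical cache of a session's reasoning segments."""

    def __init__(self, now, buggy=False):
        self.segments = []    # accumulated reasoning history
        self.created_at = now
        self.last_used = now
        self.buggy = buggy

    def add(self, segment, now):
        if self.buggy:
            # Bug: age is measured from cache creation, so once a session
            # is an hour old, the history is wiped on every interaction.
            stale = now - self.created_at > HOUR
        else:
            # Intended behavior: clear only after an hour of inactivity.
            stale = now - self.last_used > HOUR
        if stale:
            self.segments.clear()
        self.segments.append(segment)
        self.last_used = now
```

With short gaps between interactions in a session older than an hour, the correct cache keeps the full history while the buggy one retains only the latest segment on each call, matching the "forgetfulness" users reported.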

A third issue emerged on April 16, when Anthropic added a system prompt instruction intended to curb Opus 4.7's verbosity. The directive read: "Length limits: keep text between tool calls to ≤25 words. Keep final responses to ≤100 words unless the task requires more detail." Later evaluations using a broader test suite revealed a 3% drop in output quality. The change was rolled back on April 20.


Anthropic Tightens Quality Assurance

Because each change affected different user groups at different times, the cumulative effect resembled a vague, gradual decline—initially difficult to distinguish from normal performance fluctuations.

In response, Anthropic will require more employees to use the exact public build of Claude Code rather than internal test versions. Every system prompt modification must now pass a comprehensive, model-specific evaluation suite.

For changes that could impact performance, the company plans to introduce soak periods and staged rollouts. As compensation, all subscribers have had their usage limits reset.

Anthropic has also launched the @ClaudeDevs X account to improve transparency around product decisions.

Not the First Time: Perceived Degradation a Recurring Issue

This isn't the first instance of users complaining about declining AI performance. As early as the second half of 2023, OpenAI faced accusations that GPT-4 had grown "dumber" over time—a claim the company denied, asserting it had not made significant post-release modifications to the model.

Claude has faced similar complaints in the past, with infrastructure bugs previously to blame. The latest incident underscores a recurring pattern: what users perceive as model regression often stems not from the AI itself but from changes in the tooling layer or underlying infrastructure. In practice, users benefit from frameworks like Claude Code, which help steer model capabilities and provide the right context. But when these frameworks malfunction, the opposite effect occurs—compounded by adjustments from the developers, such as Anthropic's recent tweaks to reasoning depth.

The root cause of such interventions is increasingly tied to industry-wide resource constraints. According to a Wall Street Journal report, Anthropic's API availability recently dipped to just 98.95%, well below the cloud industry's 99.99% standard. Spot-market GPU prices surged 48% per the Ornn Compute Price Index, and Bank of America analysts project demand will outstrip supply through at least 2029. In response, OpenAI has shelved its video-generation tool Sora to free up compute capacity for coding and enterprise products, while GitHub temporarily halted new sign-ups for several Copilot tiers.
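To put those availability figures in perspective, the gap between the 99.99% standard and the reported 98.95% can be converted into expected downtime per 30-day month. This is a back-of-the-envelope calculation, not a figure from the report:

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in a 30-day month

def downtime_minutes(availability_pct):
    """Expected downtime per 30-day month at a given availability."""
    return MINUTES_PER_MONTH * (100 - availability_pct) / 100

print(round(downtime_minutes(99.99), 1))  # four-nines standard: 4.3
print(round(downtime_minutes(98.95), 1))  # reported dip: 453.6
```

In other words, roughly four minutes of monthly downtime at four nines versus about seven and a half hours at 98.95%.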

Under this pressure, pricing models are also becoming unstable. Anthropic's head of growth recently admitted that its existing Pro and Max plans "weren't designed for this level of usage," having been introduced before compute-heavy tools like Claude Code. The company briefly tested removing Claude Code for new Pro subscribers but backtracked after backlash.

Meanwhile, OpenAI doubled API prices with GPT-5.5, charging $5 per million input tokens and $30 per million output tokens—up from its predecessor's rates. The era of affordable flat-rate access to cutting-edge agentic AI tools appears to be drawing to a close.
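At those rates, per-request costs add up quickly for agentic workloads, which tend to send large contexts on every turn. A rough cost calculation for a single call at the quoted prices (the token counts are illustrative):

```python
INPUT_PRICE = 5 / 1_000_000    # $5 per million input tokens
OUTPUT_PRICE = 30 / 1_000_000  # $30 per million output tokens

def request_cost(input_tokens, output_tokens):
    """Dollar cost of one API call at GPT-5.5's quoted rates."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# e.g. a context-heavy coding-agent turn: 50k tokens in, 2k out
print(f"${request_cost(50_000, 2_000):.2f}")  # $0.31
```

A coding agent making dozens of such calls per task lands in the multi-dollar range per task, which is why flat-rate subscription plans come under pressure.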
