Reverse engineering Claude Code reveals two caching bugs that can silently increase API costs by 10-20 times

BlockBeatNews

According to monitoring by 1M AI News, a developer used Ghidra, an MITM proxy, and radare2 to reverse-engineer a 228MB binary file of the standalone installed version of Claude Code. They found two independent caching bugs that can raise API costs by 10–20x without users knowing. The related analysis has been submitted to GitHub (issue #40524), and Anthropic has marked it as a regression bug and assigned it for handling.

The first bug is in the customized Bun runtime used by the standalone installed version. Each time an API request is made, the runtime searches the request body for a billing identifier and replaces it, but the replacement logic matches the first occurrence in the request body. If the conversation history happens to contain that string (for example, if the internal billing mechanism of Claude Code was discussed), the replacement will hit the message content rather than the system prompt, causing each request to trigger a full cache rebuild. A temporary workaround is to switch to running npx @anthropic-ai/claude-code, since the npm package version does not include this replacement logic.

The second bug affects all users who resume sessions using --resume or --continue, and was introduced starting from v2.1.69. When resuming a session, the injection position of the system-attached information differs from that of a newly created session, causing the cache prefix to completely mismatch. As a result, the entire conversation history is read from cache becomes fully rewritten. Subsequent rounds resume normally, but the resume operation itself has already generated substantial additional overhead, and there is currently no external workaround.

The developer estimates that for a long conversation of about 500,000 tokens, Bug 1 adds about $0.04 per request and Bug 2 adds about $0.15 per resume. Combined, the per-request cost can exceed $0.20. Previously, Anthropic engineer Lydia Hallie confirmed that the rate at which users hit the usage limits was “far faster than expected.” In the Reddit comment section, multiple users believe these two caching bugs may be one of the root causes of abnormal usage consumption.

Disclaimer: The information on this page may come from third parties and does not represent the views or opinions of Gate. The content displayed on this page is for reference only and does not constitute any financial, investment, or legal advice. Gate does not guarantee the accuracy or completeness of the information and shall not be liable for any losses arising from the use of this information. Virtual asset investments carry high risks and are subject to significant price volatility. You may lose all of your invested principal. Please fully understand the relevant risks and make prudent decisions based on your own financial situation and risk tolerance. For details, please refer to Disclaimer.
Comment
0/400
No comments