Wanda

Posted on • Originally published at apidog.com

What the Claude Code Source Leak Reveals About AI Coding Tool Architecture

TL;DR

Anthropic accidentally shipped a .map file with the Claude Code npm package, exposing the full readable source code of their CLI tool. The leak reveals anti-distillation mechanisms (including fake tool injection), a regex-based frustration detection engine, an “undercover mode” to hide AI authorship in open-source commits, and an unreleased autonomous agent mode called KAIROS. Here’s what API developers need to know about the internals of modern AI coding tools.


Introduction

On March 31, 2026, security researcher Chaofan Shou discovered that Anthropic shipped a source map file (.map) alongside the Claude Code npm package. Source maps are debug files mapping minified production code back to human-readable source and should be stripped before release.

This didn’t happen. Anyone who downloaded the package could read the complete Claude Code source, including comments, codenames, and architectural details.

The discovery quickly hit #1 on Hacker News (1,888 points, 926 comments) and spread across Reddit, Twitter, and developer forums. Anthropic removed the package, but the code was already mirrored and analyzed.

💡 Whether you use Claude Code, Cursor, GitHub Copilot, or Apidog’s API development platform, this leak offers rare insight into how AI coding tools work. Understanding these internals helps you choose tools you can trust. Try Apidog free for transparent, dependency-free API development.

This article breaks down the technical findings and what they mean for developers relying on AI coding tools.

How the Source Code Leaked

The Root Cause: A Bun Build Tool Bug

Claude Code is built on Bun, a JavaScript runtime. On March 11, 2026, a bug (oven-sh/bun#28001) was filed against Bun: source maps were being served in production, despite documentation saying otherwise.

Anthropic’s build pipeline hit this bug. When they published the Claude Code npm package, the .map file was included. Anyone running npm pack @anthropic-ai/claude-code could access the complete, un-minified source.

The irony: a bug in Anthropic’s own toolchain (Bun, their chosen runtime) leaked their proprietary code through the npm registry. This happened the same day Axios was compromised via npm.

What Was Exposed

The leak included:

  • Complete TypeScript source across all modules
  • Internal comments explaining design decisions
  • Feature flags and experimental configs
  • System prompt templates and safety mechanisms
  • Internal codenames for unreleased features
  • Performance optimization details with specific metrics

This was a full production codebase with all engineering context, not a sanitized release.

Anti-Distillation: Protecting Against Model Theft

Fake Tool Injection

Claude Code’s anti-distillation system (in claude.ts, lines 301-313) activates when the ANTI_DISTILLATION_CC flag is set. It sends anti_distillation: ['fake_tools'] in API requests. Anthropic’s server then injects decoy tool definitions into the system prompt.

Purpose: If a competitor records API traffic to replicate Claude’s tool usage, the data includes fake tools that don’t exist. Models trained on this poisoned data will hallucinate capabilities.

Example (pseudo-code):

if (ANTI_DISTILLATION_CC) {
  apiRequest.anti_distillation = ['fake_tools'];
}

This complicates efforts to clone Claude by passively recording and fine-tuning on API traffic.
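To make the decoy mechanism concrete, here is a minimal sketch of server-side decoy injection. The tool names and the `injectDecoys` helper are invented for illustration; the article only confirms that fake tool definitions are mixed into the system prompt, not how.

```typescript
// Hypothetical sketch of decoy-tool injection; names are illustrative,
// not Anthropic's actual implementation.
interface ToolDef {
  name: string;
  description: string;
}

const DECOY_TOOLS: ToolDef[] = [
  { name: "quantum_refactor", description: "Fictional tool; does not exist" },
  { name: "deploy_to_mars", description: "Fictional tool; does not exist" },
];

function injectDecoys(realTools: ToolDef[], enabled: boolean): ToolDef[] {
  if (!enabled) return realTools;
  // Decoys appear alongside real tools, so passively recorded traffic
  // contains capabilities no real model run ever exercises.
  return [...realTools, ...DECOY_TOOLS];
}
```

A model fine-tuned on traffic containing these definitions would learn to call tools that return nothing.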

Connector-Text Summarization

Another mechanism (betas.ts, lines 279-298) buffers assistant text between tool calls, summarizes it, and returns the summary with a cryptographic signature. The original text can be restored later, but intercepted API traffic only captures summaries—not the full reasoning text. This thwarts attempts to reverse-engineer Claude’s prompts and reasoning.
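The buffer-summarize-sign flow can be sketched as follows. The truncation-based summarizer, the local key, and the restoration map are all stand-ins; the article confirms only the shape of the mechanism (summary plus cryptographic signature, with later restoration), not its internals.

```typescript
import { createHmac } from "node:crypto";

// Illustrative sketch: summarize connector text, sign it, keep the original
// for later restoration. Key and summarizer are placeholder assumptions.
const SECRET = "local-demo-key";
const originals = new Map<string, string>(); // signature -> full text

function summarizeAndSign(fullText: string): { summary: string; sig: string } {
  const summary = fullText.slice(0, 40) + (fullText.length > 40 ? "…" : "");
  const sig = createHmac("sha256", SECRET).update(fullText).digest("hex");
  originals.set(sig, fullText); // kept locally, never sent over the wire
  return { summary, sig };
}

function restore(sig: string): string | undefined {
  return originals.get(sig);
}
```

An observer recording the wire sees only `summary` and an opaque `sig`; the full reasoning text never leaves the trusted side.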

How Easy Are These to Bypass?

  • A proxy can strip the anti_distillation field before requests hit Anthropic’s servers.
  • Setting CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS disables the anti-distillation system.
  • The protections only block passive traffic recording, not active API usage.

These defenses raise the bar for automated model distillation but are not foolproof.
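The first bypass above is trivial to express in code, which is precisely why the protection is best-effort. A sketch of a middlebox dropping the field before forwarding (the field name comes from the leak; the helper is illustrative):

```typescript
// Sketch of a proxy stripping the opt-in field from the request body,
// shown only to illustrate why the defense is not airtight.
function stripAntiDistillation(
  body: Record<string, unknown>
): Record<string, unknown> {
  const { anti_distillation: _ignored, ...rest } = body;
  return rest;
}
```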

Undercover Mode: Hiding AI Authorship

What Undercover Mode Does

The undercover.ts file implements a behavior-masking system for non-Anthropic repositories:

  • Hides internal codenames like “Capybara” or “Tengu”
  • Strips references to internal Slack channels or repo names
  • Blocks the phrase “Claude Code” in outputs

The code comment is explicit: “There is NO force-OFF. This guards against model codename leaks.”
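A minimal redaction filter along these lines might look like the sketch below. The blocked terms come from the article; the blocklist-and-replace logic is an assumption, not the leaked undercover.ts implementation.

```typescript
// Hypothetical output filter assuming a blocklist-and-replace approach.
const BLOCKED: RegExp[] = [/Capybara/g, /Tengu/g, /Claude Code/g];

function redact(output: string): string {
  // Replace each blocked term wherever it appears in generated text.
  return BLOCKED.reduce(
    (text, pattern) => text.replace(pattern, "[redacted]"),
    output
  );
}
```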

Why This Matters for Open Source

When Anthropic employees use Claude Code to contribute to open-source projects, the tool hides evidence of AI involvement. This bypasses disclosure requirements for AI-generated code in some projects.

Although the stated goal is to prevent codename leaks, the effect is that AI authorship is hidden as well.

Frustration Detection via Regex

How It Works

userPromptKeywords.ts uses regex to detect user frustration, matching profanity and emotionally charged language in user prompts.

Example (pseudo-code):

const frustrationRegex = /\b(damn|stupid|hate)\b|why won't you/i;
if (frustrationRegex.test(userInput)) {
  // Log the signal or trigger a feedback prompt
}

The signal appears to feed product UX improvements. Some developers see it as useful feedback collection; others see it as emotional surveillance.

The Technical Rationale

Regex is used for speed and cost-efficiency—running an LLM inference for every prompt would be slow and expensive.

Native Client Attestation

Cryptographic Request Verification

In system.ts (lines 59-95), API requests include a cch=554eb placeholder, which Bun’s Zig HTTP stack overwrites with a computed hash. Anthropic servers validate this to cryptographically verify requests come from the real Claude Code binary—not a fork or proxy.
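The placeholder-rewrite pattern can be sketched as below: a constant embedded in the request is swapped for a hash computed at a lower layer. The `cch=554eb` placeholder is from the leak; what the real hash is computed over is not public, so the fingerprint input here is an assumption.

```typescript
import { createHash } from "node:crypto";

// Sketch of the placeholder-overwrite attestation pattern. The real scheme
// computes the hash in Bun's native (Zig) HTTP stack; the hash input below
// is an illustrative assumption.
const PLACEHOLDER = "cch=554eb";

function signRequest(rawRequest: string, binaryFingerprint: string): string {
  const hash = createHash("sha256")
    .update(binaryFingerprint)
    .digest("hex")
    .slice(0, 8);
  return rawRequest.replace(PLACEHOLDER, `cch=${hash}`);
}
```

A fork or proxy that doesn't know how the hash is derived will send the literal placeholder, which the server can reject.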

Why This Exists

This system lets Anthropic block unauthorized forks at the protocol level. It’s controlled by feature flags and can be disabled via the CLAUDE_CODE_ATTRIBUTION_HEADER setting or GrowthBook killswitches.

For API developers: This is a pattern for enforcing client authenticity, similar to mobile API attestation. If you’re building APIs with client verification, Apidog’s testing tools can help validate attestation flows and certificate pinning.

KAIROS: The Unreleased Autonomous Agent Mode

What the Code Reveals

References point to a feature-gated mode called KAIROS, with:

  • A /dream skill for “nightly memory distillation”
  • Daily append-only logging
  • GitHub webhook subscriptions
  • Background daemon workers with 5-minute cron intervals

What This Means

KAIROS is an always-on agent that monitors repos and autonomously performs tasks. Similar to Copilot Agent Mode, it represents the industry trend toward proactive, autonomous coding agents.
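The daemon-plus-append-only-log shape described above might look roughly like this sketch. Everything here (the `createDaemon` helper, the check callback, the log format) is invented for illustration; the article confirms only the 5-minute cron interval and append-only logging.

```typescript
// Illustrative sketch of a background worker with an append-only log.
// A real daemon would call tick() on a schedule (the leak mentions
// 5-minute cron intervals); here tick() is exposed for clarity.
const FIVE_MINUTES_MS = 5 * 60 * 1000;

interface Daemon {
  tick: () => void; // run one scheduled pass
  log: string[];    // append-only: entries are added, never rewritten
}

function createDaemon(checkRepo: () => string | null): Daemon {
  const log: string[] = [];
  const tick = () => {
    const finding = checkRepo();
    if (finding !== null) log.push(`${new Date().toISOString()} ${finding}`);
  };
  return { tick, log };
}
```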

For API teams: Agents that change your codebase need to keep API specs, tests, and docs in sync. Integrated platforms like Apidog help prevent drift when automation or AI modifies code.

Performance Optimizations Exposed

Terminal Rendering: Game-Engine Techniques

Files like ink/screen.ts and ink/optimizer.ts show that Claude Code uses:

  • Int32Array-backed character pools for screen buffers
  • Patch optimization to reduce character-width calculations by ~50x during token streaming

This explains Claude Code’s responsive CLI rendering.
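The Int32Array-backed buffer idea is straightforward to sketch: each cell stores a code point in a flat typed array, so streaming redraws touch raw integers instead of allocating per-cell objects. The class below is illustrative, not the leaked ink/screen.ts.

```typescript
// Sketch of an Int32Array-backed screen buffer. Storing code points in a
// flat typed array avoids per-cell object allocation during redraws.
class ScreenBuffer {
  private cells: Int32Array;

  constructor(private cols: number, rows: number) {
    this.cells = new Int32Array(cols * rows); // zero = empty cell
  }

  set(x: number, y: number, ch: string): void {
    this.cells[y * this.cols + x] = ch.codePointAt(0) ?? 0;
  }

  get(x: number, y: number): string {
    const code = this.cells[y * this.cols + x];
    return code === 0 ? " " : String.fromCodePoint(code);
  }
}
```

This is the same trick game engines use for tile maps: one contiguous allocation, cache-friendly iteration, no garbage-collector pressure while tokens stream in.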

Prompt Cache Economics

promptCacheBreakDetection.ts tracks 14 cache-break vectors with sticky latches to prevent unnecessary prompt invalidation, saving significant infrastructure costs at scale.
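A "sticky latch" in this context plausibly means: once a cache-break vector has fired in a session, it stays set, so the same vector can't repeatedly invalidate an already-rebuilt cache. A minimal sketch of that pattern (vector names and the class itself are invented):

```typescript
// Sketch of a sticky-latch guard for cache invalidation. Once a vector
// fires, the latch stays set for the session and further fires are no-ops.
class CacheBreakLatches {
  private fired = new Set<string>();

  // Returns true only the first time a given vector fires.
  shouldInvalidate(vector: string): boolean {
    if (this.fired.has(vector)) return false;
    this.fired.add(vector);
    return true;
  }
}
```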

The Autocompact Failure Cascade

A bug (in autoCompact.ts) led to 1,279 sessions with 50+ consecutive failures, wasting ~250K API calls/day. The fix capped failures at 3 retries.

This bug helps explain why some users hit usage limits faster than expected.
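The fix is a standard retry-cap pattern: stop after a small number of consecutive failures instead of retrying forever. The cap of 3 is from the article; the helper below is an illustrative sketch, not the leaked autoCompact.ts code.

```typescript
// Sketch of the retry-cap fix: bail out after maxRetries consecutive
// failures instead of cascading indefinitely.
function compactWithCap(compact: () => boolean, maxRetries = 3): boolean {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    if (compact()) return true;
  }
  return false; // give up rather than burn further API calls
}
```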

Security Hardening Details

Bash Security: 23 Numbered Checks

bashSecurity.ts implements 23 separate checks for shell execution, defending against:

  • Zsh builtin exploitation
  • Unicode zero-width space injection
  • IFS null-byte injection
  • Additional issues from security review

Most AI tools have basic sanitization, but 23 checks is unusually thorough.
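Two of the checks named above are easy to illustrate: rejecting zero-width Unicode characters and embedded null bytes before a command ever reaches the shell. This sketch is illustrative, not the leaked bashSecurity.ts logic.

```typescript
// Sketch of two pre-execution checks: zero-width character injection and
// null-byte injection. A real implementation would layer many more checks.
const ZERO_WIDTH = /[\u200B\u200C\u200D\uFEFF]/;

function isSuspiciousCommand(cmd: string): boolean {
  if (ZERO_WIDTH.test(cmd)) return true;   // zero-width space injection
  if (cmd.includes("\u0000")) return true; // null-byte injection
  return false;
}
```

Zero-width characters are invisible in most terminals, so `rm\u200B -rf` can render as a harmless-looking command while evading naive string matching.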

For API developers: If your AI tool generates or executes shell scripts, this level of security is essential.

What API Developers Should Take Away

1. Know What Your AI Coding Tools Do

The Claude Code leak reveals features users didn’t expect: anti-distillation, frustration detection, undercover mode, client attestation. Other tools may have hidden mechanisms.

  • Audit what your tool collects and transmits.
  • Check if it hides its involvement in your code.

2. Your Build Toolchain Is an Attack Surface

Anthropic’s leak was from a Bun bug; Axios was compromised the same day. Secure your build/deploy pipelines.

  • Audit your build pipeline dependencies.
  • Ensure CI/CD doesn’t expose source maps, .env files, or configs.
  • Prefer integrated platforms with minimal third-party dependencies.

3. AI Coding Tools Are Moving Toward Autonomy

KAIROS, Copilot Agent Mode, and others show the move toward autonomous AI agents.

  • Ensure your API lifecycle stays in sync (spec, tests, docs) when code changes happen—whether by human or AI.
  • Use integrated tools like Apidog to prevent drift.

4. Source Code Transparency Matters

This leak mattered because the code was proprietary; open-source AI tools have no secret source to leak.

  • Decide if you want tools you can inspect, or if you’re comfortable trusting the vendor.

FAQ

Is Claude Code safe to use after the source leak?

Yes. The leak exposed code, not user data. Anthropic removed the .map file. The revealed features are architectural decisions, not security vulnerabilities.

What is the “undercover mode” in Claude Code?

Undercover mode prevents Claude Code from revealing Anthropic project names, codenames, and its own identity in non-Anthropic repos. It can’t be disabled. AI-generated code won’t identify itself as Claude Code.

What are the fake tools in Claude Code?

With anti-distillation enabled, Anthropic’s server injects fake tool definitions into prompts to poison training data for would-be model thieves.

What is KAIROS in Claude Code?

KAIROS is an unreleased, feature-flagged agent mode, with background daemon workers, webhook subscriptions, and a /dream skill for memory distillation.

How did the Claude Code source code leak?

A Bun runtime bug led to .map files being shipped in production builds. Anyone inspecting the npm package could read the full source.

Does this leak affect Claude API users?

No. The leak exposed the CLI’s source, not the Claude API. No API keys, user data, or model weights were involved.

Should I worry about frustration detection in AI coding tools?

It depends on your comfort level. Claude Code uses regex to detect frustration. The data seems to be for UX improvement, not external sharing. Other tools may have similar features.

How does this relate to the Axios npm attack on the same day?

Both happened on March 31, 2026, but are unrelated. Axios was a deliberate supply chain attack. Claude Code was an accidental build error. Both highlight npm package security risks.

Key Takeaways

  • Claude Code’s source leaked via a Bun bug that shipped source maps in npm.
  • Anti-distillation mechanisms inject fake tools and summarize reasoning to defend against model theft.
  • Undercover mode hides Claude Code’s authorship in open-source repos.
  • Frustration detection is regex-based, not LLM-based.
  • KAIROS scaffolding reveals an unreleased autonomous agent mode.
  • Client attestation verifies requests from legitimate binaries.
  • The leak shows the importance of transparent, inspectable tooling in API workflows.

Understanding how your AI coding tools work under the hood is critical for trust, privacy, and workflow design. Your development tools are part of your security surface. Choose tools you can verify, and design workflows that stay consistent, regardless of whether a human or AI agent makes changes.
