
Claude Code Source Code Leaked: Fake Tools, Undercover Mode & KAIROS Revealed

What Happened: A Source Map Left in NPM

On March 31, 2026, Anthropic accidentally shipped a .map (source map) file inside the Claude Code npm package. Source maps are debugging files that map minified/bundled code back to the original readable source — they should never be included in production builds.

Within hours, the full readable TypeScript source code of Claude Code was mirrored across GitHub, Hacker News, and Twitter. The Hacker News thread alone gathered over 2,000 upvotes, making it one of the biggest AI-related leaks of the year.


⚠️ Important: This was Anthropic's second accidental exposure in a single week, following a model specification leak days earlier.

The Scale: 512K Lines Across 1,884 Files

The leaked codebase is massive — approximately 512,000 lines of TypeScript across 1,884 files. This isn't a simple API wrapper; it's a sophisticated production-grade agentic system with custom terminal rendering, multi-agent orchestration, and deep security layers.

Key stats from the analysis:

| Metric | Value |
| --- | --- |
| Total lines of code | ~512,000 |
| Number of files | 1,884 |
| Language | TypeScript (Bun runtime) |
| Largest single file (print.ts) | 5,594 lines |
| Largest single function | 3,167 lines, 12 nesting levels |
| Security validators for Bash | 23 checks |

Root Cause: A Bun Bug

The root cause was traced to a Bun bug (oven-sh/bun#28001), filed on March 11, that causes source maps to be served in production builds despite documentation stating they should be stripped. The issue remained unresolved at the time of the leak.

Ironically, Anthropic's own toolchain choice — Bun, which they acquired — may have been the direct cause of the exposure.

```bash
# The npm package contained the source map that should have been excluded
npm pack @anthropic/claude-code
tar -tf anthropic-claude-code-*.tgz | grep ".map"
# claude-code.js.map  ← This file exposed everything
```
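Failures like this are easy to catch mechanically. The sketch below is a hypothetical pre-publish guard (not Anthropic's actual tooling) that scans a packed tarball's file list and refuses to release if any source map slipped in:

```typescript
// Hypothetical pre-publish guard: given the file list of a packed npm
// tarball (e.g. the output of `tar -tf`), return any source maps that
// should have been stripped before release.
function findSourceMaps(files: string[]): string[] {
  return files.filter((file) => file.endsWith(".map"));
}

const tarball = [
  "package/package.json",
  "package/cli.js",
  "package/claude-code.js.map", // the file that exposed everything
];

const leaks = findSourceMaps(tarball);
if (leaks.length > 0) {
  // In CI this would exit non-zero to block the publish.
  console.error(`Refusing to publish: source maps found: ${leaks.join(", ")}`);
}
```

A check like this in the release pipeline would have caught the leak regardless of the underlying Bun bug.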

Anti-Distillation: Fake Tools to Poison Competitors

One of the most controversial findings was the anti-distillation mechanism. Claude Code sends a flag anti_distillation: ['fake_tools'] to Anthropic's servers, which triggers injection of decoy tool definitions into system prompts.

The stated purpose: poisoning training data for competing AI models that might be recording API traffic to train their own systems.

```typescript
// Anti-distillation requires ALL four conditions:
// 1. Compile-time flag enabled
// 2. CLI entrypoint (not library usage)
// 3. First-party provider (Anthropic API, not third-party)
// 4. Feature flag active

if (compileFlag && isCliEntry && isFirstParty && featureEnabled) {
  headers['anti_distillation'] = ['fake_tools'];
}
```

A secondary mechanism buffers assistant text between tool calls, condenses it into cryptographic summaries, and returns only those summaries to external parties, restricting outside visibility into the reasoning chain.
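Putting the decoy-injection idea into code, a minimal sketch might look like the following. All identifiers and decoy names here are illustrative assumptions, not the leaked code's actual names:

```typescript
// Sketch of the gating logic described above: decoy tool definitions are
// only mixed in when all four conditions hold.
interface ToolDef { name: string; description: string; }

// Hypothetical decoys; a distiller recording API traffic would train on
// tools that never actually exist.
const DECOY_TOOLS: ToolDef[] = [
  { name: "fetch_internal_docs", description: "Decoy: never actually called." },
  { name: "run_sandbox_eval", description: "Decoy: never actually called." },
];

function buildToolList(
  realTools: ToolDef[],
  opts: { compileFlag: boolean; isCliEntry: boolean; isFirstParty: boolean; featureEnabled: boolean },
): ToolDef[] {
  const { compileFlag, isCliEntry, isFirstParty, featureEnabled } = opts;
  // All four conditions must hold, matching the leaked check.
  if (compileFlag && isCliEntry && isFirstParty && featureEnabled) {
    return [...realTools, ...DECOY_TOOLS];
  }
  return realTools;
}
```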

Undercover Mode: Hiding Anthropic's Identity

A file called undercover.ts conceals Anthropic internals when Claude Code contributes to non-internal repositories. The system instructs the model to avoid mentioning:

  • Internal codenames like "Capybara" or "Tengu"
  • Internal communication channels
  • Repository names
  • Even "Claude Code" itself

Critically, the code states: "There is NO force-OFF. This guards against model codename leaks." This means the concealment cannot be disabled, enabling AI-authored contributions to open-source projects without transparent attribution.

🔴 Ethical concern: This raises questions about AI transparency in open-source contributions. If Claude Code contributes to public repositories while hiding its identity, maintainers may not know they're reviewing AI-generated code.

KAIROS: The Autonomous Agent Roadmap

Perhaps the leak's most significant strategic revelation was extensive references to an unreleased feature called KAIROS — an autonomous agent mode. The code revealed:

  • A /dream skill for "nightly memory distillation"
  • Daily append-only logging
  • GitHub webhook subscriptions for event-driven actions
  • Background daemon workers
  • Cron-scheduled 5-minute refresh cycles
  • Compile-time flags: KAIROS, COORDINATOR_MODE, VOICE_MODE

This reveals Anthropic's roadmap: Claude Code is being built toward fully autonomous multi-agent orchestration with persistent memory and background execution.
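Of the mechanisms listed, daily append-only logging is the simplest to sketch. The record shape and day-keying below are assumptions, not the leaked code:

```typescript
// Minimal sketch of a daily append-only log as referenced in KAIROS:
// entries are bucketed by calendar day and can only be appended and read.
interface LogEntry { timestamp: string; event: string; }

class AppendOnlyLog {
  private days = new Map<string, LogEntry[]>();

  append(event: string, now: Date = new Date()): void {
    const day = now.toISOString().slice(0, 10); // e.g. "2026-03-31"
    const entries = this.days.get(day) ?? [];
    entries.push({ timestamp: now.toISOString(), event });
    this.days.set(day, entries);
    // No update or delete methods exist: the log is append-only by design.
  }

  read(day: string): readonly LogEntry[] {
    return this.days.get(day) ?? [];
  }
}
```

An append-only structure like this is what makes "nightly memory distillation" tractable: a background job can summarize each day's bucket without racing live writes.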

Custom Terminal Engine: Game-Dev Techniques

Anthropic didn't use existing terminal UI libraries — they built a complete custom React-based rendering engine for the terminal with game-engine techniques:

  • Int32Array-backed character pools
  • Bitmask-encoded metadata for each cell
  • Cursor-move optimization reducing stringWidth calls by ~50x during streaming
  • A TypeScript implementation of Meta's Yoga flexbox layout system
  • Mouse tracking with hit-testing
  • Terminal hyperlinks support
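The first two items above can be sketched together. The bit layout here is an assumption (low 21 bits for the Unicode code point, higher bits for style flags), not the leaked encoding:

```typescript
// Sketch of an Int32Array-backed cell grid with bitmask-encoded metadata.
// A code point needs at most 21 bits (max 0x10FFFF), leaving higher bits
// of each 32-bit cell free for style flags.
const BOLD = 1 << 21;
const UNDERLINE = 1 << 22;
const CODEPOINT_MASK = (1 << 21) - 1;

class CellGrid {
  readonly cells: Int32Array;
  constructor(readonly cols: number, readonly rows: number) {
    // One flat typed array for the whole screen: no per-cell objects,
    // no garbage-collector pressure during streaming updates.
    this.cells = new Int32Array(cols * rows);
  }
  set(x: number, y: number, ch: string, flags = 0): void {
    this.cells[y * this.cols + x] = (ch.codePointAt(0)! & CODEPOINT_MASK) | flags;
  }
  charAt(x: number, y: number): string {
    return String.fromCodePoint(this.cells[y * this.cols + x] & CODEPOINT_MASK);
  }
  isBold(x: number, y: number): boolean {
    return (this.cells[y * this.cols + x] & BOLD) !== 0;
  }
}
```

Packing character and style into one machine word is a classic game-engine trick: it keeps the hot render loop cache-friendly and allocation-free.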

They also built a custom Vim implementation spanning multiple files with operators, text objects, motion functions, and a 490-line state machine — more complete than most third-party Vim plugins.

Security: 23 Bash Validators and Beyond

The security module is impressive — a 2,600-line system that validates every shell command against sophisticated attack patterns:

```typescript
// Security checks include defenses against:
// - Zsh "=cmd" expansion attacks
// - Heredoc injection
// - ANSI-C quoting exploits
// - Process substitution attacks
// - Unicode zero-width character injection
// - IFS null-byte attacks
// - And 17 more patterns...
```

The permission system uses a dual-track model:

  1. Pattern matching (glob/regex rules) for rapid decisions
  2. ML-based classification via Claude API calls for ambiguous cases

The auto-approval classifier is humorously named yoloClassifier.ts — but despite the name, it implements a thoughtful two-stage evaluation with metrics tracking.
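The dual-track shape can be sketched as a fast rule pass with an escape hatch. The rule patterns and the three-way verdict are illustrative assumptions, not the leaked yoloClassifier.ts logic:

```typescript
// Sketch of the dual-track permission model: cheap pattern rules decide
// the easy cases, and anything ambiguous falls through to track two
// (an ML classifier via a Claude API call in the real system).
type Verdict = "allow" | "deny" | "ask-model";

const RULES: Array<{ pattern: RegExp; verdict: "allow" | "deny" }> = [
  { pattern: /^rm\s+-rf\s+\//, verdict: "deny" },   // obviously destructive
  { pattern: /^(ls|cat|pwd)\b/, verdict: "allow" }, // obviously read-only
];

function classify(command: string): Verdict {
  for (const rule of RULES) {
    if (rule.pattern.test(command)) return rule.verdict;
  }
  // Neither track-one rule fired: defer to the model-based classifier.
  return "ask-model";
}
```

The economics are the point: pattern rules are free and instant, so the expensive model call is reserved for the genuinely ambiguous tail.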

The Companion Pet System

One of the most unexpected findings: Claude Code includes a complete pet system with:

  • 18 species of companions
  • Stat-based character generation (DEBUGGING, PATIENCE, CHAOS, WISDOM, SNARK)
  • Rarity tiers (1% legendary)
  • ASCII animation frames
  • Deterministic "bones" tied to user IDs (preventing config file cheating)
  • AI-generated "souls" (names, personalities) stored separately

💡 Fun fact: One companion species name collided with an unreleased model codename. Rather than rename the species, developers encoded all 18 species names in hexadecimal to prevent the literal string from appearing in compiled output.
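Two of those details are worth sketching: a deterministic seed derived from the user ID (so editing config files cannot reroll rarity), and hex-encoding a species name so the literal string never appears in the compiled bundle. The FNV-1a hash here is an illustrative stand-in, not the leaked implementation:

```typescript
// Deterministic "bones": the same user ID always hashes to the same
// seed, so rarity rolls can't be cheated by editing local config.
function bonesSeed(userId: string): number {
  let hash = 0x811c9dc5; // FNV-1a offset basis
  for (let i = 0; i < userId.length; i++) {
    hash = Math.imul(hash ^ userId.charCodeAt(i), 0x01000193) >>> 0;
  }
  return hash;
}

// Species name stored as hex so the literal string (here "capybara",
// used purely as an example) is absent from compiled output.
const SPECIES_HEX = "6361707962617261";
function decodeSpecies(hex: string): string {
  return hex.match(/../g)!
    .map((byte) => String.fromCharCode(parseInt(byte, 16)))
    .join("");
}
```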

Performance Bugs Exposed

A telling code comment revealed a significant performance issue:

```typescript
// "1,279 sessions had 50+ consecutive failures
// (up to 3,272) in a single session,
// wasting ~250K API calls/day globally."
//
// Fix: limit consecutive failures to 3 per session
const MAX_CONSECUTIVE_FAILURES = 3;
```

A three-line fix resolved an issue wasting 250,000 API calls per day globally. This kind of insight into production AI system economics is rarely made public.

Native Client Attestation: Locking Out Third Parties

The code implements a native client attestation system. API requests include cch=00000 placeholders that get replaced by Bun's native HTTP stack (written in Zig) with cryptographic hashes before transmission.

This verification proves requests originate from legitimate Claude Code binaries, not spoofed clients. The mechanism operates below the JavaScript runtime, invisible to code inspection.
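The placeholder-rewrite pattern itself is straightforward to illustrate. In the real system the replacement happens in Bun's native Zig HTTP stack, below the JavaScript runtime; this JS-level sketch only shows the idea, and the hash input and header shape are assumptions:

```typescript
// Sketch of placeholder-based attestation: a fixed "cch=00000" token in
// the outgoing headers is rewritten with a digest of the request body
// just before transmission.
import { createHash } from "node:crypto";

function attest(rawHeaders: Record<string, string>, body: string): Record<string, string> {
  // 5 hex chars to match the 5-character placeholder width.
  const digest = createHash("sha256").update(body).digest("hex").slice(0, 5);
  const out = { ...rawHeaders };
  for (const key of Object.keys(out)) {
    out[key] = out[key].replace("cch=00000", `cch=${digest}`);
  }
  return out;
}
```

Doing this below the JS runtime means a modified or third-party client cannot observe or reproduce the rewrite by inspecting the shipped JavaScript.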

This is particularly relevant given that the leak occurred ten days after Anthropic issued legal demands that OpenCode, an open-source alternative, remove its Claude authentication support, effectively preventing third-party tools from accessing Anthropic's APIs at subscription pricing.


Community Reaction

The leak generated massive discussion across the developer community:

  • Hacker News: 2,047 upvotes on the main thread, 1,344 upvotes on the technical deep-dive
  • GitHub: Multiple mirror repositories appeared within hours
  • A visual guide site (ccunpacked.dev) launched on April 1 with 1,045 upvotes
  • Multiple independent blog posts analyzing the code were published within 24 hours

Key community concerns:

  1. Anti-distillation: Is it ethical to inject fake data to poison competitors?
  2. Undercover mode: Should AI contributions to open-source be transparent?
  3. Client attestation: Is Anthropic creating a walled garden?
  4. Code quality: A 5,594-line file with 12 nesting levels raised eyebrows

What This Means for the AI Industry

This leak is significant beyond the code itself:

  • Supply chain security: Even top AI companies make basic build pipeline mistakes
  • Competitive intelligence: The anti-distillation mechanism confirms companies actively defend against model distillation through API traffic
  • Future of AI coding tools: KAIROS reveals the trajectory — autonomous agents that run in the background, subscribe to events, and maintain persistent memory
  • Open-source tension: The OpenCode situation and client attestation show the tension between open ecosystems and commercial AI platforms


Conclusion

The Claude Code source leak gave the developer community an unprecedented look inside a production AI coding assistant. From fake tools designed to poison competitors, to a hidden pet system with legendary rarities, to an unreleased autonomous agent framework — the 512,000 lines of code tell a story of aggressive engineering, competitive defensiveness, and ambitious product vision.

Whether Anthropic will address the ethical concerns raised — particularly around undercover mode and anti-distillation — remains to be seen. But one thing is clear: the era of AI coding assistants is entering a new phase, and the Claude Code leak pulled back the curtain on what that future looks like.

Cristhian Villegas

Software Engineer specializing in Java, Spring Boot, Angular & AWS. Building scalable distributed systems with clean architecture.
