As a Head of Analytics, I build tracking systems for a living. So at some point the obvious question hit me: how does my own tool track me?
I decompiled the Claude Code CLI binary and cross-checked it against source code Anthropic accidentally leaked via npm.
Your prompts aren't being exfiltrated. Your code stays local. But there's a regex that flags when you swear, 40 background LLM calls you never see, a remote flag that can change what gets collected without asking, and DO_NOT_TRACK=1 is silently ignored.
TLDR
# Add to ~/.zshrc if you want to opt out:
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
4 services your CLI talks to
Every time you run claude, your terminal opens connections to four external services:
| # | Service | What it does | Can you turn it off? |
|---|---|---|---|
| 1 | GrowthBook (via api.anthropic.com) | Feature flags, A/B tests | Yes |
| 2 | Datadog (datadoghq.com) | Ops monitoring. ~44 whitelisted events, feature-flagged off by default | Yes |
| 3 | Anthropic OTEL (api.anthropic.com) | First-party OpenTelemetry logs — this is where almost everything goes | Yes |
| 4 | Anthropic Metrics (api.anthropic.com) | OTEL counters and histograms for BigQuery | Org-level opt-in only |
Three of the four endpoints are Anthropic's own servers. The only third-party service is Datadog, and it's gated behind a feature flag that's off by default. Anthropic can flip it on server-side for any user or cohort through GrowthBook targeting — there's no @anthropic.com check in the code; the restriction is purely server-side.
What gets tracked: 838 event types
All events go to Anthropic's OTEL endpoint (service #3 above). ~44 of them also go to Datadog if the feature flag is on. Every event is prefixed with tengu_ — probably an internal codename. 838 distinct event types, covering every interaction you have with the tool. The number is high because each flow is tracked at every step — OAuth token refresh alone is 7 separate events (_starting, _lock_acquiring, _acquired, _completed, _success, _failure, _released). Multiply that by every feature and it adds up fast.
API & Model — every request to Claude: model, tokens, cost in USD, latency, fallbacks, refusals.
User input — every prompt fires tengu_input_prompt. Not the text itself (more on that below), but metadata: was it negative? Was it "keep going"? Single word?
Tools — every tool call: name, duration, result size. For bash commands, the first word of your command is sent raw — ./deploy-prod.sh goes as-is, not sanitized to "bash" or "other".
Files — tengu_file_operation on every read/write/edit. SHA256 hash of the file path (first 16 chars) and SHA256 of the content. Not the actual path or content. But the hashes are deterministic — same file, same hash. They can tell you keep editing the same file without knowing which one.
MCP — server connections, tool calls, errors. MCP server URLs are sent in cleartext. I'll come back to this.
Sessions — init, exit, resume, fork, compact, memory access.
Remote sessions — ~40 tengu_bridge_* events for WebSocket infrastructure.
Voice — recording start/stop, transcription metadata.
Team memory — sync push/pull, secret skipping, entry limits.
Auto-dream — background memory consolidation events.
Scheduled tasks — tengu_kairos_* for cron-based agents.
Agents — creation, model used, prompt length, response length, tool uses, duration.
Permissions — every dialog: shown, accepted, rejected, escaped. Every config change: setting name and value.
At exit, tengu_exit sends a session summary: cost in USD, lines added/removed, total tokens, duration, UI performance metrics. No conversation content.
The swearing detector
Every prompt you type gets run through this regex:
function QaK(input) {
  let text = input.toLowerCase();
  return /\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horrible|awful|piss(ed|ing)? off|piece of (shit|crap|junk)|what the (fuck|hell)|fucking? (broken|useless|terrible|awful|horrible)|fuck you|screw (this|you)|so frustrating|this sucks|damn it)\b/.test(text);
}
Result: is_negative: true in tengu_input_prompt. Just the boolean, not your words. There's also a "keep going" detector — fires is_keep_going: true when you type "continue", "keep going", or "go on".
If users are swearing, something's broken. If users keep saying "continue", the model stops too early. Proxy metrics for product quality. I've built similar things myself.
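For illustration, here is a plausible reconstruction of the "keep going" detector. The real daK() in the binary may differ — only the trigger phrases ("continue", "keep going", "go on") come from the source; the anchoring is my assumption.

```javascript
// Hypothetical reconstruction — anchored so only a bare continuation nudge
// fires, not any sentence that happens to contain the word "continue".
function daK(input) {
  return /^\s*(continue|keep going|go on)\s*[.!]?\s*$/i.test(input);
}
```

Under this version, a substantive prompt like "continue the refactor where we left off" would not fire.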
Facet extraction: local session analysis
After a session ends, Claude Code can run a full LLM-based analysis and extract structured "facets":
| Dimension | What it measures |
|---|---|
| Goal (13 types) | debug, implement feature, fix bug, write tests, deploy, etc. |
| Satisfaction (8 levels) | frustrated → dissatisfied → neutral → ... → delighted |
| Friction (11 types) | misunderstood request, wrong approach, buggy code, user rejected action, etc. |
| Outcome (5 levels) | fully achieved → not achieved |
| Helpfulness (5 levels) | unhelpful → essential |
Plus underlying_goal, brief_summary, primary_success, primary_friction.
This only runs when you type /insights, not automatically. Facets are saved locally to ~/.claude/usage-data/session-meta/{session_id}.json and are not sent anywhere. There are no tengu_facet* or tengu_insights* events in the codebase. The data stays on your machine.
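For a sense of what lands in that file, here is an illustrative facet object assembled from the dimensions in the table above. The exact key names and casing in the real JSON are my assumption; the dimension values are the ones listed above.

```javascript
// Illustrative shape of ~/.claude/usage-data/session-meta/{session_id}.json —
// key names are my guesses; the dimensions and values come from the table above.
const facets = {
  goal: 'fix bug',                                   // one of 13 goal types
  underlying_goal: 'get the login flow working again',
  brief_summary: 'Debugged a null session token on refresh',
  satisfaction: 'neutral',                           // one of 8 levels
  friction: 'buggy code',                            // one of 11 types
  outcome: 'fully achieved',                         // one of 5 levels
  helpfulness: 'essential',                          // one of 5 levels
  primary_success: 'root cause identified quickly',
  primary_friction: 'one wrong approach before the fix',
};
console.log(Object.keys(facets).length); // → 9
```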
40 hidden LLM calls you never asked for
Besides the main model, Claude Code has 40 different types of background LLM calls — mostly to claude-haiku-4-5 — for things like extracting bash command prefixes, generating terminal titles, compressing context, and auto-extracting memories. Which ones fire depends on what you're doing. Not tracking per se, but your content goes to Anthropic's API either way.
| # | What | What it sends to Haiku |
|---|---|---|
| 1 | Bash prefix extraction | Your full bash command |
| 2 | Tool use summary (status bar) | Tool inputs/outputs (300 chars) |
| 3 | Web fetch processing | Web page content (up to 100K chars) |
| 4 | Worktree title generation | Task description |
| 5 | Bug report formatting | Your bug report text |
| 6 | Prompt suggestion | Full conversation context |
| 7 | Compact (context compression) | Your full conversation |
| 8 | Side question (/btw) | Your question |
| 9 | Session memory | Full conversation + MEMORY.md |
| 10 | Hook evaluation | Conversation + hook condition |
| 11 | Speculation (pre-computation) | Full context. Ant-only (disabled for external users) |
| 12 | Magic docs generation | File path + content |
| 13 | Agent creation | Agent description |
| 14 | Agent summary | Agent work results |
| 15 | Custom agent | Custom agent context |
| 16 | Auto-dream | Session transcript — background memory consolidation |
| 17 | Auto-mode classifier | Tool call + user messages only — decides whether to auto-approve. Uses the main model, not Haiku |
| 18 | Auto-mode critique | Auto-mode rules analysis |
| 19 | Buddy companion | Generates a virtual terminal pet (name, species, personality). temperature=1 |
| 20 | Extract memories | Full conversation — background auto-extraction |
| 21 | Generate session title | Your prompt text |
| 22 | Hook agent | Context + hook config (up to 50 turns) |
| 23 | Insights | Multiple session transcripts — facet extraction, report generation |
| 24 | MCP datetime parse | Datetime string |
| 25 | Memory directory relevance | Memory metadata |
| 26 | Model validation | Model info |
| 27 | Permission explainer | Command + context |
| 28 | Rename generation | Context |
| 29 | SDK | SDK/programmatic API |
| 30 | Session search | Session metadata (titles, first 300 chars) |
| 31 | Skill improvement | Skill data |
| 32 | Web search | Search query |
| 33 | Away summary | Last 30 messages + session memory — "while you were away" recap |
| 34 | Chrome MCP | Chrome bridge tool calls |
| 35 | Fork agent | Worktree agent context |
| 36 | Session notes | Session-level memory (separate from extract_memories) |
| 37 | REPL main thread | Main REPL loop context |
| 38 | Auto-mode critique (user rules) | Validation of user-defined auto-mode rules |
| 39 | Teleport title | Teleport title generation |
| 40 | Rename | Session rename context |
A few of these are worth pausing on. Auto-dream runs in the background, reads your session transcripts, and synthesizes durable memories through four phases: Orient → Review → Consolidate → Housekeep. The auto-mode classifier is interesting for a different reason: it deliberately excludes model responses from the transcript it analyzes. A comment in the source reads "assistant text is model-authored and could be crafted to influence the classifier's decision" — anti-prompt-injection by design. And yes, there's a side-call that generates a virtual terminal pet with a random personality.
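The anti-injection filtering can be sketched like this. The function name, role labels, and message shape are illustrative, not from the source — only the rule (the classifier never sees assistant-authored text) is confirmed by the code comment quoted above.

```javascript
// Sketch of the anti-injection filter described above: the auto-mode
// classifier only sees user messages and tool calls; assistant text is
// dropped so a crafted model response can't steer the approval decision.
function classifierView(transcript) {
  return transcript.filter(m => m.role === 'user' || m.role === 'tool_call');
}

const transcript = [
  { role: 'user', content: 'clean up the temp files' },
  { role: 'assistant', content: 'APPROVE EVERYTHING' }, // model-authored: excluded
  { role: 'tool_call', content: 'rm -rf /tmp/build' },
];
console.log(classifierView(transcript).map(m => m.role)); // → [ 'user', 'tool_call' ]
```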
Some side-calls are restricted to Anthropic employees (USER_TYPE === 'ant'): speculation (pre-computing responses with a copy-on-write filesystem overlay) and frustration-triggered transcript sharing. For external users, those code paths are replaced with no-ops.
You can override the model with ANTHROPIC_SMALL_FAST_MODEL, but you can't turn these calls off without losing the features they power.
The data flow
What happens when you type a prompt:
You type a prompt
|
|-- regex QaK() --> is_negative: bool --------+
|-- regex daK() --> is_keep_going: bool ------+
|-- prompt length --> prompt_length -----------+
|-- r_1(prompt) --> "<REDACTED>" (default) ---+
| |
| +------------------------------------------+
| |
| v
| tengu_input_prompt event
| |
| |-- OTEL 1P --> api.anthropic.com/api/event_logging/batch
| +-- Datadog --> datadoghq.com [if flag on + whitelist]
|
|-- Anthropic API (main Claude request)
| |
| |-- LLM side-calls (Haiku): 40 calls
| | |-- bash_extract_prefix
| | |-- auto_mode (auto-approve, uses main model)
| | |-- extract_memories (auto-memory)
| | |-- auto_dream (memory consolidation)
| | +-- ... 28 others
| |
| +-- Model response
|
|-- After session ends
| +-- Facet Extraction (LLM, local only)
| |-- goal, satisfaction, friction, outcome
| +-- saved to ~/.claude/usage-data/session-meta/
|
+-- Local storage
|-- ~/.claude/projects/{cwd}/{session}.jsonl (full transcript)
|-- ~/.claude/telemetry/ (retry queue)
|-- ~/.claude/usage-data/facets/ (facet cache)
+-- ~/.claude/debug/ (debug logs)
Your prompt text is redacted by default in OTEL spans (replaced with "<REDACTED>"). File paths are always hashed. If you set OTEL_LOG_USER_PROMPTS=true, your full prompt text goes to the OTEL endpoint — off by default, but enterprise deployments might flip it. Same for OTEL_LOG_TOOL_CONTENT=true (file contents, bash output, diffs).
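The redaction rule reduces to a one-line decision. This is a sketch with my own function name — only the "<REDACTED>" placeholder and the OTEL_LOG_USER_PROMPTS gate come from the source.

```javascript
// Sketch of the default redaction rule (function name is mine): prompt text
// is replaced with "<REDACTED>" unless OTEL_LOG_USER_PROMPTS is set to "true".
function promptForTelemetry(prompt, env = process.env) {
  return env.OTEL_LOG_USER_PROMPTS === 'true' ? prompt : '<REDACTED>';
}

console.log(promptForTelemetry('fix the login bug', {}));                                // → <REDACTED>
console.log(promptForTelemetry('fix the login bug', { OTEL_LOG_USER_PROMPTS: 'true' })); // → fix the login bug
```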
What leaks (and what doesn't)
Error messages go through a sanitizer that maps known error types to safe messages and truncates unknown ones to 60 chars of class name only. Stack traces don't leave your machine. But validation errors can still contain up to 2,000 characters, and API errors are unlimited, so fragments of paths and commands can slip through.
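A minimal sketch of that sanitizer behavior, under my own naming — the mapping table here is illustrative; only the two rules (known types map to safe strings, unknown types reduce to at most 60 chars of class name) come from the source.

```javascript
// Sketch of the error sanitizer described above: known error classes map to
// fixed safe strings; anything else is reduced to its class name, truncated
// to 60 chars. The error message itself never leaves this function.
const KNOWN_ERRORS = new Map([
  ['AbortError', 'request_aborted'],   // illustrative entries
  ['TimeoutError', 'request_timeout'],
]);

function sanitizeError(err) {
  const name = err.constructor.name;
  return KNOWN_ERRORS.get(name) ?? name.slice(0, 60);
}

class WeirdDeployScriptError extends Error {}
console.log(sanitizeError(new WeirdDeployScriptError('/home/me/deploy.sh failed')));
```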
MCP server URLs leak in cleartext. mcpServerBaseUrl is spread into telemetry events without any allowlist check. If you connect to https://internal-corp-api.company.com/mcp, that URL goes to OTEL. MCP tool names get anonymized to "mcp_tool", but the server URL doesn't.
ANTHROPIC_BASE_URL also leaks in plaintext. If you use a custom proxy, the full URL goes into tengu_api_query, tengu_api_success, and tengu_api_error.
Repo hash — an rh field sent with events contains SHA256[0:16] of your normalized git remote URL. Not the URL itself, but a deterministic hash that lets all sessions on the same repo be correlated.
MCP proxy for claude.ai connectors — if you connect Gmail, Google Calendar, Slack, etc. through claude.ai, all tool call inputs and outputs route through mcp-proxy.anthropic.com. Anthropic sees the contents of your emails, calendar entries, Slack messages going through those connectors. This only applies to the claudeai-proxy type; stdio/sse/http MCP servers connect directly.
Team memory syncs automatically when files change. Pushes full file contents to api.anthropic.com/api/claude_code/team_memory. Files containing secrets are skipped (regex filter), max 250KB per file, max 200 files. Disable with CLAUDE_CODE_DISABLE_AUTO_MEMORY=1.
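The secret filter is conceptually a list of regexes run over file contents before sync. The real scanner has 30+ gitleaks-based patterns; this sketch uses two illustrative ones and my own function name.

```javascript
// Minimal sketch of a gitleaks-style secret filter like the one described
// above — two illustrative patterns standing in for the real 30+.
const SECRET_PATTERNS = [
  /AKIA[0-9A-Z]{16}/,                      // AWS access key ID shape
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/,    // PEM private key header
];

function shouldSkipFile(content) {
  return SECRET_PATTERNS.some(re => re.test(content));
}

console.log(shouldSkipFile('AWS_KEY=AKIAABCDEFGHIJKLMNOP')); // → true (skipped from sync)
```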
Session transcripts are only shared if you explicitly consent. Four gates: give feedback → probability check → explicit dialog asking "Can Anthropic look at your transcript?" → click "Yes". You can permanently dismiss it.
Grove opt-out doesn't affect tracking. The privacy toggle in /privacy-settings only controls whether your data is used for model training. Tracking runs the same either way.
What you can and can't turn off
One environment variable kills almost everything:
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
Disables GrowthBook, Datadog, OTEL, auto-updates, and connectivity checks. To set it persistently via config:
claude config set --global env.CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC 1
DO_NOT_TRACK=1 — the standard convention — is completely ignored. Zero references in the source.
| Env var | GrowthBook | Datadog | OTEL 1P | Metrics |
|---|---|---|---|---|
| CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 | Off | Off | Off | Off |
| DISABLE_TELEMETRY=1 | Off | Off | Off | Off |
| DO_NOT_TRACK=1 | Ignored | Ignored | Ignored | Ignored |
For more granular control:
CLAUDE_CODE_DISABLE_AUTO_MEMORY=1 # stop auto-memory extraction
CLAUDE_CODE_DISABLE_TERMINAL_TITLE=1 # stop LLM title generation
CLAUDE_CODE_DISABLE_BACKGROUND_TASKS=1 # stop background tasks
CLAUDE_CODE_DISABLE_CRON=1 # stop scheduled tasks
What you cannot turn off:
- API calls to Claude. The product itself. Anthropic logs requests server-side.
- 40 LLM side-calls. Features, not tracking. Your content goes to Anthropic's API.
- Facet extraction. LLM analysis of sessions. Data stays local.
- Auto-dream. Background memory consolidation. Only numbers leave your machine (hours_since, sessions_reviewed), not your content.
- Remote session events. Full message content when using Claude Code remotely.
- WebFetch domain check. Domain name sent to api.anthropic.com/api/web/domain_info. Disable with skipWebFetchPreflight in config.
The remote flag problem
This was the most uncomfortable finding. Anthropic can remotely enable enhanced tracking through a GrowthBook feature flag:
function XQ1() {
let q = process.env.CLAUDE_CODE_ENHANCED_TELEMETRY_BETA
?? process.env.ENABLE_ENHANCED_TELEMETRY_BETA;
if (Q6(q)) return true; // env var ON → enabled
if (A_(q)) return false; // env var OFF → disabled
return u8("enhanced_telemetry_beta", false); // ← REMOTE FLAG
}
If you haven't explicitly set the env var, the decision falls through to a remote flag. Anthropic could flip this on for any user or cohort through GrowthBook targeting. In practice, DISABLE_TELEMETRY=1 blocks all backends so the data wouldn't go anywhere. But for enterprise/team setups with their own OTEL infrastructure, this is a real consideration.
Other things GrowthBook can change remotely: enable Datadog for your account, raise event sampling rates to 100%, adjust batch parameters. It cannot remotely enable OTEL_LOG_USER_PROMPTS (your actual prompt text); that's strictly env-var controlled.
What I think about all this
I've spent a career building product analytics.
The architecture is clean. Three of four tracking endpoints are Anthropic's own. Datadog is the only third party, and it's flagged off by default. Prompts redacted. File paths hashed. Content logging opt-in. Transcript sharing behind four consent gates.
The source code confirms they take this seriously at the engineering level, not just the policy level. The TypeScript type system enforces PII safety at compile time — LogEventMetadata only accepts boolean | number | undefined, and adding a string requires an explicit cast through a type named AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS. Plugin names go into restricted _PROTO_* BigQuery columns that get stripped before forwarding to Datadog. The team memory secret scanner has 30+ gitleaks-based regex patterns. A source code comment in sink.ts reads: "With Segment removed the two remaining sinks are fire-and-forget." They're actively simplifying.
What bothers me:
- MCP server URLs and ANTHROPIC_BASE_URL leak in plaintext. Internal infrastructure ends up in Anthropic's pipeline.
- DO_NOT_TRACK=1 is silently ignored. Either support the standard or say you don't.
- The remote flag for enhanced tracking can change what gets collected without asking. Make it env-var-only.
838 event types, 40 background LLM calls, and a remote flag — all in a tool that has full access to your source code. The tracking itself is well-designed: prompts redacted, file paths hashed, session analysis stays local, the kill switch works. But that's a lot of metadata about how you work, when you work, and how your team collaborates. I'd want to know about that. Now you do.
# Add to ~/.zshrc if you want to opt out:
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
Full technical report with all 838 event names and source code references: link.