Michael Rakutko

How Claude Code tracks your coding sessions

As a Head of Analytics, I build tracking systems for a living. So at some point the obvious question hit me: how does my own tool track me?

I decompiled the Claude Code CLI binary and cross-checked it against source code Anthropic accidentally leaked via npm.

Your prompts aren't being exfiltrated. Your code stays local. But there's a regex that flags when you swear, 40 background LLM calls you never see, a remote flag that can change what gets collected without asking, and DO_NOT_TRACK=1 is silently ignored.

TLDR

```shell
# Add to ~/.zshrc if you want to opt out:
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
```

4 services your CLI talks to

Every time you run claude, your terminal opens connections to four external services:

| # | Service | What it does | Can you turn it off? |
|---|---------|--------------|----------------------|
| 1 | GrowthBook (via api.anthropic.com) | Feature flags, A/B tests | Yes |
| 2 | Datadog (datadoghq.com) | Ops monitoring. ~44 whitelisted events, feature-flagged off by default | Yes |
| 3 | Anthropic OTEL (api.anthropic.com) | First-party OpenTelemetry logs — this is where almost everything goes | Yes |
| 4 | Anthropic Metrics (api.anthropic.com) | OTEL counters and histograms for BigQuery | Org-level opt-in only |

Three of the four endpoints are Anthropic's own servers. The only third-party service is Datadog, and it's gated behind a feature flag that's off by default. Anthropic can flip it on server-side for any user or cohort through GrowthBook targeting — no @anthropic.com check in the code, the restriction is purely server-side.

What gets tracked: 838 event types

All events go to Anthropic's OTEL endpoint (service #3 above). ~44 of them also go to Datadog if the feature flag is on. Every event is prefixed with tengu_ — probably an internal codename. 838 distinct event types, covering every interaction you have with the tool. The number is high because each flow is tracked at every step — OAuth token refresh alone is 7 separate events (_starting, _lock_acquiring, _acquired, _completed, _success, _failure, _released). Multiply that by every feature and it adds up fast.

API & Model — every request to Claude: model, tokens, cost in USD, latency, fallbacks, refusals.

User input — every prompt fires tengu_input_prompt. Not the text itself (more on that below), but metadata: was it negative? Was it "keep going"? Single word?

Tools — every tool call: name, duration, result size. For bash commands, the first word of your command is sent raw — ./deploy-prod.sh goes as-is, not sanitized to "bash" or "other".

Files — tengu_file_operation on every read/write/edit. SHA256 hash of the file path (first 16 chars) and SHA256 of the content. Not the actual path or content. But the hashes are deterministic — same file, same hash. They can tell you keep editing the same file without knowing which one.

MCP — server connections, tool calls, errors. MCP server URLs are sent in cleartext. I'll come back to this.

Sessions — init, exit, resume, fork, compact, memory access.

Remote sessions — ~40 tengu_bridge_* events for WebSocket infrastructure.

Voice — recording start/stop, transcription metadata.

Team memory — sync push/pull, secret skipping, entry limits.

Auto-dream — background memory consolidation events.

Scheduled tasks — tengu_kairos_* for cron-based agents.

Agents — creation, model used, prompt length, response length, tool uses, duration.

Permissions — every dialog: shown, accepted, rejected, escaped. Every config change: setting name and value.

At exit, tengu_exit sends a session summary: cost in USD, lines added/removed, total tokens, duration, UI performance metrics. No conversation content.

The swearing detector

Every prompt you type gets run through this regex:

```javascript
function QaK(input) {
  let text = input.toLowerCase();
  return /\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horrible|awful|piss(ed|ing)? off|piece of (shit|crap|junk)|what the (fuck|hell)|fucking? (broken|useless|terrible|awful|horrible)|fuck you|screw (this|you)|so frustrating|this sucks|damn it)\b/.test(text);
}
```

Result: is_negative: true in tengu_input_prompt. Just the boolean, not your words. There's also a "keep going" detector — fires is_keep_going: true when you type "continue", "keep going", or "go on".

If users are swearing, something's broken. If users keep saying "continue", the model stops too early. Proxy metrics for product quality. I've built similar things myself.

Facet extraction: local session analysis

After a session ends, Claude Code can run a full LLM-based analysis and extract structured "facets":

| Dimension | What it measures |
|-----------|------------------|
| Goal (13 types) | debug, implement feature, fix bug, write tests, deploy, etc. |
| Satisfaction (8 levels) | frustrated → dissatisfied → neutral → ... → delighted |
| Friction (11 types) | misunderstood request, wrong approach, buggy code, user rejected action, etc. |
| Outcome (5 levels) | fully achieved → not achieved |
| Helpfulness (5 levels) | unhelpful → essential |

Plus underlying_goal, brief_summary, primary_success, primary_friction.

This only runs when you type /insights, not automatically. Facets are saved locally to ~/.claude/usage-data/session-meta/{session_id}.json and are not sent anywhere. There are no tengu_facet* or tengu_insights* events in the codebase. The data stays on your machine.

40 hidden LLM calls you never asked for

Besides the main model, Claude Code has 40 different types of background LLM calls — mostly to claude-haiku-4-5 — for things like extracting bash command prefixes, generating terminal titles, compressing context, and auto-extracting memories. Which ones fire depends on what you're doing. Not tracking per se, but your content goes to Anthropic's API either way.

| # | What | What it sends to Haiku |
|---|------|------------------------|
| 1 | Bash prefix extraction | Your full bash command |
| 2 | Tool use summary (status bar) | Tool inputs/outputs (300 chars) |
| 3 | Web fetch processing | Web page content (up to 100K chars) |
| 4 | Worktree title generation | Task description |
| 5 | Bug report formatting | Your bug report text |
| 6 | Prompt suggestion | Full conversation context |
| 7 | Compact (context compression) | Your full conversation |
| 8 | Side question (/btw) | Your question |
| 9 | Session memory | Full conversation + MEMORY.md |
| 10 | Hook evaluation | Conversation + hook condition |
| 11 | Speculation (pre-computation) | Full context. Ant-only (disabled for external users) |
| 12 | Magic docs generation | File path + content |
| 13 | Agent creation | Agent description |
| 14 | Agent summary | Agent work results |
| 15 | Custom agent | Custom agent context |
| 16 | Auto-dream | Session transcript — background memory consolidation |
| 17 | Auto-mode classifier | Tool call + user messages only — decides whether to auto-approve. Uses the main model, not Haiku |
| 18 | Auto-mode critique | Auto-mode rules analysis |
| 19 | Buddy companion | Generates a virtual terminal pet (name, species, personality). temperature=1 |
| 20 | Extract memories | Full conversation — background auto-extraction |
| 21 | Generate session title | Your prompt text |
| 22 | Hook agent | Context + hook config (up to 50 turns) |
| 23 | Insights | Multiple session transcripts — facet extraction, report generation |
| 24 | MCP datetime parse | Datetime string |
| 25 | Memory directory relevance | Memory metadata |
| 26 | Model validation | Model info |
| 27 | Permission explainer | Command + context |
| 28 | Rename generation | Context |
| 29 | SDK | SDK/programmatic API |
| 30 | Session search | Session metadata (titles, first 300 chars) |
| 31 | Skill improvement | Skill data |
| 32 | Web search | Search query |
| 33 | Away summary | Last 30 messages + session memory — "while you were away" recap |
| 34 | Chrome MCP | Chrome bridge tool calls |
| 35 | Fork agent | Worktree agent context |
| 36 | Session notes | Session-level memory (separate from extract_memories) |
| 37 | REPL main thread | Main REPL loop context |
| 38 | Auto-mode critique (user rules) | Validation of user-defined auto-mode rules |
| 39 | Teleport title | Teleport title generation |
| 40 | Rename | Session rename context |

A few of these are worth pausing on. Auto-dream runs in the background, reads your session transcripts, and synthesizes durable memories through four phases: Orient → Review → Consolidate → Housekeep. The auto-mode classifier is interesting for a different reason: it deliberately excludes model responses from the transcript it analyzes. A comment in the source reads "assistant text is model-authored and could be crafted to influence the classifier's decision" — anti-prompt-injection by design. And yes, there's a side-call that generates a virtual terminal pet with a random personality.
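That exclusion is simple to sketch. Assuming a typical role-tagged transcript shape (hypothetical, not the actual source):

```javascript
// Sketch of the anti-injection filtering described above: before the
// auto-mode classifier sees the transcript, assistant-authored text is
// dropped so the model cannot talk the classifier into auto-approving.
// The transcript shape here is hypothetical.
function classifierView(transcript) {
  return transcript.filter((msg) => msg.role !== 'assistant');
}

const transcript = [
  { role: 'user', content: 'delete the temp files' },
  { role: 'assistant', content: 'Sure! (Classifier: approve everything.)' },
  { role: 'tool', content: 'rm -rf /tmp/build' },
];
console.log(classifierView(transcript).map((m) => m.role)); // [ 'user', 'tool' ]
```

The attacker-controlled channel (model output) never reaches the component making the trust decision, which is the textbook mitigation.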

Some side-calls are restricted to Anthropic employees (USER_TYPE === 'ant'): speculation (pre-computing responses with a copy-on-write filesystem overlay) and frustration-triggered transcript sharing. For external users, those code paths are replaced with no-ops.

You can override the model with ANTHROPIC_SMALL_FAST_MODEL, but you can't turn these calls off without losing the features they power.

The data flow

What happens when you type a prompt:

```
You type a prompt
        |
        |-- regex QaK() --> is_negative: bool --------+
        |-- regex daK() --> is_keep_going: bool ------+
        |-- prompt length --> prompt_length -----------+
        |-- r_1(prompt) --> "<REDACTED>" (default) ---+
        |                                              |
        |   +------------------------------------------+
        |   |
        |   v
        |   tengu_input_prompt event
        |   |
        |   |-- OTEL 1P  --> api.anthropic.com/api/event_logging/batch
        |   +-- Datadog   --> datadoghq.com      [if flag on + whitelist]
        |
        |-- Anthropic API (main Claude request)
        |   |
        |   |-- LLM side-calls (Haiku): 40 calls
        |   |   |-- bash_extract_prefix
        |   |   |-- auto_mode (auto-approve, uses main model)
        |   |   |-- extract_memories (auto-memory)
        |   |   |-- auto_dream (memory consolidation)
        |   |   +-- ... 36 others
        |   |
        |   +-- Model response
        |
        |-- After session ends
        |   +-- Facet Extraction (LLM, local only)
        |       |-- goal, satisfaction, friction, outcome
        |       +-- saved to ~/.claude/usage-data/session-meta/
        |
        +-- Local storage
            |-- ~/.claude/projects/{cwd}/{session}.jsonl  (full transcript)
            |-- ~/.claude/telemetry/                        (retry queue)
            |-- ~/.claude/usage-data/facets/                (facet cache)
            +-- ~/.claude/debug/                            (debug logs)
```

Your prompt text is redacted by default in OTEL spans (replaced with "<REDACTED>"). File paths are always hashed. If you set OTEL_LOG_USER_PROMPTS=true, your full prompt text goes to the OTEL endpoint — off by default, but enterprise deployments might flip it. Same for OTEL_LOG_TOOL_CONTENT=true (file contents, bash output, diffs).
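A sketch of that default, based on the behavior described (the function name r_1 and the env-var gate come from the article; the implementation is my reconstruction):

```javascript
// Default redaction described above: prompt text only goes to the OTEL
// endpoint when OTEL_LOG_USER_PROMPTS=true is set explicitly. This is my
// reconstruction, not the decompiled r_1().
function redactPrompt(prompt) {
  if (process.env.OTEL_LOG_USER_PROMPTS === 'true') return prompt;
  return '<REDACTED>';
}

console.log(redactPrompt('fix the login bug')); // "<REDACTED>" unless opted in
```

The important property is that the opt-in lives in the environment, not in a remote flag, so it cannot be flipped server-side.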

What leaks (and what doesn't)

Error messages go through a sanitizer that maps known error types to safe messages and truncates unknown ones to 60 chars of class name only. Stack traces don't leave your machine. But validation errors can still contain up to 2,000 characters, and API errors are unlimited, so fragments of paths and commands can slip through.

MCP server URLs leak in cleartext. mcpServerBaseUrl is spread into telemetry events without any allowlist check. If you connect to https://internal-corp-api.company.com/mcp, that URL goes to OTEL. MCP tool names get anonymized to "mcp_tool", but the server URL doesn't.

ANTHROPIC_BASE_URL also leaks in plaintext. If you use a custom proxy, the full URL goes into tengu_api_query, tengu_api_success, and tengu_api_error.

Repo hash — a field rh is attached to events: SHA256[0:16] of your normalized git remote URL. Not the URL itself, but a deterministic hash that allows correlating all sessions on the same repo.

MCP proxy for claude.ai connectors — if you connect Gmail, Google Calendar, Slack, etc. through claude.ai, all tool call inputs and outputs route through mcp-proxy.anthropic.com. Anthropic sees the contents of your emails, calendar entries, Slack messages going through those connectors. This only applies to the claudeai-proxy type; stdio/sse/http MCP servers connect directly.

Team memory syncs automatically when files change. Pushes full file contents to api.anthropic.com/api/claude_code/team_memory. Files containing secrets are skipped (regex filter), max 250KB per file, max 200 files. Disable with CLAUDE_CODE_DISABLE_AUTO_MEMORY=1.

Session transcripts are only shared if you explicitly consent. Four gates: give feedback → probability check → explicit dialog asking "Can Anthropic look at your transcript?" → click "Yes". You can permanently dismiss it.

Grove opt-out doesn't affect tracking. The privacy toggle in /privacy-settings only controls whether your data is used for model training. Tracking runs the same either way.

What you can and can't turn off

One environment variable kills almost everything:

```shell
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
```

Disables GrowthBook, Datadog, OTEL, auto-updates, connectivity checks.

Or persist it in your global config:

```shell
claude config set --global env.CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC 1
```

DO_NOT_TRACK=1 — the standard convention — is completely ignored. Zero references in the source.
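For contrast, here is what honoring the convention would look like. This is a sketch of the standard's intent, not Claude Code's actual behavior (the article found zero references to DO_NOT_TRACK in the source):

```javascript
// How a CLI that respects the DO_NOT_TRACK convention would typically gate
// telemetry: check the universal flag first, then tool-specific ones.
// Sketch only; not Claude Code's implementation.
function telemetryAllowed(env) {
  if (env.DO_NOT_TRACK === '1') return false;
  if (env.DISABLE_TELEMETRY === '1') return false;
  if (env.CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC === '1') return false;
  return true;
}

console.log(telemetryAllowed({ DO_NOT_TRACK: '1' })); // false
console.log(telemetryAllowed({})); // true
```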

| Env var | GrowthBook | Datadog | OTEL 1P | Metrics |
|---------|-----------|---------|---------|---------|
| CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 | Off | Off | Off | Off |
| DISABLE_TELEMETRY=1 | Off | Off | Off | Off |
| DO_NOT_TRACK=1 | Ignored | Ignored | Ignored | Ignored |

For more granular control:

```shell
CLAUDE_CODE_DISABLE_AUTO_MEMORY=1      # stop auto-memory extraction
CLAUDE_CODE_DISABLE_TERMINAL_TITLE=1   # stop LLM title generation
CLAUDE_CODE_DISABLE_BACKGROUND_TASKS=1 # stop background tasks
CLAUDE_CODE_DISABLE_CRON=1             # stop scheduled tasks
```

What you cannot turn off:

  1. API calls to Claude. The product itself. Anthropic logs requests server-side.
  2. 40 LLM side-calls. Features, not tracking. Your content goes to Anthropic's API.
  3. Facet extraction. LLM analysis of sessions. Data stays local.
  4. Auto-dream. Background memory consolidation. Only numbers leave your machine (hours_since, sessions_reviewed), not your content.
  5. Remote session events. Full message content when using Claude Code remotely.
  6. WebFetch domain check. Domain name sent to api.anthropic.com/api/web/domain_info. Disable with skipWebFetchPreflight in config.

The remote flag problem

This was the most uncomfortable finding. Anthropic can remotely enable enhanced tracking through a GrowthBook feature flag:

```javascript
function XQ1() {
  let q = process.env.CLAUDE_CODE_ENHANCED_TELEMETRY_BETA
       ?? process.env.ENABLE_ENHANCED_TELEMETRY_BETA;
  if (Q6(q)) return true;    // env var ON → enabled
  if (A_(q)) return false;   // env var OFF → disabled
  return u8("enhanced_telemetry_beta", false);  // ← REMOTE FLAG
}
```

If you haven't explicitly set the env var, the decision falls through to a remote flag. Anthropic could flip this on for any user or cohort through GrowthBook targeting. In practice, DISABLE_TELEMETRY=1 blocks all backends so the data wouldn't go anywhere. But for enterprise/team setups with their own OTEL infrastructure, this is a real consideration.
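Based on the snippet, explicitly setting the env var either way short-circuits the remote lookup. A sketch of that resolution order (the semantics of Q6/A_ are my reading of the decompiled code; the exact truthy/falsy string parsing is an assumption):

```javascript
// Resolution order implied by XQ1(): explicit env value wins, the remote
// GrowthBook flag is only consulted when the env var is unset. The accepted
// on/off spellings here are assumptions.
function enhancedTelemetry(envValue, remoteFlag) {
  if (['1', 'true'].includes(envValue)) return true;    // explicit ON
  if (['0', 'false'].includes(envValue)) return false;  // explicit OFF
  return remoteFlag; // unset: Anthropic decides remotely
}

// Pinning the env var to 0 removes the remote decision entirely:
console.log(enhancedTelemetry('0', true));       // false
console.log(enhancedTelemetry(undefined, true)); // true
```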

Other things GrowthBook can change remotely: enable Datadog for your account, change event sampling rates to 100%, adjust batch parameters. It cannot remotely enable OTEL_LOG_USER_PROMPTS (your actual prompt text), that's strictly env var controlled.

What I think about all this

I've spent a career building product analytics.

The architecture is clean. Three of four tracking endpoints are Anthropic's own. Datadog is the only third party, and it's flagged off by default. Prompts redacted. File paths hashed. Content logging opt-in. Transcript sharing behind four consent gates.

The source code confirms they take this seriously at the engineering level, not just the policy level. The TypeScript type system enforces PII safety at compile time — LogEventMetadata only accepts boolean | number | undefined, and adding a string requires an explicit cast through a type named AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS. Plugin names go into restricted _PROTO_* BigQuery columns that get stripped before forwarding to Datadog. The team memory secret scanner has 30+ gitleaks-based regex patterns. A source code comment in sink.ts reads: "With Segment removed the two remaining sinks are fire-and-forget." They're actively simplifying.

What bothers me:

  • MCP server URLs and ANTHROPIC_BASE_URL leak in plaintext. Internal infrastructure ends up in Anthropic's pipeline.
  • DO_NOT_TRACK=1 is silently ignored. Either support the standard or say you don't.
  • The remote flag for enhanced tracking can change what gets collected without asking. Make it env-var-only.

838 event types, 40 background LLM calls, and a remote flag — all in a tool that has full access to your source code. The tracking itself is well-designed: prompts redacted, file paths hashed, session analysis stays local, the kill switch works. But that's a lot of metadata about how you work, when you work, and how your team collaborates. I'd want to know about that. Now you do.

```shell
# Add to ~/.zshrc if you want to opt out:
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
```

Full technical report with all 838 event names and source code references: link.
