DEV Community

Hector Flores
Originally published at htek.dev

GitHub Copilot CLI Extensions: The Most Powerful Feature Nobody's Talking About

There's No Documentation on This

I'm going to say something that sounds absurd: GitHub Copilot CLI has a full extension system that lets you create custom tools, intercept every agent action, inject context, block dangerous operations, and auto-retry errors — and there's essentially zero public documentation about it.

I'm not talking about MCP servers. I'm not talking about Copilot Extensions (the GitHub App kind). I'm talking about .github/extensions/ — a local extension system baked into the CLI agent harness that runs as a separate Node.js process, communicates over JSON-RPC, and gives you programmatic control over the entire agent lifecycle.

You can literally tell the CLI "create me a tool that does X" and it will scaffold the extension file, hot-reload it, and the tool is available in the same session. No restart. No config. No marketplace. Just code.

I pieced this together from the Copilot SDK itself: the .d.ts type definitions, the internal docs, and a lot of hands-on extension building. Here's everything I found.

How CLI Extensions Actually Work

The architecture is elegant. Your extension runs as a separate child process that talks to the CLI over JSON-RPC via stdio:

┌─────────────────────┐      JSON-RPC / stdio       ┌──────────────────────┐
│   Copilot CLI        │ ◄──────────────────────────► │  Extension Process   │
│   (parent process)   │   tool calls, events, hooks  │  (forked child)      │
│                      │                               │                      │
│  • Discovers exts    │                               │  • Registers tools   │
│  • Forks processes   │                               │  • Registers hooks   │
│  • Routes tool calls │                               │  • Listens to events │
│  • Manages lifecycle │                               │  • Uses SDK APIs     │
└─────────────────────┘                               └──────────────────────┘

Here's the lifecycle:

  1. Discovery — The CLI scans .github/extensions/ (project-scoped) and ~/.copilot/extensions/ (user-scoped) for subdirectories containing extension.mjs.
  2. Launch — Each extension is forked as a child process. The @github/copilot-sdk package is automatically resolved — you never install it.
  3. Connection — The extension calls joinSession(), which establishes the JSON-RPC link and attaches to the user's current session.
  4. Registration — Tools and hooks declared in the session options are registered with the CLI and become available to the agent immediately.
  5. Lifecycle — Extensions are reloaded on /clear and stopped on CLI exit (SIGTERM, then SIGKILL after 5 seconds).

Project extensions in .github/extensions/ shadow user extensions on name collision. Every extension lives in its own subdirectory, and the entry point must be named extension.mjs — only ES modules are supported.
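
To make the shadowing rule concrete, here is a small model of it. This is not the CLI's implementation; `resolveExtensions` and the extension names are hypothetical, and the sketch only mirrors the rule described above: user-scoped extensions load unless a project-scoped extension has the same name.

```javascript
// Illustrative model of the discovery rule described above; not the CLI's actual code.
// `resolveExtensions` is a hypothetical helper name.
function resolveExtensions(projectNames, userNames) {
  const resolved = new Map();
  // User-scoped extensions are registered first...
  for (const name of userNames) {
    resolved.set(name, `~/.copilot/extensions/${name}/extension.mjs`);
  }
  // ...then project-scoped ones overwrite any entry with the same name.
  for (const name of projectNames) {
    resolved.set(name, `.github/extensions/${name}/extension.mjs`);
  }
  return resolved;
}

// "lint" exists in both scopes, so the project copy is the one that loads.
const loaded = resolveExtensions(["lint", "deploy"], ["lint", "notes"]);
```

In this model `loaded` ends up with three entries, and the `lint` entry points at the project directory.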

The Minimal Extension

Every extension starts the same way:


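// Import paths are my best reconstruction: joinSession lives in the
// @github/copilot-sdk/extension entry point; approveAll is assumed to come from the SDK root.
import { approveAll } from "@github/copilot-sdk";
import { joinSession } from "@github/copilot-sdk/extension";

const session = await joinSession({
  onPermissionRequest: approveAll,
  tools: [],
  hooks: {},
});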

Three lines of meaningful code, and you have a running extension. The session object that comes back is the entire API surface — tools, hooks, events, messaging, logging, and RPC access to the CLI internals.

Why This Isn't "Just Hooks"

If you've used Claude Code hooks, you might think this is the same concept. It's not. Claude Code hooks are shell commands defined in a JSON settings file. They fire at lifecycle points and execute commands. That's useful, but limited.

Copilot CLI extensions are full Node.js processes with the complete SDK available. Here's what that difference means in practice:

| Capability | Claude Code Hooks | Copilot CLI Extensions |
| --- | --- | --- |
| Runtime | Shell commands | Full Node.js process |
| State | Stateless between hooks | Persistent in-memory state |
| Tools | Cannot register new tools | Register unlimited custom tools |
| Context injection | stdout piped back (limited) | additionalContext injected directly into the conversation |
| Permission control | Exit codes (0/1) | allow, deny, or ask with structured reasons |
| Argument modification | Cannot modify tool args | modifiedArgs replaces args before execution |
| Result modification | Cannot modify tool output | modifiedResult replaces output after execution |
| Prompt rewriting | Limited to stdin/stdout | modifiedPrompt replaces user input |
| Event streaming | No event access | Subscribe to all 10+ session event types |
| Programmatic messaging | Cannot send messages | session.send() and session.sendAndWait() |
| Error recovery | No error hooks | onErrorOccurred with retry/skip/abort control |
| Hot reload | Requires restart | /clear or extensions_reload (mid-session) |

The fundamental difference: Claude Code hooks are config-driven shell scripts. Copilot CLI extensions are programmable processes that participate in the agent loop. You're not scripting around the agent — you're extending the agent harness itself.

The Six Hooks That Control Everything

Extensions register hooks that intercept the agent at every lifecycle point. Each hook receives structured input and returns structured output — no shell exit codes, no stdout parsing.

onSessionStart — Set the Rules

Fires when a session begins. Inject baseline context the agent sees on every interaction:

hooks: {
  onSessionStart: async (input) => {
    // input.source: "startup" | "resume" | "new"
    return {
      additionalContext:
        "Security extension active. Never hardcode secrets. " +
        "Use environment variables for all credentials.",
    };
  },
}

onUserPromptSubmitted — Rewrite the Prompt

Fires before the agent sees the user's message. You can rewrite it, augment it, or inject hidden context:

hooks: {
  onUserPromptSubmitted: async (input) => {
    return {
      additionalContext:
        "Always write tests alongside source changes. " +
        "Follow our team's 4-space indentation standard.",
    };
  },
}
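
The example above only adds context. For the rewrite case, here is a minimal sketch that returns modifiedPrompt instead. It assumes the hook input carries the user's text as `input.prompt`, and the shorthand table is made up purely for illustration.

```javascript
// Hypothetical shorthand table; the keys and expansions are illustrative only.
const SHORTHANDS = new Map([
  ["!test", "Run the test suite and summarize any failures."],
  ["!lint", "Run the linter on the files changed in this session."],
]);

// Assumes the hook input exposes the user's text as `input.prompt`.
async function onUserPromptSubmitted(input) {
  const prompt = String(input.prompt ?? "").trim();
  const expansion = SHORTHANDS.get(prompt);
  if (expansion !== undefined) {
    // Replace what the agent sees with the expanded instruction.
    return { modifiedPrompt: expansion };
  }
  // Returning nothing leaves the prompt untouched.
  return undefined;
}
```

You would pass this as `hooks: { onUserPromptSubmitted }` in the joinSession options. A Map is used instead of a plain object so that prompts like "constructor" can't accidentally match an inherited property.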

onPreToolUse — Block or Modify Tool Calls

This is the most powerful hook. It fires before every tool execution with the tool name and arguments, and lets you deny, allow, or modify the call:

hooks: {
  onPreToolUse: async (input) => {
    if (input.toolName === "powershell") {
      const cmd = String(input.toolArgs?.command || "");
      if (/rm\s+-rf\s+\//i.test(cmd)) {
        return {
          permissionDecision: "deny",
          permissionDecisionReason:
            "Destructive commands are blocked by policy.",
        };
      }
    }
  },
}

You can also modify arguments before they reach the tool:

onPreToolUse: async (input) => {
  if (input.toolName === "powershell") {
    return {
      modifiedArgs: {
        ...input.toolArgs,
        command: `${input.toolArgs.command} 2>&1`,
      },
    };
  }
},

onPostToolUse — React After Execution

Fires after every tool completes. Run linters, open files in your editor, inject feedback:

hooks: {
  onPostToolUse: async (input) => {
    if (input.toolName === "edit" && input.toolArgs?.path?.endsWith(".ts")) {
      const result = await runLinter(input.toolArgs.path);
      if (result) {
        return {
          additionalContext: `Lint issues found:\n${result}\nFix before proceeding.`,
        };
      }
    }
  },
}

onErrorOccurred — Automatic Recovery

This is the one that blows my mind. You can tell the agent to automatically retry on failure:

hooks: {
  onErrorOccurred: async (input) => {
    if (input.recoverable && input.errorContext === "tool_execution") {
      return { errorHandling: "retry", retryCount: 3 };
    }
    return {
      errorHandling: "abort",
      userNotification: `Fatal error: ${input.error}`,
    };
  },
}

People have demoed agents that keep running tests, detect failures, fix them, and re-run — all without human intervention. The onErrorOccurred hook is what makes that possible. The agent doesn't stop on the first error — the extension decides whether to retry, skip, or abort.
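
Unbounded retries can loop forever, so in practice you probably want a budget. Here is a sketch of one kept in the extension's memory. `createErrorHandler` and `maxRetries` are hypothetical names; the input and return fields mirror the onErrorOccurred example above, and I'm assuming each "retry" response triggers one more attempt.

```javascript
// Sketch of a session-wide retry budget. The counter survives between hook
// calls because the extension is a long-lived process, not a one-shot script.
function createErrorHandler(maxRetries) {
  let retriesUsed = 0;
  return async function onErrorOccurred(input) {
    const retryable = Boolean(input.recoverable) && input.errorContext === "tool_execution";
    if (retryable && retriesUsed < maxRetries) {
      retriesUsed += 1;
      return { errorHandling: "retry", retryCount: 1 };
    }
    // Either the error is not recoverable or the budget is spent.
    return {
      errorHandling: "abort",
      userNotification: `Giving up after ${retriesUsed} retries: ${input.error}`,
    };
  };
}
```

Register it with `hooks: { onErrorOccurred: createErrorHandler(3) }`. Remember that this state resets whenever the extension reloads.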

onSessionEnd — Clean Up

Fires when the session ends for any reason. Generate summaries, log metrics, clean up temp files:

hooks: {
  onSessionEnd: async (input) => {
    // input.reason: "complete" | "error" | "abort" | "timeout" | "user_exit"
    return {
      sessionSummary: "Completed 3 file edits with full test coverage.",
      cleanupActions: ["Removed temp build artifacts"],
    };
  },
}

Custom Tools: Give the Agent New Abilities

Beyond hooks, extensions can register entirely new tools that the agent can call. This is where it gets wild — you're literally extending the agent's capabilities with a function definition.

Here's a real extension I use that creates GitHub PRs with proper UTF-8 encoding on Windows (avoiding PowerShell's backtick-mangling issues):


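import { execFile } from "node:child_process";
import { randomBytes } from "node:crypto";
import { unlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { promisify } from "node:util";
// SDK import paths are my best reconstruction (see the minimal extension above).
import { approveAll } from "@github/copilot-sdk";
import { joinSession } from "@github/copilot-sdk/extension";

const execFileAsync = promisify(execFile);

// Assumed helper, not part of the SDK: runs the GitHub CLI and resolves with its stdout.
async function gh(args) {
  const { stdout } = await execFileAsync("gh", args);
  return stdout.trim();
}

// Write the PR body to a uniquely named temp file so the shell never has to quote it.
function tempFile(content) {
  const name = join(tmpdir(), `gh-pr-${randomBytes(6).toString("hex")}.md`);
  writeFileSync(name, content, "utf-8");
  return name;
}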

const session = await joinSession({
  onPermissionRequest: approveAll,
  tools: [
    {
      name: "create_pr",
      description: "Create a GitHub PR with proper UTF-8 encoding.",
      parameters: {
        type: "object",
        properties: {
          title: { type: "string", description: "PR title" },
          body: { type: "string", description: "PR body in Markdown" },
        },
        required: ["title", "body"],
      },
      handler: async (args, invocation) => {
        // invocation.sessionId  — current session ID
        // invocation.toolCallId — unique ID for this tool call
        // invocation.toolName   — "create_pr"
        const bodyFile = tempFile(args.body);
        try {
          return await gh(["pr", "create", "--title", args.title,
            "--body-file", bodyFile]);
        } finally {
          try { unlinkSync(bodyFile); } catch {}
        }
      },
    },
  ],
});

The agent now has a create_pr tool. It shows up in the tool list. The agent decides when to use it. The JSON Schema parameters tell the LLM exactly what arguments are expected. Notice the handler receives a second invocation argument with metadata about the current call — the session ID, a unique tool call ID, and the tool name. This is invaluable for logging, tracing, and correlating tool executions across a session.

skipPermission — Trusted Tools

By default, every custom tool triggers a user permission prompt before executing. For read-only or low-risk tools, that's unnecessary friction. The skipPermission flag (v1.0.5+) lets you mark a tool as trusted:

{
  name: "read_config",
  description: "Read project configuration files",
  skipPermission: true,
  parameters: {
    type: "object",
    properties: {
      configPath: { type: "string", description: "Path to config file" },
    },
    required: ["configPath"],
  },
  handler: async (args) => {
    // readFileSync comes from "node:fs"; import it at the top of extension.mjs.
    return readFileSync(args.configPath, "utf-8");
  },
}

No user prompt. The tool runs directly. Use this for tools that only read data or perform safe operations.

Return Types

Tool handlers can return values in two ways:

  • String — treated as a successful text result. The agent sees it as tool output.
  • Structured object — gives you control over how the agent interprets the result:
handler: async (args) => {
  const result = await runSecurityScan(args.target);
  if (result.vulnerabilities.length > 0) {
    return {
      textResultForLlm: `Found ${result.vulnerabilities.length} vulnerabilities:\n${result.details}`,
      resultType: "failure",
    };
  }
  return {
    textResultForLlm: "Security scan passed — no vulnerabilities found.",
    resultType: "success",
  };
}

The resultType field accepts "success", "failure", "rejected", or "denied". This tells the agent whether the tool completed normally or hit an issue, which influences how it plans its next action.
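
One pattern that follows from this: wrap your handlers so a thrown exception becomes a structured failure instead of an opaque crash. The sketch below is one way to do it. `withFailureResult` is a hypothetical helper name, and I'm assuming the agent copes better with a "failure" result than with an exception escaping the handler.

```javascript
// Hypothetical wrapper: converts thrown errors into the structured failure
// shape described above, and passes normal results through unchanged.
function withFailureResult(handler) {
  return async (args, invocation) => {
    try {
      // Strings and already-structured objects are returned as-is.
      return await handler(args, invocation);
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      return { textResultForLlm: `Tool failed: ${message}`, resultType: "failure" };
    }
  };
}
```

Usage is a one-word change in the tool definition: `handler: withFailureResult(async (args) => { ... })`.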

You can build tools for anything: API calls, database queries, deployment triggers, clipboard operations, file watchers, CI status checks. If Node.js can do it, your extension can expose it as a tool.

The Session API: Events and Messaging

The session object returned by joinSession() isn't just for registration — it's a live API into the session.

Log to the CLI timeline:

await session.log("Extension loaded and ready");
await session.log("Rate limit approaching", { level: "warning" });

Subscribe to events:

session.on("tool.execution_complete", (event) => {
  // React when any tool finishes
  // event.data.toolName, event.data.success, event.data.result
});

session.on("assistant.message", (event) => {
  // Capture the agent's responses
  // event.data.content, event.data.messageId
});

Send messages programmatically:

// Fire and forget
await session.send({ prompt: "Run the test suite now." });

// Send and wait for response
const response = await session.sendAndWait(
  { prompt: "What files did you change?" }
);

This is what enables self-healing workflows. Your extension can watch for test failures, send the agent a message to fix them, wait for the response, and verify the fix — all programmatically. The most powerful pattern I've found is the REPL loop: listen for session.idle, run your validation (tests, lint, build), and if it fails, session.send() the failures back to the agent. It keeps looping until everything passes or hits a max iteration limit. I have a full working example in the cookbook.
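
Here is a stripped-down sketch of that loop. `attachValidationLoop`, `validate`, and `maxIterations` are hypothetical names; `session.on` and `session.send` are used the way this article describes them. The `validate` callback is assumed to resolve to an empty string when everything passes and to a failure report otherwise.

```javascript
// Sketch of the validate-and-resend loop. It takes the session as a parameter,
// which also makes it easy to exercise with a fake session in tests.
function attachValidationLoop(session, validate, maxIterations = 5) {
  let iterations = 0;
  let running = false;
  return session.on("session.idle", async () => {
    // Stop once the cap is reached, and never run two checks at the same time.
    if (running || iterations >= maxIterations) return;
    running = true;
    try {
      const failures = await validate();
      if (failures) {
        iterations += 1;
        await session.send({ prompt: `Validation failed, please fix:\n${failures}` });
      }
    } finally {
      running = false;
    }
  });
}
```

The iteration cap matters: without it, an agent that can't fix the failure keeps getting the same message sent back on every idle event.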

The Hot Reload Workflow

Here's the workflow that makes this feel like magic:

  1. Tell the CLI to create an extension: "Create me a tool that checks if my Docker containers are healthy."
  2. The CLI scaffolds it: Creates .github/extensions/docker-health/extension.mjs with the tool definition.
  3. Hot reload: The CLI calls extensions_reload — the new tool is available instantly.
  4. Use it: The agent now has a check_docker_health tool and will call it when relevant.

No npm install. No restart. No configuration file. You went from "I wish the agent could check Docker" to "the agent checks Docker" in one conversational turn.

The scaffolding command is extensions_manage({ operation: "scaffold", name: "my-extension" }). For user-scoped extensions that persist across all repos, add location: "user". After editing, call extensions_reload() and verify with extensions_manage({ operation: "list" }).

What You Should Build

After spending weeks with this system, here are the extensions I think every team should consider:

  1. Test enforcer — Track which source files are modified. Block git commit if corresponding test files weren't touched. The agent learns to write tests first.
  2. Lint on edit — Run ESLint, Ruff, or your project's linter after every file edit. Inject results as context so the agent self-corrects immediately.
  3. Security shield — Detect hardcoded secrets in file writes using regex patterns. Block rm -rf /, force pushes to main, and DROP DATABASE. Inject security context at session start.
  4. Architecture enforcer — Validate import boundaries on every file write. If you have layer rules or module boundaries, enforce them before code hits CI.
  5. Auto-opener — Use onPostToolUse to open every file the agent creates or edits in your IDE. Stay in sync without switching windows.
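
To give a feel for how little code these take, here is a sketch of the test enforcer. The tool names ("edit", "powershell") and the `toolArgs.path` / `toolArgs.command` fields follow the earlier examples; the `*.test.*` naming rule is an assumption about your project, so adjust it to your layout.

```javascript
// In-memory state: which source files and which test files were edited.
const touchedSources = new Set();
const touchedTests = new Set();

const isTestFile = (path) => /\.test\.[cm]?[jt]sx?$/.test(path);
// Map both src/foo.ts and src/foo.test.ts to "src/foo" so they can be compared.
const stem = (path) => path.replace(/(\.test)?\.[cm]?[jt]sx?$/, "");

// Record every edited file after the edit tool runs.
async function onPostToolUse(input) {
  const path = input.toolArgs?.path;
  if (input.toolName === "edit" && typeof path === "string") {
    (isTestFile(path) ? touchedTests : touchedSources).add(stem(path));
  }
}

// Deny `git commit` while any edited source file has no edited test.
async function onPreToolUse(input) {
  const cmd = String(input.toolArgs?.command ?? "");
  if (input.toolName !== "powershell" || !/\bgit\s+commit\b/.test(cmd)) return undefined;
  const untested = [...touchedSources].filter((s) => !touchedTests.has(s));
  if (untested.length === 0) return undefined;
  return {
    permissionDecision: "deny",
    permissionDecisionReason: `Add or update tests for: ${untested.join(", ")}`,
  };
}
```

Both functions go into the same `hooks` object, and the denial reason tells the agent exactly which files still need tests.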

The Gotchas

A few things I learned the hard way:

  • stdout is reserved for JSON-RPC. Use session.log() instead of console.log(). Writing to stdout corrupts the protocol and crashes the extension.
  • Tool name collisions are fatal. If two extensions register the same tool name, the second one fails to load entirely. Tool names must be globally unique across all extensions — plan your naming convention.
  • Don't call session.send() synchronously from onUserPromptSubmitted. You'll create an infinite loop. Use setTimeout(() => session.send(...), 0).
  • State resets on /clear. Extensions are reloaded when the session clears. Any in-memory state (tracked files, counters) is lost.
  • Only .mjs is supported. No TypeScript yet. Write plain JavaScript with ES module syntax.
  • Hook overwrite bug. If multiple extensions register hooks, only the last-loaded extension's hooks fire. The others are silently overwritten. Workaround: designate one extension as your "hooks extension" and have the rest use tools and session.on() event listeners instead. See #2076 for the tracking issue.
  • onSessionStart additionalContext may be silently ignored. In CLI versions before v1.0.11, the additionalContext returned from onSessionStart was fire-and-forget — the hook completed but the context was never injected. This was fixed in v1.0.11. If your session start context isn't reaching the agent, check your CLI version.
  • Tool name collisions across extensions are silent until load. You won't get a warning until the second extension tries to register. Use a naming prefix per extension (e.g., myext_tool_name) to avoid collisions.
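
For the naming-prefix advice, a tiny helper keeps it consistent. `prefixTools` and the `dockerext` prefix are hypothetical; the tool object shape matches the create_pr example earlier.

```javascript
// Hypothetical helper: namespaces every tool name with the extension's prefix.
function prefixTools(prefix, tools) {
  return tools.map((tool) => ({ ...tool, name: `${prefix}_${tool.name}` }));
}

const tools = prefixTools("dockerext", [
  {
    name: "check_health",
    description: "Check container health",
    parameters: { type: "object", properties: {} },
    handler: async () => "all containers healthy",
  },
]);
```

Pass the resulting array as `tools` in the joinSession options; the agent then sees `dockerext_check_health`.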

Session Events: Your Extension's Eyes and Ears

The existing hooks — onPreToolUse, onPostToolUse, and friends — intercept the agent at specific lifecycle points. But hooks are about control: you block, modify, or inject. Session events are about observation: you subscribe to a stream of everything happening in the session and react however you want.

The session.on() API gives you access to 10+ event types. Here's the complete catalog:

| Event Type | Key Data Fields | When It Fires |
| --- | --- | --- |
| assistant.message | content, messageId, toolRequests | Agent produces a response |
| assistant.turn_start | turnId | Agent begins a new turn |
| assistant.streaming_delta | totalResponseSizeBytes | Each streaming chunk (ephemeral) |
| tool.execution_start | toolCallId, toolName, arguments | Tool begins executing |
| tool.execution_complete | toolCallId, toolName, success, result, error | Tool finishes |
| user.message | content, attachments, source | User sends a message |
| session.idle | backgroundTasks | Session waiting for input |
| session.error | errorType, message, stack | Unhandled error occurs |
| session.shutdown | shutdownType, totalPremiumRequests, codeChanges | Session ending |
| permission.requested | requestId, permissionRequest.kind | Permission prompt shown |

Here's how you subscribe:

session.on("assistant.message", (event) => {
  console.error(`Agent said: ${event.data.content.substring(0, 100)}...`);
  if (event.data.toolRequests?.length > 0) {
    console.error(`Requesting tools: ${event.data.toolRequests.map(t => t.name).join(", ")}`);
  }
});

session.on("tool.execution_start", (event) => {
  console.error(`[TOOL START] ${event.data.toolName} (${event.data.toolCallId})`);
});

session.on("tool.execution_complete", (event) => {
  const status = event.data.success ? "OK" : "FAILED";
  console.error(`[TOOL ${status}] ${event.data.toolName}`);
  if (event.data.error) {
    console.error(`  Error: ${event.data.error}`);
  }
});

session.on("user.message", (event) => {
  console.error(`User: ${event.data.content}`);
  if (event.data.attachments?.length) {
    console.error(`  Attachments: ${event.data.attachments.length}`);
  }
});

session.on("session.shutdown", (event) => {
  console.error(`Session ending (${event.data.shutdownType}). Premium requests: ${event.data.totalPremiumRequests}`);
});

Every session.on() call returns an unsubscribe function, so you can clean up listeners when you no longer need them:

const unsub = session.on("tool.execution_complete", (event) => {
  if (event.data.toolName === "powershell") {
    recordShellExecution(event.data);
  }
});

// Later, when you no longer need this listener:
unsub();

And if you want to see everything, pass a handler without an event type to listen to all events:

session.on((event) => {
  console.error(`[${event.type}] ${JSON.stringify(event.data).substring(0, 200)}`);
});

This wildcard subscription is useful for building session recorders, audit logs, or debugging extensions during development. I use it heavily when building new extensions — it's the fastest way to understand what the CLI is doing at every step.

The key insight: hooks are for control, events are for observation. Use onPreToolUse to block a dangerous command. Use session.on("tool.execution_complete") to log every command that ran. They complement each other, and the best extensions use both.
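
As an observation-only example, here is a sketch that measures how long each tool call takes by pairing start and complete events on toolCallId. `trackToolDurations` is a hypothetical helper; the event names and data fields come from the catalog above, and the clock is injectable so the logic can be tested without a live session.

```javascript
// Sketch: pair tool.execution_start with tool.execution_complete via toolCallId.
function trackToolDurations(session, onMeasured, now = Date.now) {
  const startedAt = new Map();
  const offStart = session.on("tool.execution_start", (event) => {
    startedAt.set(event.data.toolCallId, now());
  });
  const offComplete = session.on("tool.execution_complete", (event) => {
    const start = startedAt.get(event.data.toolCallId);
    if (start === undefined) return; // completion without a recorded start
    startedAt.delete(event.data.toolCallId);
    onMeasured({
      toolName: event.data.toolName,
      success: event.data.success,
      ms: now() - start,
    });
  });
  // One cleanup function that removes both listeners.
  return () => {
    offStart();
    offComplete();
  };
}
```

In a real extension the `onMeasured` callback would typically call session.log() rather than console.log(), since stdout belongs to the JSON-RPC channel.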

UI Elicitation: Structured Dialogs

Sometimes an extension needs structured input from the user — not a free-text chat message, but a specific set of fields with types, validation, and defaults. UI elicitation lets you present a structured form via session.rpc.ui.elicitation():

const result = await session.rpc.ui.elicitation({
  message: "Deploy to production? Please confirm the details below.",
  requestedSchema: {
    type: "object",
    properties: {
      environment: {
        type: "string",
        title: "Target Environment",
        enum: ["staging", "production"],
        default: "staging",
      },
      changeDescription: {
        type: "string",
        title: "Change description for the deploy log",
        description: "Briefly describe what's being deployed",
      },
    },
  },
});

if (result.action === "accept" && result.content?.environment === "production") {
  await session.send({ prompt: "Run the full test suite. If all tests pass, proceed with deployment." });
  await triggerDeployment(result.content);
  await session.log(`Deployed to ${result.content.environment}: ${result.content.changeDescription}`);
} else if (result.action === "decline" || result.action === "cancel") {
  await session.log("Deployment cancelled by user.");
}

The result.action is "accept", "decline", or "cancel". When accepted, result.content contains the form values keyed by field name. The requestedSchema uses standard JSON Schema — the same format the agent's ask_user tool uses — so if you've defined form fields there, the pattern is identical.

This is a massive improvement over the old pattern of parsing free-text answers. Instead of the agent asking "which environment?" and hoping the user types something parseable, you present a proper form with constrained inputs. I use this in my deployment extensions — the structured input eliminates the "I accidentally deployed to prod because the agent misread my message" failure mode.
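
One habit that has served me well is keeping the post-dialog decision in a pure function, so it can be tested without a running CLI. A minimal sketch follows. `decideDeployment` is a hypothetical helper, and the rule that production deploys require a description is my own assumption, not something the SDK enforces.

```javascript
// Hypothetical helper: turns an elicitation result into a deploy decision.
// The `action` and `content` fields follow the elicitation example above.
function decideDeployment(result) {
  if (result.action !== "accept") {
    return { deploy: false, reason: `User chose to ${result.action}.` };
  }
  const { environment, changeDescription } = result.content ?? {};
  if (environment !== "staging" && environment !== "production") {
    return { deploy: false, reason: "No valid target environment was selected." };
  }
  // Assumed team policy: production deploys must be described.
  if (environment === "production" && !String(changeDescription ?? "").trim()) {
    return { deploy: false, reason: "Production deploys need a change description." };
  }
  return { deploy: true, environment, changeDescription: changeDescription ?? "" };
}
```

The extension then calls `decideDeployment(result)` right after `session.rpc.ui.elicitation(...)` resolves and acts on the returned object.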

Permission and Input Handlers

The approveAll import is convenient for development, but production extensions need granular permission control. The onPermissionRequest callback lets you write custom permission logic that evaluates each request:

const session = await joinSession({
  onPermissionRequest: async (request) => {
    if (request.kind === "shell") {
      const cmd = request.fullCommandText || "";
      // Allow read-only commands, deny destructive ones
      if (/^(cat|ls|find|grep|git\s+(status|log|diff))\b/.test(cmd)) {
        return { kind: "approved" };
      }
      if (/\b(rm|del|format|mkfs)\b/.test(cmd)) {
        return { kind: "denied-by-rules" };
      }
      // Everything else — ask the user
      return { kind: "ask-user" };
    }
    if (request.kind === "write") {
      return { kind: "approved" };
    }
    return { kind: "denied-by-rules" };
  },
  onUserInputRequest: async (request) => {
    // Handle the agent's ask_user questions programmatically
    // Useful for CI environments where no human is present
    if (request.question?.includes("proceed")) {
      return { answer: "yes", wasFreeform: false };
    }
    return { answer: "skip", wasFreeform: false };
  },
  tools: [],
  hooks: {},
});

The onPermissionRequest handler receives a request with a kind field ("shell", "write", "read", etc.) and returns one of three decisions:

  • approved — tool executes immediately, no user prompt
  • denied-by-rules — tool is blocked, agent sees denial reason
  • ask-user — falls through to the standard user confirmation prompt

The onUserInputRequest handler is equally powerful. When the agent uses ask_user to pose a question (like "Should I proceed with the refactor?"), your extension can intercept and answer programmatically. This is critical for headless CI/CD environments where no human is watching the terminal. Instead of the session hanging on a prompt, your handler provides the answer automatically.

Extension Management Commands

The CLI includes built-in commands for managing extensions during a session (v1.0.5+). These are the commands I use constantly:

/extensions list           — Show all installed extensions and their status
/extensions enable <name>  — Enable a specific extension
/extensions disable <name> — Disable an extension without removing the files
/extensions reload         — Hot-reload all active extensions
/extensions info <name>    — Show extension details: registered tools, hooks, commands

The /extensions disable command is particularly useful during development. If an extension is misbehaving — crashing on every tool call, injecting bad context, or creating infinite loops — you can disable it without deleting the code. Fix the issue, then /extensions enable it again.

/extensions info shows you exactly what an extension registered: tool names, hook types, and event subscriptions. When debugging "why isn't my hook firing?" — this is the first place to check. If the hooks aren't listed, the extension didn't register them (or another extension overwrote them).

The Copilot SDK Beyond Extensions

Everything in this article uses the @github/copilot-sdk/extension import — the extension mode that attaches to a running CLI session. But the same Copilot SDK also has a standalone mode for embedding Copilot's agent runtime directly into your own applications. And that mode is available in four languages:

| Language | Install | Entry Point |
| --- | --- | --- |
| JavaScript/Node.js | npm install @github/copilot-sdk | new CopilotClient() |
| Python | pip install github-copilot-sdk | CopilotClient() |
| Go | go get github.com/github/copilot-sdk/go | copilot.NewClient() |
| .NET | dotnet add package GitHub.Copilot.SDK | new CopilotClient() |

Important distinction: these multi-language SDKs are for building standalone applications that spawn and control a Copilot CLI server process. They use CopilotClient to create sessions, send messages, and register tools. This is different from .github/extensions/, which must be .mjs files using joinSession() — the CLI only forks Node.js processes for extensions.

All four SDKs communicate over the same JSON-RPC protocol, so the concepts (tools, hooks, events, messaging) translate directly. If you've mastered extensions, you already understand the SDK's API surface — you'd just use CopilotClient instead of joinSession() and manage the CLI process lifecycle yourself.

Known Bugs and Workarounds

The extension system is powerful but still maturing. Here are the real bugs I've hit in production, with workarounds:

Hook Overwrite Bug

The issue: If multiple extensions register hooks, only the last-loaded extension's hooks actually fire. The others are silently overwritten. There's no error, no warning — your onPreToolUse hook simply never executes.

Why it happens: The CLI stores hooks in a single map keyed by hook type. Each extension registration overwrites the previous entry instead of chaining handlers.

Workaround: Designate one extension as your "hooks extension" — the single source of truth for onPreToolUse, onPostToolUse, onSessionStart, etc. All other extensions should use tools and session.on() event listeners instead of hooks. This is the most reliable architecture until the bug is fixed.

Tracking: github/copilot-cli#2076
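
If you go with a single hooks extension, you can still keep each policy in its own module and merge them before registering. The sketch below shows one way to do that for onPreToolUse. `composePreToolUse` is a hypothetical helper, not an SDK API, and the merge rules (first deny wins, argument changes accumulate) are my own choice.

```javascript
// Sketch of the "one hooks extension" pattern: many policies, one registered hook.
function composePreToolUse(...handlers) {
  return async function onPreToolUse(input) {
    let current = input;
    let changed = false;
    for (const handler of handlers) {
      const result = await handler(current);
      if (!result) continue;
      // The first denial wins and stops the chain.
      if (result.permissionDecision === "deny") return result;
      if (result.modifiedArgs) {
        // Later handlers see the arguments as modified by earlier ones.
        current = { ...current, toolArgs: result.modifiedArgs };
        changed = true;
      }
    }
    return changed ? { modifiedArgs: current.toolArgs } : undefined;
  };
}
```

Registration then looks like `hooks: { onPreToolUse: composePreToolUse(securityPolicy, loggingPolicy) }`, where both policy names are placeholders for your own handlers.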

onSessionStart Context Silently Dropped

The issue: In CLI versions before v1.0.11, the additionalContext returned from onSessionStart was fire-and-forget. The hook executed, your string was returned, and the CLI threw it away. The agent never saw your injected context.

Workaround: Update to CLI v1.0.11 or later. If you're stuck on an older version, move your startup context injection to onUserPromptSubmitted instead — it fires on the first user message and the context injection works reliably there.

Tracking: github/copilot-cli#2142
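
A sketch of that fallback: deliver the startup context exactly once, on the first prompt. `createFirstPromptContext` is a hypothetical name, and the return shape is the same additionalContext object used in the earlier hook examples.

```javascript
// Workaround sketch for older CLI versions: attach the startup context to the
// first user prompt instead of returning it from onSessionStart.
function createFirstPromptContext(startupContext) {
  let delivered = false;
  return async function onUserPromptSubmitted() {
    if (delivered) return undefined; // later prompts pass through unchanged
    delivered = true;
    return { additionalContext: startupContext };
  };
}
```

Because extension state resets on /clear, the flag resets too, so a cleared session gets the context again on its first prompt.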

Extension Load Order is Undefined

The issue: The order in which extensions are discovered and loaded from .github/extensions/ is not guaranteed. Combined with the hook overwrite bug, this means which extension's hooks actually fire can change between sessions.

Workaround: Don't rely on load order. Use the "one hooks extension" pattern. If you need guaranteed ordering, consolidate related hooks into a single extension.

The Bottom Line

Agent harnesses are how you control AI agents in production. Copilot CLI extensions give you a harness-level control surface inside the CLI itself — custom tools, lifecycle hooks, event streams, structured UI dialogs, and programmatic messaging, all in a single .mjs file that hot-reloads mid-session.

Claude Code hooks are a great start — shell commands that fire at lifecycle points. But Copilot CLI extensions are playing a different game. You're not scripting around the agent. You're extending the agent harness with persistent processes that participate in the loop, modify arguments, rewrite prompts, and make permission decisions with structured data.

What excites me most is the trajectory. In the few months since I first reverse-engineered this system, the SDK has added UI elicitation dialogs, multi-language SDKs (Python, Go, .NET), and improved event granularity. The extension surface is growing fast, and with the multi-language SDKs, teams building standalone integrations aren't locked into JavaScript anymore (extensions themselves still have to be .mjs). If you want to see these capabilities in action, I put together the full cookbook with 10+ production-ready examples covering everything from secret scanners to deployment gates.

The fact that this exists with essentially zero public documentation is genuinely shocking to me. This is the most powerful developer extensibility surface I've seen in any AI coding tool — and almost nobody knows it's there. Now you do.
