Last month I was debugging a startup regression at work. Our Node.js service went from ~300ms boot to nearly 900ms overnight. No new features. No infra changes. Just a routine dependency bump.
The usual approach? Comment out requires one by one. Bisect package.json. Stare at --cpu-prof output and pretend to understand V8 internals.
I wanted something simpler: run one command, see which module is eating my startup time, and know if the cost is in the module itself or in everything it drags in.
So I built coldstart — a zero-dependency startup profiler for Node.js that instruments Module._load, reconstructs the dependency tree, and shows you exactly where boot time goes.
Full transparency: I used Claude pretty heavily while building this — for scaffolding the ESM loader hooks, generating the flamegraph HTML template, and iterating on the tree rendering logic. The core idea (patching Module._load with performance.now() bookends) and the architecture were mine, but AI absolutely accelerated the implementation. I think that's just how a lot of solo open source gets built now, and I'd rather be upfront about it.
The problem in 30 seconds
Node.js doesn't tell you why startup is slow. You get one number — total boot time — and zero breakdown.
Meanwhile:
- A single `require('sequelize')` can silently add 400ms
- Transitive dependencies pile up — you `require` one thing, Node loads 300 modules
- Synchronous work in module scope (reading files, compiling templates, connecting to DBs) blocks the event loop before your app even starts
- Cached modules still add edges to the dependency graph, obscuring the real bottlenecks
This matters more than ever. If you're running on Lambda (where cold starts are now billed), on serverless platforms, or in containers that scale from zero — startup time is latency your users feel on the first request.
What coldstart actually does
Run it against any Node app:
npx @yetanotheraryan/coldstart server.js
You get this:
coldstart — 847ms total startup
┌─ express 234ms ████████████░░░░░░░░
│ ├─ body-parser 89ms █████░░░░░░░░░░░░░░░
│ ├─ qs 12ms █░░░░░░░░░░░░░░░░░░░
│ ├─ path-to-regexp 8ms ░░░░░░░░░░░░░░░░░░░░
├─ sequelize 401ms █████████████████████ ⚠ slow
│ ├─ pg 203ms ███████████░░░░░░░░░
│ └─ lodash 98ms █████░░░░░░░░░░░░░░░
└─ dotenv 4ms ░░░░░░░░░░░░░░░░░░░░
event loop max 42ms, p99 17ms, mean 4.3ms
modules 312 total, 59 cached
time split 286ms first-party, 503ms node_modules
The tree shows parent → child load relationships with inclusive timing (how long the whole subtree took) and bar charts colored by severity. At a glance you can see: sequelize is the problem, and within sequelize, it's pg and lodash doing the heavy lifting.
How it works under the hood
The core technique is straightforward — coldstart monkey-patches Module._load (the internal function Node calls for every require()):
- Before the original `_load` runs, record `performance.now()` and the parent module
- Let Node do its thing — resolve, compile, execute
- After `_load` returns, record the end time
- Store the raw event: `{ request, resolvedPath, parentPath, startMs, endMs, cached }`
For ESM, it uses Node's module.register() loader hooks (available in Node 18.19+) to capture resolve and load events, bridging timing data back to the main tracer through a message channel.
After your app finishes starting up, the tracer takes all those raw events and builds:
- A tree — the actual parent → child dependency graph as loaded at runtime
- Inclusive time — total wall-clock time for a module and everything it pulled in
- Exclusive time — just the module's own initialization cost, minus children
- Event loop stats — max, mean, p99 blocking during startup, using `perf_hooks`
- A split — how much time was first-party code vs `node_modules`
The distinction between inclusive and exclusive is key. A module with high inclusive but low exclusive time is just a gateway — it pulls in heavy children but isn't slow itself. High exclusive time means that specific module is doing expensive work at load time.
Three ways to use it
CLI (easiest — profiles any app):
coldstart server.js
coldstart --json server.js # machine-readable output
coldstart -- node --inspect app.js # pass node flags through
Programmatic API (embed in your own tooling):
const { monitor, renderTextReport } = require('@yetanotheraryan/coldstart')
const done = monitor()
require('./bootstrap')
require('./server')
console.log(renderTextReport(done()))
Preload mode (zero code changes):
node --require @yetanotheraryan/coldstart/register server.js
# or for ESM:
node --import @yetanotheraryan/coldstart/register server.mjs
There's also a renderFlamegraphHtml() export that generates a self-contained HTML flamegraph you can open in a browser — useful for sharing with your team or dropping into a PR description.
What I actually found at work
After running coldstart on our service, the culprit was obvious in under a second: a transitive dependency three levels deep was doing synchronous file I/O at module scope to read a config file. The dependency bump had changed its initialization path.
The fix was a one-line lazy require() that moved the load out of the critical startup path. Boot time went back to ~320ms.
Without the tree view, I'd have been bisecting for an hour.
Why not just use --cpu-prof?
--cpu-prof is great for understanding what code is running, but it doesn't answer which module load is slow or which dependency chain got you there. You get a flamegraph of V8 internals and function calls, not a map of your require() tree with timing.
coldstart is deliberately higher-level. It answers "which npm package is making my startup slow?" — not "which V8 builtin is hot."
They're complementary. Use coldstart to find the slow module, then --cpu-prof if you need to understand why that module is slow.
Current status & what's missing
Working today:
- CommonJS profiling
- ESM profiling (Node 18.19+)
- CLI, programmatic API, preload mode
- Text report, JSON report, HTML flamegraph
Not yet implemented:
- Dynamic `import()` tracing
- Watch mode for iterating on startup optimizations
- CI integration (fail if startup exceeds a threshold)
It's early. The API is stable enough for everyday use but I'm iterating on the output format and considering a few features based on what people actually need.
Try it
npm install @yetanotheraryan/coldstart
Or just run it once with npx:
npx @yetanotheraryan/coldstart your-app.js
GitHub: github.com/yetanotheraryan/coldstart
If this is useful to you, a star on the repo genuinely helps with discoverability. And if you run it on your app and find something interesting — I'd love to hear about it in the comments. What was your slowest module?
I'm Aryan — I build open source tools for Node.js on the side. You can find my other projects on GitHub.