Usually, we assume that malloc is fast—and in most cases it is.
However, sometimes "reasonable" code can lead to very unreasonable performance.
In a previous post, I looked at stack-based allocation (VLA / fixed-size arrays) for temporary data, and in another at estimating available stack space so it can be used safely.
This time I wanted to measure the actual impact in a realistic workload.
Full Article (Medium - no paywall):
Stack vs malloc: real-world benchmark shows 2–6x difference
I built a benchmark based on a loan portfolio present-value (PV) calculation, where each loan creates several temporary arrays (thousands of elements each). This is fairly typical code: clean, modular, nothing unusual.
I compared:
- stack allocation (VLA)
- heap per-loan (malloc/free)
- heap reuse
- static (baseline)
Results:
- stack allocation stays very close to optimal
- heap per-loan can be ~2.5x slower (glibc) and up to ~6x slower (musl)
- even optimized allocators show pattern-dependent behavior
The main takeaway for me: allocation cost is usually hidden—but once it's in the hot path, it really matters.
Curious how others approach temporary workspace in performance-sensitive code.
