Commit bbf7d27

blog: clarify in async hook DoS post and add CWE pointers
It seems there is still some confusion about how this weakness works, especially since APM tools are only part of the reproduction and are not vulnerable per se. This patch tries to clarify the post and adds pointers to the CWEs that apply.
1 parent 7a949e8 commit bbf7d27

1 file changed: +18 -17 lines changed

apps/site/pages/en/blog/vulnerability/january-2026-dos-mitigation-async-hooks.md

Lines changed: 18 additions & 17 deletions
@@ -9,22 +9,20 @@ author: Matteo Collina and Joyee Cheung
 
 ## TL;DR
 
-Node.js/V8 makes a best-effort attempt to recover from stack space exhaustion with a catchable error, which frameworks have come to rely on for service availability. A bug that only reproduces when `async_hooks` are used would break this attempt, causing Node.js to exit with 7 directly without throwing a catchable error when recursions in user code exhaust the stack space. This makes applications whose recursion depth is controlled by unsanitized input vulnerable to Denial-of-Service attacks. This silently affects countless applications because:
+Node.js/V8 makes a best-effort attempt to recover from stack space exhaustion with a catchable error, which frameworks have come to rely on for service availability. An edge case that reproduces only when `async_hooks` are enabled breaks this recovery path: when recursion in user code exhausts stack space, Node.js exits immediately with exit code 7 instead of throwing a recoverable error. This can be reproduced in countless applications because:
 
 - **React Server Components** use `AsyncLocalStorage`
 - **Next.js** uses `AsyncLocalStorage` for request context tracking
-- **Other frameworks** using `AsyncLocalStorage` for request context may also be affected
-- **Every APM tool** (Datadog, New Relic, Dynatrace, Elastic APM, OpenTelemetry) uses `AsyncLocalStorage` or `async_hooks.createHook` to trace requests
+- **Other frameworks** that use `AsyncLocalStorage` for request context may also be affected
+- **Most APM tools** (Datadog, New Relic, Dynatrace, Elastic APM, OpenTelemetry) use `AsyncLocalStorage` or `async_hooks.createHook` to trace requests
 
-Due to the prevalence of this usage pattern in frameworks, including but not limited to React/Next.js, a significant part of the ecosystem is expected to be affected.
-
-The bug fix is included in a security release because of its widespread impact on the ecosystem. However, this is only a mitigation for the general risk that lies in the ecosystem's dependence on recoverable stack space exhaustion for service availability.
+The weakness ultimately lies in the ecosystem's reliance on an unspecified behavior in the language - recovery from stack space exhaustion - for service availability ([CWE-758](https://cwe.mitre.org/data/definitions/758.html)). Given the widespread use of `async_hooks` by popular frameworks and APM tools, the aforementioned edge case can expose this weakness more frequently and can present a Denial-of-Service vector for many applications. Node.js shipped a mitigation in the January 2026 security release to make this unspecified behavior more consistent, reducing the chance of reproduction. However, the weakness remains in the ecosystem until applications and frameworks move away from relying on unspecified behavior for availability.
 
 **For users of these frameworks/tools and server hosting providers**: Update as soon as possible.
 
-**For libraries and frameworks**: apply a more robust defense against stack space exhaustion to ensure service availability (e.g., limit recursion depth or avoid recursions if the depth can be controlled by an attacker). A recoverable `RangeError: Maximum call stack size exceeded` is only an unspecified behavior implemented on a best-effort basis, and cannot be depended on for security guarantees.
+**For libraries and frameworks**: apply more robust defenses against stack space exhaustion to ensure service availability (e.g., limit recursion depth or avoid recursion if the depth can be controlled by an attacker). A recoverable `RangeError: Maximum call stack size exceeded` is only an unspecified behavior maintained on a best-effort basis, and cannot be depended on for security guarantees.
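One way to limit recursion depth, as the advice to libraries and frameworks suggests, is to carry an explicit depth counter and reject input long before the engine's stack limit is approached. The sketch below is illustrative only: `countLeaves`, `nest`, and the `MAX_DEPTH` value of 1000 are hypothetical names and numbers, not part of Node.js or any framework.

```javascript
// Hypothetical limit for illustration; tune it for your own payloads.
const MAX_DEPTH = 1000;

// Recursively counts primitive leaves in an already-parsed JSON value.
// It refuses to descend past MAX_DEPTH instead of relying on the engine
// to throw a recoverable RangeError when the stack runs out.
function countLeaves(value, depth = 0) {
  if (depth > MAX_DEPTH) {
    throw new Error(`Input nested deeper than ${MAX_DEPTH} levels`);
  }
  if (value === null || typeof value !== 'object') {
    return 1;
  }
  let total = 0;
  for (const child of Object.values(value)) {
    total += countLeaves(child, depth + 1);
  }
  return total;
}

// Builds an array nested `depth` levels deep, without recursion.
function nest(depth) {
  let value = 0;
  for (let i = 0; i < depth; i++) value = [value];
  return value;
}

console.log(countLeaves({ a: [1, 2, { b: 3 }] })); // 3

try {
  countLeaves(nest(5000));
} catch (err) {
  // An ordinary, application-defined error that can be mapped to a 4xx response.
  console.log(err.message);
}
```

The rejection here is an ordinary error thrown by application code at a known depth, so it does not depend on how the runtime behaves when stack space is actually exhausted.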
 
-## The Bug
+## The Reproduction
 
 When a stack overflow occurs in user code while `async_hooks` is enabled, Node.js **immediately exits with code `7`** instead of allowing `try-catch` blocks to catch the error. This is a special condition in Node.js that skips the `process.on('uncaughtException')` handlers, making the exception uncatchable.
 
@@ -169,10 +167,10 @@ A user sending deeply nested JSON can crash your entire server:
 ]
 ```
 
-**Without `async_hooks`**: `try-catch` catches the `RangeError`, returns 500, server continues
-**With `async_hooks` (React/Next.js)**: Server crashes immediately with exit code 7
+- **Without `async_hooks`**: `try-catch` catches the `RangeError`, returns 500, server continues
+- **With `async_hooks` (React/Next.js)**: Server crashes immediately with exit code 7
 
-## Why This Affects Every APM User
+## Why the Use of APM Tools Makes It Easier to Reproduce
 
 Application Performance Monitoring (APM) tools are essential infrastructure for production applications. They track request latency, identify bottlenecks, trace errors to their source, and alert teams when something goes wrong. Companies use APM tools like Datadog, New Relic, Dynatrace, Elastic APM, and OpenTelemetry to maintain visibility into their distributed systems.
 
@@ -186,15 +184,18 @@ The irony is notable: the tools you install to monitor and debug crashes can mak
 
 While this issue has significant practical impact, we want to be clear about why Node.js is treating this fix as a mere mitigation of security vulnerability risks at large:
 
-### Stack Space Exhaustion Is Not Specified Behavior
+### Stack Space Exhaustion Behavior Is Not Specified
+
+The "Maximum call stack size exceeded" error is not part of the ECMAScript specification. [The specification does not impose any limit and assumes infinite stack space](https://tc39.es/ecma262/#execution-context-stack). Imposing a limit and throwing a recoverable error are only behaviors that JavaScript engines implement on a best-effort basis. Applications and frameworks are already at risk when they build a security model on top of these unspecified behaviors, which are not guaranteed to reproduce consistently. See:
 
-The "Maximum call stack size exceeded" error is not part of the ECMAScript specification. [The specification does not impose any limit, assuming infinite stack space](https://tc39.es/ecma262/#execution-context-stack); imposing a limit and throwing an error is simply behavior that JavaScript engines implement on a best-effort basis. Building a security model on top of an undocumented, unspecified feature that isn't guaranteed to work consistently would be unreliable.
+- [CWE-758: Reliance on Undefined, Unspecified, or Implementation-Defined Behavior](https://cwe.mitre.org/data/definitions/758.html)
+- [CWE-674: Uncontrolled Recursion](https://cwe.mitre.org/data/definitions/674.html)
 
 It's worth noting that even when ECMAScript specifies that [proper tail calls](https://tc39.es/ecma262/#sec-tail-position-calls) [should reuse stack frames](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Execution_model#tail_calls), this is not implemented by most JavaScript engines today, including V8. And in the few JavaScript engines that do implement it, proper tail calls can block an application with infinite recursion instead of hitting the stack size limit at some point and stopping with an error, which is also a Denial-of-Service factor. This reinforces that stack overflow behavior cannot be relied upon for defending against Denial-of-Service attacks.
 
-### V8 Doesn't Treat This as a Security Issue
+### This Behavior Is Not Part of the Security Guarantees of V8
 
-The stack space handling in Node.js is primarily implemented by V8. JavaScript engines developed for browsers have a different security model, and they do not treat crashes like this as security vulnerabilities ([example](https://issues.chromium.org/issues/432385241)). This means similar bugs reported in the upstream will not go through vulnerability disclosure procedures, making any security classification by Node.js alone ineffective.
+In Node.js, the stack space usage from JavaScript function calls is primarily implemented by V8. JavaScript engines developed for browsers have a different security model, and they do not triage crashes like this as security vulnerabilities. This means similar behavior inconsistencies reported upstream (like [this](https://issues.chromium.org/issues/432385241)) are not guaranteed to go through vulnerability disclosure procedures, making any security classification by Node.js alone ineffective.
 
 ### uncaughtException Limitations
 
@@ -204,8 +205,8 @@ Trying to invoke the handler after the call stack size is exceeded would itself
 
 ### Why We Put It In a Security Release
 
-Although it is a bug fix for an unspecified behavior, we chose to include it in the security release because of its widespread impact on the ecosystem.
-React Server Components, Next.js, and virtually every APM tool are affected. The fix improves developer experience and makes error handling more predictable.
+Although it is a patch for an unspecified behavior, we chose to include it in the security release because of its widespread impact on the ecosystem. The prevalence of `async_hooks` usage by React Server Components, Next.js, and APM tools makes it a practical Denial-of-Service vector for many applications.
+Making the unspecified behavior more consistent in Node.js improves developer experience and makes error handling more predictable.
 
 However, it's important to note that we were fortunate to be able to fix this particular case. There's no guarantee that similar edge cases involving stack overflow and `async_hooks` can always be addressed. **For mission-critical paths that must defend against infinite recursion or stack overflow from recursion whose depth can be controlled by an attacker, always sanitize the input or impose a limit on the depth of recursion by other means**.
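Where input is sanitized before it reaches recursive code, an already-parsed value can be checked for excessive nesting with an explicit stack kept on the heap, so call-stack usage stays constant regardless of what an attacker sends. A minimal sketch, assuming a hypothetical helper `exceedsDepth` and an arbitrary limit of 64 (neither comes from Node.js or the post):

```javascript
// Returns true if `value` nests objects or arrays more than `limit` levels deep.
// The traversal uses an explicit array as its work stack, so it never recurses.
function exceedsDepth(value, limit) {
  const stack = [{ node: value, depth: 1 }];
  while (stack.length > 0) {
    const { node, depth } = stack.pop();
    // Primitives and null do not add nesting.
    if (node === null || typeof node !== 'object') continue;
    if (depth > limit) return true;
    for (const child of Object.values(node)) {
      stack.push({ node: child, depth: depth + 1 });
    }
  }
  return false;
}

// Build a very deeply nested array iteratively to simulate hostile input.
let deep = [];
for (let i = 0; i < 100000; i++) deep = [deep];

console.log(exceedsDepth(deep, 64)); // true
console.log(exceedsDepth({ a: [1, { b: 2 }] }, 64)); // false
```

A request handler could run such a check first and answer with a client error when it returns `true`, so later recursive processing only ever sees input of bounded depth.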