fix(server): prevent segfault when server JS wrapper is finalized during request handling (#28246)
After `server.stop()` downgrades the JS reference from strong to weak, GC can collect the wrapper while uWS route handlers are still registered. The old `jsValueAssertAlive()` used `.?` to unwrap the nullable value, which is undefined behavior in release builds when the value is null, causing a segfault at address 0x0.

Replace `jsValueAssertAlive()` with `tryGetJSValue()` returning `?JSValue`, and handle the null case at all call sites by returning 503 Service Unavailable (for uWS callbacks) or a rejected promise (for `fetch()`).

Fixes #22809

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
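The difference between the two unwrap patterns can be shown in a minimal, self-contained Zig snippet. This is an illustration only, not the actual server.zig code: `JSValue` and `weak_ref` are stand-ins for the real jsc types.

```zig
const std = @import("std");

// Stand-in for jsc.JSValue; the real type is opaque.
const JSValue = u32;

// Stand-in for the weak reference: after stop() + GC it reads null.
var weak_ref: ?JSValue = null;

// Old pattern: `.?` asserts non-null. In ReleaseFast/ReleaseSmall a failed
// assert is undefined behavior, which is how the 0x0 segfault arises.
fn jsValueAssertAlive() JSValue {
    return weak_ref.?;
}

// New pattern: surface the optional and force every call site to decide.
fn tryGetJSValue() ?JSValue {
    return weak_ref;
}

pub fn main() void {
    if (tryGetJSValue()) |v| {
        std.debug.print("live wrapper: {}\n", .{v});
    } else {
        std.debug.print("wrapper finalized; respond 503\n", .{});
    }
}
```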
Updated 10:35 AM PT - Mar 18th, 2026
🧪 To try this PR locally: `bunx bun-pr 28246`. That installs a local version of the PR as `bun-28246`; run it with `bun-28246 --bun`.
No actionable comments were generated in the recent review. 🎉
Walkthrough: Modified server.zig to replace the `jsValueAssertAlive()` helper with a nullable `tryGetJSValue()` and updated its call sites.
🚥 Pre-merge checks: ✅ 4 passed
```zig
const server_js = this.tryGetJSValue() orelse {
    // Server JS wrapper was finalized; reject with 503.
    resp.writeStatus("503 Service Unavailable");
    resp.endWithoutBody(true);
    return;
};
```
🔴 Bug: In onSavedRequest, the tryGetJSValue() null check (line 2143) is placed AFTER prepareJsRequestContext has already been called (line 2133), which increments pending_requests, allocates a RequestContext from the pool, creates an AbortSignal, and creates a Request object. The early 503 return at line 2147 leaks all of these resources and never decrements pending_requests, which could prevent server shutdown. Additionally, the fetch() path at line 1254 leaks a heap-allocated Request created at line 1251. Fix: move tryGetJSValue() checks before the resource allocations, matching the pattern used in onRequest (line 2103), onUserRouteRequest, and the upgrade handlers.
Extended reasoning...
Resource Leak in onSavedRequest
In onSavedRequest (line 2124), the .stack case calls prepareJsRequestContext at line 2133 to allocate request-handling resources. Looking at prepareJsRequestContext (line 2215), this function performs several allocations before returning:
- Increments `pending_requests` via `onPendingRequest()` at line 2252
- Allocates a `RequestContext` from the pool at line 2272
- Creates an `AbortSignal` with `pendingActivityRef()` at lines 2278-2280
- Creates a `Request` object at lines 2282-2288
The tryGetJSValue() null check at line 2143 happens AFTER all these allocations. When it returns null (server JS wrapper was GC-collected after stop()), the early return at line 2147 sends a 503 response but skips all cleanup that normally occurs later in the function: ctx.deinit() (line 2171), detachRequest() (line 2159), onResponse() (line 2164), and the pending_requests decrement.
Contrast with Correct Handlers
All other handlers in this PR correctly place the tryGetJSValue() check BEFORE prepareJsRequestContext. For example, onRequest at line 2103 checks tryGetJSValue() first, then calls prepareJsRequestContext at line 2110. Similarly, onUserRouteRequest (line 2055) and the upgrade handlers (line 2343) follow the same safe ordering. Only onSavedRequest has the check in the wrong position.
Step-by-Step Proof
1. A bake DevServer triggers `onSavedRequest` with a `.stack` request.
2. `prepareJsRequestContext` is called at line 2133, which increments `pending_requests` (now N+1), allocates a `RequestContext` from the pool, creates an `AbortSignal` with a pending activity ref, and creates a `Request` object.
3. At line 2143, `tryGetJSValue()` returns null because the server JS wrapper was already GC-collected after `server.stop()`.
4. The function returns early at line 2147 after writing 503.
5. `pending_requests` is still N+1 and is never decremented back to N.
6. The `RequestContext` pool allocation is never returned.
7. The `AbortSignal` `pendingActivityRef` is never released.
8. `request_object.request_context.detachRequest()` is never called.
Request Leak in fetch() Path
A secondary issue exists in the fetch() code path: Request.new(existing_request) at line 1251 heap-allocates a *Request via bun.new(). On the normal path, this is transferred to JS GC ownership via request.toJS() at line 1260. But when tryGetJSValue() returns null at line 1254, the *Request pointer is dropped without being freed. The Request struct contains allocated resources (URL string, headers, body) that are also leaked. This path is less likely to be reached in practice since fetch() is a JS prototype method requiring a live reference, but the ordering is still incorrect.
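A minimal sketch of the reordered fetch() path described above. The helper names (`tryGetJSValue`, `Request.new`, `toJS`) come from the review text; the surrounding variables and the exact rejection call are assumptions, not the actual file:

```zig
// Sketch only: resolve the JS wrapper BEFORE heap-allocating the Request,
// so nothing leaks on the null path. Names/signatures are assumptions.
const server_js = this.tryGetJSValue() orelse {
    // Wrapper already finalized: reject without ever allocating.
    return jsc.JSPromise.rejectedPromiseValue(globalThis, err_value);
};
const request = Request.new(existing_request); // now safe to allocate
const request_js = request.toJS(globalThis); // ownership transfers to JS GC
```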
Impact and Fix
The pending_requests leak is the most impactful issue: it can prevent server.stop() from resolving because deinitIfWeCan checks that pending_requests reaches 0. The pool and signal leaks waste memory. The fix is straightforward: move the tryGetJSValue() check to before the switch statement at line 2132 (before prepareJsRequestContext is called for the .stack case). For the fetch() path, move the check to before Request.new() at line 1251. This matches the pattern used in all other correctly-ordered handlers.
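The proposed reordering for `onSavedRequest` can be sketched as follows. The function shape and helper names are taken from the review; the parameter types and the dispatch step are assumptions, not the real signatures:

```zig
// Sketch only: hoist the liveness check above every allocation,
// matching the ordering already used by onRequest and onUserRouteRequest.
fn onSavedRequest(this: *ThisServer, req: *SavedRequest, resp: *App.Response) void {
    // 1) Bail out before pending_requests, the RequestContext pool slot,
    //    the AbortSignal, and the Request object are ever created.
    const server_js = this.tryGetJSValue() orelse {
        resp.writeStatus("503 Service Unavailable");
        resp.endWithoutBody(true);
        return;
    };
    // 2) Only now allocate; every resource has a live owner to clean it up.
    const ctx = this.prepareJsRequestContext(req, resp) orelse return;
    this.dispatchRequest(server_js, ctx); // hypothetical dispatch step
}
```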
```zig
/// Returns the server's JS wrapper value, or null if it has been
/// finalized (e.g. after stop() + GC). Callers that originate from
/// uWS callbacks must handle the null case gracefully (e.g. 503).
pub fn tryGetJSValue(server: *ThisServer) ?jsc.JSValue {
    return server.js_value.tryGet();
}

pub fn requestIP(this: *ThisServer, request: *jsc.WebCore.Request) bun.JSError!jsc.JSValue {
```
🟡 Nit: onNodeHTTPRequestWithUpgradeCtx (line 1886) still uses this.js_value.tryGet() orelse .js_undefined instead of the new tryGetJSValue() + 503 pattern used by every other uWS callback in this PR. While this won't segfault, passing undefined as thisObject to the JS callback during the shutdown race produces confusing JS errors instead of a clean 503. Note that onPendingRequest() is called before this line, so a fix would also need to decrement pending_requests on the early-return path.
Extended reasoning...
What the bug is
The PR introduces tryGetJSValue() and updates all uWS callback paths to return a 503 response when the server's JS wrapper has been finalized during shutdown. However, onNodeHTTPRequestWithUpgradeCtx at line 1886 still uses the old pattern:
```zig
const thisObject: JSValue = this.js_value.tryGet() orelse .js_undefined;
```

instead of:
```zig
const server_js = this.tryGetJSValue() orelse {
    resp.writeStatus("503 Service Unavailable");
    resp.endWithoutBody(true);
    return;
};
```

Code path that triggers it
This function is called from two places: onNodeHTTPRequest (line 2018) and the upgrade handler (line 2368). Both are uWS callbacks that can fire after server.stop() + GC collects the JS wrapper. When this.js_value.tryGet() returns null, thisObject becomes .js_undefined, which is then passed as the thisObject argument to the Node HTTP request handler JS function at line 1899-1912.
Why this doesn't crash but is inconsistent
The code was already safe from segfault before this PR — it used tryGet() orelse .js_undefined rather than the dangerous .? unwrap that jsValueAssertAlive() used. The JS callback at line 1912 catches exceptions via globalThis.takeException(), so the undefined thisObject will likely produce a caught JS error. However, this is inconsistent with every other handler in the PR, which returns a clean 503 to the client.
Complication with pending_requests
Note that onPendingRequest() is called at line 1873 (incrementing this.pending_requests) before the js_value check at line 1886. Compare this with the handler at line 2378-2383 where tryGetJSValue() is checked before pending_requests += 1. If this call site were updated to return 503 early, the fix would also need to either: (a) move the onPendingRequest() call after the check, or (b) decrement pending_requests on the early-return path. This is likely why the call site was not updated — it requires slightly more restructuring than the others.
Step-by-step proof
1. User creates a Node HTTP server with `Bun.serve()` using the node:http compatibility path (`onNodeHTTPRequest` callback).
2. User calls `server.stop()`, which downgrades the JS reference to weak.
3. GC runs and collects the JS wrapper, making `this.js_value.tryGet()` return null.
4. A new HTTP request arrives; uWS dispatches to `onNodeHTTPRequest` → `onNodeHTTPRequestWithUpgradeCtx`.
5. Line 1873: `pending_requests` is incremented.
6. Line 1886: `this.js_value.tryGet()` returns null; `thisObject` becomes `.js_undefined`.
7. Lines 1899-1912: The Node HTTP handler is called with `undefined` as `thisObject`, producing a confusing JS error instead of a 503.
8. All other handlers in the PR would have returned 503 at step 6.
Addressing the refutation
One verifier correctly notes this function was not modified by the PR and was never using jsValueAssertAlive(). This is true — the code was already null-safe. However, the PR's stated goal is "All 8 call sites now handle the null case gracefully," and this is a 9th call site facing the identical race condition that the PR addresses everywhere else. The severity is appropriately a nit since it's pre-existing and won't crash.
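One way to bring this ninth call site in line with the rest, following option (b) above, can be sketched like this. The direct `pending_requests -= 1` decrement is an assumption; the real code may expose a dedicated helper for it:

```zig
// Sketch of option (b): keep the existing onPendingRequest() call at line
// 1873 and undo its increment on the early-return path. The decrement
// mechanism here is an assumption, not the actual counter API.
const thisObject: JSValue = this.tryGetJSValue() orelse {
    this.pending_requests -= 1; // balance the earlier onPendingRequest()
    resp.writeStatus("503 Service Unavailable");
    resp.endWithoutBody(true);
    return;
};
```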
Summary

- The crash path: `server.stop()` + GC collects the server's JS wrapper while uWS handlers remain registered.
- Replaced `jsValueAssertAlive()` (which used `.?` unwrap, UB when null in release builds) with the safe `tryGetJSValue()` returning `?JSValue`.
- uWS callbacks now return 503 Service Unavailable when the wrapper is gone; the `fetch()` path returns a rejected promise.

Test plan

- New regression test: `test/regression/issue/22809.test.ts`
- Existing tests: `bun bd test -t "should be able to await server.stop"`, `bun bd test -t "should be able to abruptly stop"`

Fixes #22809
🤖 Generated with Claude Code