fix(usockets): snapshot loop before context unref in us_connecting_socket_free#29064
fix(usockets): snapshot loop before context unref in us_connecting_socket_free#29064cirospaciari wants to merge 1 commit intomainfrom
Conversation
…cket_free us_internal_socket_context_unlink_connecting_socket() calls us_socket_context_unref(), which can drop the last reference to the context. The next two lines then dereference c->context to reach the loop. Snapshot the loop pointer before unlink, matching the existing idiom in start_connections (context.c:596-597). Part of hardening for the v1.3.11 loop.c:238 crash signature (null deref in the DNS drain path). No behavioral change on the happy path.
|
Updated 9:45 PM PT - Apr 8th, 2026
❌ @cirospaciari, your commit 52b45bd has some failures in 🧪 To try this PR locally: bunx bun-pr 29064That installs a local version of the PR into your bun-29064 --bun |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughModified Changes
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
LGTM — snapshotting the loop pointer before the unref is the correct fix, matching the existing idiom in context.c.
Extended reasoning...
Overview
Single-line change in packages/bun-usockets/src/socket.c: snapshots c->context->loop into a local loop variable before calling us_internal_socket_context_unlink_connecting_socket(), then uses the snapshot for the two subsequent pointer writes.
Security Risks
None. This is a defensive fix against a use-after-free/dangling-pointer dereference; it does not change any security boundary or authentication path.
Level of Scrutiny
Low. The fix is mechanical and confined to four lines. The root cause is clear: us_internal_socket_context_unlink_connecting_socket() unconditionally calls us_socket_context_unref() (context.c:152), which can drop the last reference and queue the context for free. The two subsequent accesses via c->context->loop would then dereference a potentially-freed pointer. Snapshotting the loop pointer before the unlink call is the correct and minimal remedy, and it exactly mirrors the snapshot idiom already used in context.c:596-597.
Other Factors
No bugs were identified by the automated review. The change is single-file, has no side-effects on callers, and the description provides adequate rationale and testing notes (ASAN stress test, existing test suite passes).
…_callback (#29067) ## What Take an explicit \`us_socket_context_ref\` in \`us_socket_context_connect\` immediately after setting \`c->pending_resolve_callback = 1\` (before \`Bun__addrinfo_set\` exposes \`c\` to the DNS worker), and drop it at every exit of \`us_internal_socket_after_resolve\`. Snapshot \`context\`/\`ssl\` into locals at the top of \`after_resolve\` so the unref uses a stable pointer even after \`us_connecting_socket_free\`/\`close\` has unlinked \`c\`. ## Why v1.3.11 crash signature, linux x86_64_baseline: \`\`\` Segmentation fault at address 0x00000000 - loop.c:238: us_loop_run_bun_tick \`\`\` \`loop\` is the first field of \`us_socket_context_t\`, so address 0x0 means \`c->context\` was NULL when the inlined \`after_resolve\` ran \`c->context->loop->num_polls--\`. The existing invariant relies on the link-ref taken by \`us_internal_socket_context_link_connecting_socket\`; this PR makes the ref independent of link/unlink so the context is guaranteed live for the entire pending-resolve window regardless of which list \`c\` is on. Features in the report: spawn, fetch, http_client_proxy, WebSocket, abort_signal. \`Bun.spawnSync\` (whose tick changed in v1.3.11 from \`tickWithoutJS\` to \`tickTasksOnly\`) lets \`dns_ready_head\` accumulate while the JS thread is blocked, widening the window. ## Testing No deterministic repro on macOS — the production crash is on linux with the work-pool getaddrinfo path; an ASAN stress test (concurrent multi-IP WebSocket + spawnSync + abort, 400 iters) runs clean on the debug build. Existing suites pass on the debug build: socket.test.ts (29/29), node-net.test.ts (31/31+1 skip), resolve-dns.test.ts (71/71). ## Related - #29064 snapshots \`loop\` before unref in \`us_connecting_socket_free\` (independent ordering fix) - #29065 takes the ref inside \`after_resolve\` instead — this PR supersedes it by acquiring the ref before the DNS thread can observe \`c\`
us_internal_socket_context_unlink_connecting_socket() calls us_socket_context_unref(), which can drop the last reference and queue the context for free. The two lines that follow then dereference c->context->loop. Snapshot the loop pointer before unlink runs.
Matches the existing snapshot idiom in start_connections (context.c:596-597).
Part of defensive hardening for the v1.3.11 crash signature (segfault at 0x0 inside us_loop_run_bun_tick at loop.c:238, the inlined us_internal_socket_after_resolve / DNS drain path). No deterministic repro; ASAN stress test (WebSocket + cancel + spawnSync, 50 iters) runs clean on the debug build. Existing socket / node:net / DNS / proxy tests pass (proxy redirect timeouts are pre-existing on main).
Companion PR: ref/unref the context across us_internal_socket_after_resolve.