URL-test: await external plugin loopback readiness (#46)#46
Conversation
…ed 500ms URL-testing external-plugin profiles (naive/mieru/masterdnsvpn) raced the sidecar: sing-box dialed the plugin's loopback SOCKS port before it had bound, surfacing a flaky 'connection refused'. The live VPN path already polls readiness via awaitExternalProcessesReady(); the URL test still used a fixed delay(500). Reuse awaitExternalProcessesReady(strict = true) in TestInstance: poll each plugin loopback until it accepts (bounded by the connection-test timeout; MasterDnsVPN gets its longer window), and on timeout report a clear 'sidecar listener not ready' error instead of a misleading connection failure. Verified on-device: naive/mieru now show real latency (189/194ms) where they previously flaked; a genuinely slow/unbound sidecar yields the clear not-ready error.
📝 WalkthroughWalkthrough
ChangesStrict sidecar readiness for URL testing
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
Comment |
Per Greptile: in strict mode the non-MasterDnsVPN readiness window reused the full connectionTestTimeout, so a slow-binding sidecar could roughly double the perceived URL-test duration (readiness wait + the test itself). Cap the strict readiness wait at min(2s, timeout); a healthy sidecar binds well under a second. MasterDnsVPN keeps its required 60s window.
|
Addressed in 195fff1: capped the strict (URL-test) readiness window at min(2s, connectionTestTimeout) so a slow sidecar can't ~double the perceived test time. MasterDnsVPN keeps its required 60s window (its readiness is fatal, not best-effort). |
Summary
URL-test reliability fix. External-plugin profiles (naive / mieru / masterdnsvpn) intermittently
showed "connection refused" during URL Test because the test dialed the plugin's local SOCKS
loopback before the sidecar had bound it. This was a pre-existing race (proven by A/B against a
pre-#44 build) — not a regression — but worth fixing.
The live VPN-start path already waits for sidecar readiness via
awaitExternalProcessesReady()(polls each plugin loopback port until it accepts). The URL test still used a fixed
delay(500),which often lost the race under concurrent testing.
Changes
BoxInstance.awaitExternalProcessesReady(strict = false)— add astrictflag. When strict(URL test), a sidecar that never binds within the readiness window is a clear error
(
sidecar listener not ready on port(s): …) instead of the live-path behavior of logging andcontinuing (the live path keeps a long-lived connection sing-box retries; a one-shot test has
no such retry).
TestInstance.doTest()— replacedelay(500)withawaitExternalProcessesReady(strict = true).Acceptance
External-plugin URL tests should show real latency or a clear "plugin listener not ready"
error, not flaky "connection refused".
Testing
"Unavailable"/refused); a genuinely slow/unbound sidecar (MasterDnsVPN under concurrent test)
yields the clear
sidecar listener not readyerror rather than "connection refused". Nativeprofiles (Hysteria2, AmneziaWG) unaffected.
Notes
concurrent testing it can still time out and report "not ready" — which is the intended clear
signal. It connects normally when actually used (single instance, no concurrency).
Greptile Summary
Replaces the fixed
delay(500)inTestInstance.doTest()withawaitExternalProcessesReady(strict = true), eliminating the race between URL-test dialling and an external plugin sidecar binding its loopback SOCKS port.BoxInstance.awaitExternalProcessesReady(strict: Boolean = false)— adds astrictflag that (a) caps the non-MasterDnsVPN readiness window atmin(2_000, max(1_000, connectionTestTimeout))ms and (b) promotes a timeout from a warning-and-continue into a hardIOExceptionwith a descriptive message.TestInstance.doTest()— swaps the blind 500 ms sleep forawaitExternalProcessesReady(strict = true)inside the existingprocesses.processCount > 0guard; the now-unusedkotlinx.coroutines.delayimport is removed.Confidence Score: 5/5
Safe to merge — the change is narrowly scoped to the URL-test path and leaves the live VPN-start path unmodified.
Both changed files have clear, correct logic. The
strictflag is defaulted tofalseso existing callers ofawaitExternalProcessesReady()(the live-start path) are unaffected. The new 2 s cap for non-MasterDnsVPN strict mode is a straightforward improvement over the previous unbounded wait, and exception propagation throughtryResumeWithExceptioninTestInstanceis handled correctly. No data-loss, auth, or crash risk is introduced.No files require special attention.
Important Files Changed
strictparameter toawaitExternalProcessesReady(): caps readiness window at 2 s for non-MasterDnsVPN profiles and promotes a timeout to a hardIOExceptionin strict mode.delay(500)guard beforeurlTestwithawaitExternalProcessesReady(strict = true), and removes the now-unusedkotlinx.coroutines.delayimport.Sequence Diagram
%%{init: {'theme': 'neutral'}}%% sequenceDiagram participant U as Caller participant TI as TestInstance.doTest() participant BI as BoxInstance.awaitExternalProcessesReady(strict=true) participant SP as Sidecar Process (Naive/Mieru/MasterDnsVPN) participant LC as Libcore.urlTest() U->>TI: doTest() TI->>TI: init() + launch() alt "processes.processCount > 0" TI->>BI: "awaitExternalProcessesReady(strict=true)" loop poll every 50ms until deadline BI->>SP: TCP connect(loopback:port, 100ms) alt port accepts SP-->>BI: connected - remove from pending else refused SP-->>BI: IOException (not ready yet) end end alt all ports ready BI-->>TI: return (success) else "timeout and strict=true" BI-->>TI: throw IOException(sidecar listener not ready on port(s)) TI-->>U: tryResumeWithException(e) end end TI->>LC: urlTest(box, link, timeout) LC-->>TI: latency ms TI-->>U: tryResume(latency)%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% sequenceDiagram participant U as Caller participant TI as TestInstance.doTest() participant BI as BoxInstance.awaitExternalProcessesReady(strict=true) participant SP as Sidecar Process (Naive/Mieru/MasterDnsVPN) participant LC as Libcore.urlTest() U->>TI: doTest() TI->>TI: init() + launch() alt "processes.processCount > 0" TI->>BI: "awaitExternalProcessesReady(strict=true)" loop poll every 50ms until deadline BI->>SP: TCP connect(loopback:port, 100ms) alt port accepts SP-->>BI: connected - remove from pending else refused SP-->>BI: IOException (not ready yet) end end alt all ports ready BI-->>TI: return (success) else "timeout and strict=true" BI-->>TI: throw IOException(sidecar listener not ready on port(s)) TI-->>U: tryResumeWithException(e) end end TI->>LC: urlTest(box, link, timeout) LC-->>TI: latency ms TI-->>U: tryResume(latency)Comments Outside Diff (1)
app/src/main/java/io/nekohasekai/sagernet/bg/proto/BoxInstance.kt, line 279-283 (link)readinessTimeoutMsfor non-MasterDnsVPN profiles ismaxOf(1_000L, connectionTestTimeout)— the same value passed toLibcore.urlTest. In the worst case (sidecar never binds), a user with the default 3 000 ms timeout will now wait up to 3 s for readiness and then 3 s for the URL test itself, for a total of ~6 s. Previously the fixed 500 ms delay kept the total close to the configured timeout. For users who set a tighter timeout this doubling may be surprising. Consider capping the readiness window to a fraction of the URL-test timeout (e.g. half), or using a separate, smaller constant for the non-MasterDnsVPN strict case.Reviews (3): Last reviewed commit: "ci: trigger build on latest commit" | Re-trigger Greptile