Flaky daemon tests: bind-race retry predicate misses the daemon's 'PKC RPC port became occupied' fail-fast (lineage #87) #97

@Rinse12

Symptom

Different tests in test/cli/daemon.test.ts go red on ubuntu/macOS on roughly every push. Latest: bitsocial daemon kubo restart cleanup > stops kubo when daemon exits during a restart cycle (run 27457703801, ubuntu), which failed at startup:

Daemon failed to start: PKC RPC port localhost:37141 became occupied before the daemon could bind it.

The daemon subprocess exits 1 before the scenario under test ever runs.

Root cause

Two layers:

  1. TOCTOU port allocation (#87, "Test flake: hardcoded kubo API ports in macOS ephemeral range cause intermittent 'address already in use'"): allocateFreePort binds :0, reads the port, closes the socket, then hands the number to the daemon. Under fileParallelism (many test forks, each spawning kubo plus a daemon on a 2-vCPU runner), another process can grab the port in the close->rebind window.

  2. Retry net has a hole where the daemon's own race-detector fires (actionable): startPkcDaemonWithDynamicPorts retries with a fresh port set, but only when the error matches isAddressInUseError:

    // test/helpers/daemon-helpers.ts:258
    return /address already in use|EADDRINUSE/i.test(message);

    The daemon's pre-bind TOCTOU guard (src/cli/commands/daemon.ts:486) throws a different message, "PKC RPC port … became occupied before the daemon could bind it", which matches neither alternative in the regex. So isAddressInUseError returns false, the wrapper rethrows instead of retrying, and the test fails hard.
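For readers unfamiliar with the pattern in layer 1, here is a minimal sketch of a bind-then-release port probe. It is a hypothetical reconstruction from the description above, not the repository's allocateFreePort; the name allocateFreePortSketch and the 127.0.0.1 host are assumptions.

```typescript
import * as net from "node:net";

// Hypothetical reconstruction of the probe pattern, not the project's helper.
function allocateFreePortSketch(): Promise<number> {
  return new Promise((resolve, reject) => {
    const probe = net.createServer();
    probe.once("error", reject);
    // Port 0 asks the OS to pick any port that is free right now.
    probe.listen(0, "127.0.0.1", () => {
      const { port } = probe.address() as net.AddressInfo;
      // Closing releases the port. From this point until the daemon binds it,
      // nothing reserves the number, so any other process may claim it.
      probe.close((err) => (err ? reject(err) : resolve(port)));
    });
  });
}
```

The returned number is only a hint that the port was free at probe time. The gap between close and the daemon's own bind is the window the parallel test forks race in.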

A fail-fast guard and the retry predicate meant to catch it have drifted out of sync.
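The drift can be reproduced in isolation. The sketch below reuses the regex quoted from test/helpers/daemon-helpers.ts; the function signature (taking a message string) and the sample kernel-level error text are assumptions for illustration.

```typescript
// Regex as quoted above; the surrounding function shape is assumed.
function isAddressInUseError(message: string): boolean {
  return /address already in use|EADDRINUSE/i.test(message);
}

// Sample of a typical OS-level bind failure (illustrative wording).
const kernelBindFailure = "listen EADDRINUSE: address already in use 127.0.0.1:37141";

// The daemon's fail-fast message, as seen in the failing run.
const daemonFailFast =
  "Daemon failed to start: PKC RPC port localhost:37141 became occupied before the daemon could bind it.";

console.log(isAddressInUseError(kernelBindFailure)); // true
console.log(isAddressInUseError(daemonFailFast)); // false
```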

Fix

Extend isAddressInUseError to also recognise the daemon's fail-fast wording so a lost RPC-port race retries like any other bind race:

return /address already in use|EADDRINUSE|became occupied before the daemon could bind it/i.test(message);

Test-helper-only; no production behaviour change. Verify with a full local npm run test:cli.
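A rough sketch of how the extended predicate would interact with a retry loop. startWithRetry is a hypothetical stand-in for the retry logic in startPkcDaemonWithDynamicPorts, whose real signature and attempt limit are not shown in this issue; maxAttempts = 3 is an assumption.

```typescript
// Extended predicate from the proposed fix (message-string signature assumed).
function isAddressInUseError(message: string): boolean {
  return /address already in use|EADDRINUSE|became occupied before the daemon could bind it/i.test(message);
}

// Hypothetical retry wrapper: `start` is expected to allocate a fresh port set
// on every call and resolve once the daemon is up.
async function startWithRetry<T>(
  start: (attempt: number) => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await start(attempt);
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      // Unrelated failures are rethrown immediately, unchanged.
      if (!isAddressInUseError(message)) throw error;
      lastError = error; // lost a bind race: loop and try again
    }
  }
  // Every attempt lost the race; surface the last bind error.
  throw lastError;
}
```

With this shape, a lost RPC-port race costs one extra attempt instead of failing the test, while errors unrelated to port binding still surface on the first throw.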
