Daemon startup crashes on Windows when pruning a stale state file (EPERM unlink) #94

@Rinse12

Symptom

Windows CI intermittently fails test/cli/daemon.test.ts > bitsocial daemon can use a kubo node started by another program with:

Daemon failed to start: EPERM: operation not permitted, unlink
'C:\Users\runneradmin\AppData\Local\bitsocial\Data\.daemon_states\9244-daemon.state'

The spawned daemon exits 1 and the test throws spawnAsync process exited with code '1'. Only windows-latest is affected; ubuntu and macos pass.

Root cause

On daemon startup, daemon.ts calls await pruneStaleStates() (unguarded). That walks getAliveDaemonStates() and, for each dead PID, calls deleteDaemonState(), which in turn calls fs.unlink(). The catch there only swallows ENOENT, so any other error propagates and aborts startup.

On Windows, unlinking a file that another process still has open (or that is in "delete-pending" state) returns EPERM, EACCES, or EBUSY, unlike POSIX, where unlink of an open file succeeds. Concurrent daemons share the global .daemon_states dir and race to prune the same dead-PID file; before the fix, the daemon that loses that race crashes.
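
The race can be illustrated without touching the real filesystem. The following is only a simulation under stated assumptions: makeWindowsLikeUnlink, deleteIgnoringOnlyEnoent and simulateRace are hypothetical names invented for this sketch, and the fake unlink merely imitates the Windows behavior described above (first caller wins, later callers get EPERM). It is not the project's code.

```typescript
type UnlinkFn = (path: string) => Promise<void>;

// Build an error carrying a Node-style `code` field.
function errnoError(code: string): Error & { code: string } {
  return Object.assign(new Error(`${code}: operation failed, unlink`), { code });
}

// Simulated Windows-like unlink (assumption for illustration): the first
// caller for a path succeeds, any later caller for that path gets EPERM.
function makeWindowsLikeUnlink(): UnlinkFn {
  const claimed = new Set<string>();
  return async (path) => {
    if (claimed.has(path)) throw errnoError("EPERM");
    claimed.add(path);
  };
}

// Pre-fix behavior as described in this issue: only ENOENT is swallowed.
async function deleteIgnoringOnlyEnoent(path: string, unlink: UnlinkFn): Promise<void> {
  try {
    await unlink(path);
  } catch (err) {
    if ((err as { code?: unknown }).code !== "ENOENT") throw err;
  }
}

// Two daemons prune the same dead-PID state file concurrently.
async function simulateRace(): Promise<string[]> {
  const unlink = makeWindowsLikeUnlink();
  const results = await Promise.allSettled([
    deleteIgnoringOnlyEnoent("9244-daemon.state", unlink),
    deleteIgnoringOnlyEnoent("9244-daemon.state", unlink),
  ]);
  return results.map((r) =>
    r.status === "fulfilled" ? "ok" : String((r.reason as { code?: unknown }).code),
  );
}
```

In this simulation the first pruner resolves and the second rejects with EPERM, which mirrors the startup failure in the CI log.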

Fix

Pruning a stale file is best-effort cleanup and must never be fatal. Make deleteDaemonState tolerate EPERM/EACCES/EBUSY in addition to ENOENT (another daemon reclaims the file on its next prune). Add a regression unit test that mocks fs.unlink to throw EPERM and asserts deleteDaemonState resolves.
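
A minimal sketch of what the tolerant version could look like, assuming deleteDaemonState takes the state file path. The real signature is not shown in this issue, and the injectable unlink parameter and the isIgnorableUnlinkError helper are assumptions added here so the behavior can be exercised without mocking fs.

```typescript
import { unlink as fsUnlink } from "node:fs/promises";

// Codes meaning "already gone" or "held by another process on Windows".
const IGNORABLE_UNLINK_CODES = new Set(["ENOENT", "EPERM", "EACCES", "EBUSY"]);

// Hypothetical helper: true when an unlink failure is safe to ignore
// for best-effort cleanup.
function isIgnorableUnlinkError(err: unknown): boolean {
  const code = (err as { code?: unknown } | null)?.code;
  return typeof code === "string" && IGNORABLE_UNLINK_CODES.has(code);
}

// Sketch of a tolerant deleteDaemonState. `statePath` and the optional
// `unlink` parameter are assumptions, not the project's actual signature.
async function deleteDaemonState(
  statePath: string,
  unlink: (path: string) => Promise<void> = fsUnlink,
): Promise<void> {
  try {
    await unlink(statePath);
  } catch (err) {
    // Anything outside the ignorable set is still a real failure.
    if (!isIgnorableUnlinkError(err)) throw err;
  }
}
```

A regression test along the lines described above would pass an unlink that rejects with an error whose code is EPERM and assert that deleteDaemonState resolves, while an unrelated code such as EIO should still propagate.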
