fix(apptainer): use unix socket for Redis so host:6379 can't shadow us #389
Diagnostic from a user reproducing the workflow EROFS revealed the real chain of failure under singularity:

    Starting Redis server (data=/tmp/openms-runtime-452993/redis)...
    Redis is ready
    Starting 1 RQ worker(s)...
    Starting Streamlit app (cwd=/app, uid=1000)...
    ERROR:root:There exists an active worker named 'worker-1' already

Apptainer/singularity share the host's network namespace by default. When the host has anything listening on 6379 (a system redis-server, a docker container, a previous singularity instance that didn't clean up), our `redis-server --daemonize yes` silently fails to bind with EADDRINUSE, but because daemonize forks before the listen-error surfaces, the entrypoint's parent shell returns 0 and the subsequent `redis-cli ping` happily connects to the *host's* redis instead.

From there:

- RQ tries to register `worker-1` against the host's redis → conflicts with stale state from a previous run, the worker dies.
- Streamlit enqueues to the host's redis; the workflow job is consumed by whatever stale worker is still alive on the host, which runs the mkdir outside our mount namespace (no /workspaces-streamlit-template bind there) and hits EROFS at the squashfs root.

Unix-socket sidesteps the entire problem class: when the entrypoint detects read-only-root (apptainer mode), it now starts redis with `--unixsocket $RUNTIME_DIR/redis.sock --port 0` (no TCP listener at all) and exports `REDIS_URL=unix://<socket>` so streamlit's QueueManager and the RQ worker can only connect to *our* redis. docker mode is unchanged (TCP 6379 on localhost as before, no socket).

Also: write the resolved URL to /tmp/openms-redis-url so `apptainer exec` can discover it for diagnostics (env doesn't propagate across exec invocations). The test-apptainer CI step now reads that marker and pings with `redis-cli -s <sock>` accordingly.
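For readers who want to see the shape of the mode switch, here is a minimal sketch. It is not the PR's actual entrypoint code. The function name `choose_redis_endpoint`, the writability probe argument, and the Docker-mode URL `redis://localhost:6379` are assumptions for illustration. Only the `--unixsocket ... --port 0` arguments, the `unix://` URL form, and the `/tmp/openms-redis-url` marker come from the description above.

```shell
#!/bin/sh
# Sketch only: names and the Docker-mode values are assumptions, not the PR's code.

# Sets REDIS_ARGS and REDIS_URL for the given runtime directory.
#   $1: writable runtime directory (e.g. somewhere under /tmp)
#   $2: path probed for writability; defaults to "/" (assumed detection method)
choose_redis_endpoint() {
    runtime_dir=$1
    probe=${2:-/}
    if [ -w "$probe" ]; then
        # Writable root: treat as Docker mode and keep TCP on localhost.
        REDIS_ARGS="--port 6379"
        REDIS_URL="redis://localhost:6379"
    else
        # Read-only root: treat as Apptainer mode, with no TCP listener at all,
        # so nothing on the host's port 6379 can be mistaken for our server.
        REDIS_ARGS="--unixsocket $runtime_dir/redis.sock --port 0"
        REDIS_URL="unix://$runtime_dir/redis.sock"
    fi
    export REDIS_URL
}

# Possible use in an entrypoint (commented out so the sketch has no side effects):
#   choose_redis_endpoint "$RUNTIME_DIR"
#   redis-server --daemonize yes --dir "$RUNTIME_DIR/redis" $REDIS_ARGS
#   printf '%s\n' "$REDIS_URL" > /tmp/openms-redis-url
```

Because the socket path lives inside the per-run runtime directory, a stale server from an earlier run would sit at a different path, which is the property the description relies on.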
Walkthrough

This PR extends the Docker entrypoint and the CI workflow to support Redis over unix-domain sockets in read-only Apptainer/Singularity containers. The entrypoint configures a socket-based Redis server and writes its URL to a shared file; the workflow then reads and uses that URL for reachability verification.
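The reachability check could look roughly like the following sketch. The helper name `redis_ping_cmd` is hypothetical, and the actual CI step may be written differently. It only shows how a URL read from the marker file maps onto a `redis-cli` invocation: `-s <sock>` for a unix socket, `-u <url>` otherwise.

```shell
#!/bin/sh
# Sketch only: redis_ping_cmd is a hypothetical helper, not the workflow's code.

# Prints the redis-cli command that pings the server behind the given URL.
#   $1: a URL as written to the marker file, e.g. unix:///path/redis.sock
redis_ping_cmd() {
    url=$1
    case $url in
        unix://*)
            # Strip the scheme to get the socket path for redis-cli -s.
            printf 'redis-cli -s %s ping\n' "${url#unix://}"
            ;;
        *)
            # Any other scheme (e.g. redis://) is passed through with -u.
            printf 'redis-cli -u %s ping\n' "$url"
            ;;
    esac
}

# Possible use from `apptainer exec` (commented out; needs a running instance):
#   url=$(cat /tmp/openms-redis-url)
#   $(redis_ping_cmd "$url")
```

Reading the marker file rather than the environment matters here because, as the description notes, env does not propagate across exec invocations.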