fix(apptainer): use unix socket for Redis so host:6379 can't shadow us#389

Merged: t0mdavid-m merged 1 commit into main from claude/singularity-bind-mountpoints on May 15, 2026
@t0mdavid-m t0mdavid-m commented May 15, 2026

A diagnostic from a user reproducing the workflow EROFS error revealed the real chain of failure under Singularity:

Starting Redis server (data=/tmp/openms-runtime-452993/redis)...
Redis is ready
Starting 1 RQ worker(s)...
Starting Streamlit app (cwd=/app, uid=1000)...
ERROR:root:There exists an active worker named 'worker-1' already

Apptainer/Singularity shares the host's network namespace by default. When anything on the host is already listening on 6379 (a system redis-server, a Docker container, a previous Singularity instance that didn't clean up), our `redis-server --daemonize yes` fails to bind with EADDRINUSE. Because daemonize forks before the listen error surfaces, the failure is silent: the entrypoint's parent shell sees exit status 0, and the subsequent `redis-cli ping` happily connects to the host's Redis instead.
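For diagnostics, one way to notice this kind of shadowing is to compare the PID reported by `INFO server` (its `process_id` field) with the PID of the redis-server we launched. The sketch below is an illustration only and is not part of this PR, which removes the TCP listener instead. The helper name `same_redis_pid` is hypothetical, and it assumes the expected PID is available from somewhere such as a pidfile. It also assumes both PIDs are seen from the same PID namespace, which may not hold in every container setup.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the PR: decide whether the Redis that
# answered us is the one we started, by comparing process IDs.
#   $1 = PID we expect (for example, read from our redis-server's pidfile)
#   $2 = raw text of `redis-cli INFO server`
same_redis_pid() {
    expected_pid=$1
    # INFO output uses CRLF line endings, so strip '\r' before comparing.
    actual_pid=$(printf '%s\n' "$2" | tr -d '\r' | sed -n 's/^process_id://p')
    [ -n "$actual_pid" ] && [ "$actual_pid" = "$expected_pid" ]
}

# Canned INFO output so the sketch runs without a Redis server.
sample_info=$(printf 'redis_version:7.2.4\r\nprocess_id:4242\r\ntcp_port:6379\r\n')

if same_redis_pid 4242 "$sample_info"; then
    echo "responder is our redis"
fi
if ! same_redis_pid 1000 "$sample_info"; then
    echo "responder is NOT our redis"
fi
```

In a real entrypoint the second argument would come from something like `redis-cli INFO server`, and a mismatch would be a reason to abort startup rather than continue against a foreign server.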

From there:

  • RQ tries to register `worker-1` against the host's Redis, conflicts with stale state from a previous run, and the worker dies.
  • Streamlit enqueues to the host's Redis; the workflow job is consumed by whatever stale worker is still alive on the host, which runs the mkdir outside our mount namespace (no /workspaces-streamlit-template bind there) and hits EROFS at the squashfs root.

A unix socket sidesteps the entire problem class. When the entrypoint detects a read-only root (Apptainer mode), it now starts Redis with `--unixsocket $RUNTIME_DIR/redis.sock --port 0` (no TCP listener at all) and exports `REDIS_URL=unix://<socket>`, so Streamlit's QueueManager and the RQ worker can only connect to our Redis. Docker mode is unchanged (TCP 6379 on localhost as before, no socket).
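The branch described above can be sketched roughly as follows. This is not the actual `docker/entrypoint.sh`: the helper names (`dir_is_writable`, `redis_server_args`, `redis_url`), the probe-file check for a read-only root, the fallback `RUNTIME_DIR` value, and the TCP URL `redis://localhost:6379/0` are all assumptions made for illustration. Only the `--unixsocket`, `--port 0`, and `unix://` pieces come from the description.

```shell
#!/bin/sh
# Sketch only: helper names and the probe-file approach are assumptions.

# Return 0 if a file can be created in directory $1, 1 otherwise.
dir_is_writable() {
    probe="$1/.rw-probe.$$"
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Print redis-server arguments: socket-only when $1 is set, TCP otherwise.
redis_server_args() {
    if [ -n "$1" ]; then
        # --port 0 disables the TCP listener entirely.
        printf '%s\n' "--unixsocket $1 --port 0 --daemonize yes"
    else
        printf '%s\n' "--port 6379 --daemonize yes"
    fi
}

# Print the URL that clients (QueueManager, RQ worker) should use.
redis_url() {
    if [ -n "$1" ]; then
        printf 'unix://%s\n' "$1"
    else
        # Assumed TCP URL for Docker mode; the PR does not spell it out.
        printf 'redis://localhost:6379/0\n'
    fi
}

# Demo: pick the mode, then show what would be launched and exported.
RUNTIME_DIR=${RUNTIME_DIR:-/tmp/openms-runtime-demo}
if dir_is_writable /; then
    REDIS_SOCKET=""
else
    REDIS_SOCKET="$RUNTIME_DIR/redis.sock"
fi
REDIS_URL=$(redis_url "$REDIS_SOCKET")
export REDIS_URL
echo "redis-server $(redis_server_args "$REDIS_SOCKET")"
echo "REDIS_URL=$REDIS_URL"
```

Because the socket path is absolute, the resulting URL has three slashes (`unix:///tmp/...`), which is the form redis-py based clients accept.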

Also, the entrypoint writes the resolved URL to `/tmp/openms-redis-url` so that `apptainer exec` can discover it for diagnostics (the environment doesn't propagate across exec invocations). The test-apptainer CI step now reads that marker and pings with `redis-cli -s <sock>` accordingly.
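A minimal sketch of the marker round trip, assuming the marker holds a single URL line. The function names `write_redis_marker` and `redis_cli_args_from_marker` are hypothetical and the real CI step may be structured differently; `-s <socket>` is the standard redis-cli option for unix sockets.

```shell
#!/bin/sh
# Sketch of the marker round trip; function names are hypothetical.

# Write the resolved URL ($1) to the marker file ($2, default from the PR text).
write_redis_marker() {
    printf '%s\n' "$1" > "${2:-/tmp/openms-redis-url}"
}

# Print the redis-cli connection arguments for the URL stored in marker $1.
# unix:// URLs become "-s <path>"; anything else yields no extra arguments,
# so redis-cli falls back to its default of 127.0.0.1:6379.
redis_cli_args_from_marker() {
    url=$(cat "${1:-/tmp/openms-redis-url}" 2>/dev/null) || return 1
    case "$url" in
        unix://*) printf '%s\n' "-s ${url#unix://}" ;;
        *)        printf '\n' ;;
    esac
}

# Demo using a temporary marker so nothing under /tmp is overwritten.
marker=$(mktemp)
write_redis_marker "unix:///tmp/openms-runtime-demo/redis.sock" "$marker"
echo "redis-cli $(redis_cli_args_from_marker "$marker") ping"
rm -f "$marker"
```

The sketch passes the arguments as one unquoted string, so it assumes the socket path contains no whitespace.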

Summary by CodeRabbit

  • Chores
    • Added support for Unix socket-based Redis connectivity in containerized environments, enabling Redis operation in read-only container modes.
    • Enhanced container startup to automatically detect and configure Redis connectivity with improved verification for both socket and TCP connections.

@t0mdavid-m t0mdavid-m merged commit bce2e27 into main May 15, 2026
5 of 6 checks passed
coderabbitai Bot commented May 15, 2026

Caution: Review failed. The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e43f8109-2d49-4aa8-baf4-0c175c51629f

📥 Commits

Reviewing files that changed from the base of the PR and between 9128698 and a40e0c6.

📒 Files selected for processing (2)
  • .github/workflows/build-and-test.yml
  • docker/entrypoint.sh

Walkthrough

This PR extends the Docker entrypoint and the CI workflow to support Redis over unix-domain sockets in read-only Apptainer/Singularity containers. The entrypoint configures a socket-based Redis server and writes its URL to a shared file; the workflow then reads that URL and uses it for reachability verification.

Changes

Unix Socket Redis Support for Apptainer

| Layer / File(s) | Summary |
| --- | --- |
| Docker entrypoint socket setup and startup (`docker/entrypoint.sh`) | In read-only mode, the entrypoint provisions a runtime directory and unix socket under /tmp, exports REDIS_URL in unix:// format for RQ workers, and writes the resolved URL to /tmp/openms-redis-url. Redis startup conditionally uses --unixsocket and --port 0 when REDIS_SOCKET is set, or reverts to TCP configuration otherwise. Readiness checks adapt to use socket-based redis-cli arguments in socket mode. |
| Workflow reachability verification (`.github/workflows/build-and-test.yml`) | The Apptainer workflow's Redis reachability step reads the URL from /tmp/openms-redis-url and branches on socket vs. TCP: for unix:// URLs it extracts the socket path and runs redis-cli -s <socket>, otherwise it runs default redis-cli ping. |

Possibly Related PRs

  • OpenMS/streamlit-template#387: The main PR's unix-socket URL discovery and socket-aware workflow check are a direct follow-up to that PR's introduction of Apptainer read-only mode support in docker/entrypoint.sh and the initial Apptainer CI workflow.
  • OpenMS/streamlit-template#386: Both PRs modify docker/entrypoint.sh to handle Apptainer/read-only scenarios using /tmp-based runtime state for Redis configuration.

Poem

🐰 A socket, a script, and a workflow so neat,
We discover Redis URLs at each checkpoint we meet,
Read-only modes spinning in Apptainer's embrace—
No TCP needed, just sockets in place!
