feat: add healthcheck to execution service and wait condition for consensus by erhnysr · Pull Request #1005 · base/node

erhnysr · 2026-04-11T14:15:58Z

Problem

When running docker compose up, the node service (consensus client) starts immediately after the execution container is created — not after it's actually ready to serve requests. On first boot or after a restart with a large database, the execution client can take 30–120 seconds before its JSON-RPC becomes available. During this window the consensus service repeatedly fails to connect and enters a crash-loop.

This is a frequently reported issue in the #🛠｜node-operators Discord channel.

Solution

- Add a healthcheck to the execution service that polls eth_syncing via JSON-RPC. The check passes as soon as the RPC endpoint responds (the node does not need to be fully synced, only started).
- Change depends_on on the node service to condition: service_healthy so the consensus client only starts once the execution client's RPC is live.

Healthcheck parameters

| Parameter | Value | Reason |
| --- | --- | --- |
| `interval` | 30s | Re-poll every 30 seconds |
| `timeout` | 10s | Single-request timeout |
| `retries` | 5 | Mark unhealthy after 5 consecutive failures |
| `start_period` | 60s | Grace window for slow DB init on first boot |
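
For illustration, the two changes and the parameters above could look roughly like this in a Compose file. This is a sketch, not the actual diff: the service names `execution` and `node` follow the description, while the use of `curl` inside the image and the RPC address `http://localhost:8545` are assumptions that may not match the repository.

```yaml
services:
  execution:
    healthcheck:
      # Assumption: curl is available in the execution image and the
      # JSON-RPC server listens on port 8545 inside the container.
      # Any HTTP response below 400 counts as healthy; the sync state
      # returned by eth_syncing is not inspected.
      test:
        - CMD-SHELL
        - >-
          curl -sf -X POST -H 'Content-Type: application/json'
          --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'
          http://localhost:8545 || exit 1
      interval: 30s      # re-poll every 30 seconds
      timeout: 10s       # single-request timeout
      retries: 5         # unhealthy after 5 consecutive failures
      start_period: 60s  # failures in this window do not count toward retries

  node:
    depends_on:
      execution:
        # Start the consensus client only once the healthcheck has passed.
        condition: service_healthy
```

One consequence of these values: Docker runs the first probe one `interval` after the container starts, so the node service would wait at least about 30 seconds even when the execution client is ready sooner.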

Testing

Verified with CLIENT=reth and CLIENT=geth. The node service now waits correctly on fresh starts and after docker compose restart execution.

Backwards compatibility

No changes to .env files or entrypoints. Existing deployments require no migration.

…sensus

Previously the consensus node would start immediately after the execution client container was created, without waiting for its JSON-RPC to become available. On first boot or after a restart, this caused the consensus service to crash-loop while the execution client was still initialising.

Changes:

- Add healthcheck to execution service that polls eth_syncing via JSON-RPC. The check passes as soon as the RPC endpoint responds, confirming the client is fully booted (node does not need to be fully synced).
- Change depends_on on the node service to condition: service_healthy so the consensus client only starts once the execution client is ready.

Healthcheck parameters:

- interval: 30s - re-poll every 30 seconds
- timeout: 10s - single-request timeout
- retries: 5 - mark unhealthy after 5 consecutive failures
- start_period: 60s - grace window for slow database init on first boot

Backwards-compatible: no changes to .env files or entrypoints required.

cb-heimdall · 2026-04-11T14:16:02Z

🟡 Heimdall Review Status

| Requirement | Status | More Info |
| --- | --- | --- |
| Reviews | 🟡 0/1 | Denominator calculation |

Denominator calculation:

- 1 if user is bot: 0
- 1 if user is external: 0
- 2 if repo is sensitive: 0
- From .codeflow.yml: 1
- Additional review requirements:
- Max: 0
- 0
- From CODEOWNERS: 0
- Global minimum: 0
- Max: 1
- 1
- 1 if commit is unverified: 1
- Sum: 2

erhnysr wants to merge 1 commit into base:main from erhnysr:feat/docker-healthcheck-execution-client
