
Watchdog v1 (compare mode) + release bundle versioning #14

Open
stephenctw wants to merge 13 commits into main from feature/watch-dog



@stephenctw stephenctw commented May 11, 2026

Collaborator

Summary

This PR delivers two related scopes as one release:

1. Watchdog v1 — compare mode

An off-chain watchdog that compares the sequencer's finalized SSZ snapshot (GET /finalized_state) against the canonical Cartesi Machine inspect output at the same L1 inclusion block.

  • Compare path: Lua runner + machine_cartesi (prod) / machine_cli (harness); advance checkpointing retained
  • SSZ parity: shared wallet_snapshot encoding between sequencer, scheduler, and CM
  • L1 partition parity: shared fixture vector (tests/fixtures/l1_partition_vector.json) for Lua + Rust
  • Divergence signal: structured watchdog_event JSON on stderr; exit codes 0 ok / 1 transient / 2 deterministic mismatch; webhook/alarm.lua removed
  • Terminal errors: no retry on state_mismatch / inclusion_block_regressed
  • E2E: watchdog_genesis_compare_test + non-genesis compare at end of deposit_transfer_withdrawal_test; CI builds lcurl.so via just watchdog-lua-deps
  • Docs: operator deployment, getting started, staging drills
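
The exit-code contract and the no-retry rule for terminal errors can be sketched as follows. This is a minimal illustration in Rust, not the PR's Lua implementation. The `Outcome` enum, the `compare` and `event_line` functions, and the JSON field layout are hypothetical. Only the exit codes 0/1/2, the `watchdog_event` name, and the error kinds `state_mismatch` and `inclusion_block_regressed` come from the description above. Mapping `inclusion_block_regressed` to exit code 2 is an assumption.

```rust
/// Outcome of one compare pass (hypothetical type, for illustration only).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Ok,
    Transient,               // e.g. sequencer temporarily unreachable
    StateMismatch,           // snapshots differ at the same inclusion block
    InclusionBlockRegressed, // sequencer reported an older block than last time
}

impl Outcome {
    /// Exit codes from the description: 0 ok / 1 transient / 2 deterministic mismatch.
    /// Assumption: both terminal kinds map to 2.
    fn exit_code(self) -> i32 {
        match self {
            Outcome::Ok => 0,
            Outcome::Transient => 1,
            Outcome::StateMismatch | Outcome::InclusionBlockRegressed => 2,
        }
    }

    /// Terminal errors are never retried.
    fn is_terminal(self) -> bool {
        matches!(self, Outcome::StateMismatch | Outcome::InclusionBlockRegressed)
    }

    fn kind(self) -> &'static str {
        match self {
            Outcome::Ok => "ok",
            Outcome::Transient => "transient",
            Outcome::StateMismatch => "state_mismatch",
            Outcome::InclusionBlockRegressed => "inclusion_block_regressed",
        }
    }
}

/// Compare the sequencer snapshot bytes with the machine inspect bytes.
fn compare(last_block: Option<u64>, block: u64, sequencer_ssz: &[u8], machine_ssz: &[u8]) -> Outcome {
    if last_block.map_or(false, |prev| block < prev) {
        return Outcome::InclusionBlockRegressed;
    }
    if sequencer_ssz != machine_ssz {
        return Outcome::StateMismatch;
    }
    Outcome::Ok
}

/// One structured line for stderr; the field names here are assumptions.
fn event_line(outcome: Outcome, block: u64) -> String {
    format!("{{\"watchdog_event\":\"{}\",\"inclusion_block\":{}}}", outcome.kind(), block)
}

fn main() {
    let outcome = compare(Some(10), 12, &[1, 2], &[1, 3]);
    eprintln!("{}", event_line(outcome, 12));
    println!("exit code {} (retry: {})", outcome.exit_code(), !outcome.is_terminal());
}
```

Keeping the retry decision and the exit code on the same type means an alerting rule only needs the process exit status, while the stderr line carries the detail.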

2. Release packaging — aligned artifact versions

Release tag vX is the bundle version for sequencer binaries, CM image tarballs, and watchdog.

  • Single pin source: release/versions.env (Rust, xgenext2fs, cartesi-machine, lua-curl)
  • CI/release: .github/actions/load-release-versions; scripts/verify-release-versions.sh in CI
  • Manifest: scripts/generate-release-manifest.sh → release-manifest-vX.json on GitHub Release
  • Watchdog image: watchdog/Dockerfile + docker save tarballs per arch; /opt/watchdog/RELEASE.json + OCI labels
  • Vendored lua-curl: full sources under watchdog/third_party/lua-curl/ (pinned via UPSTREAM / versions.env)

CARTESI_MACHINE_VERSION in versions.env must match the emulator inside the watchdog image and the one used to build canonical-machine-image-* tarballs. See release/README.md.
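
The pin check described here amounts to reading `KEY=VALUE` pairs and comparing one of them with the version an artifact reports. The sketch below assumes a plain `KEY=VALUE` file with `#` comments. The `parse_env` and `check_pin` helpers and the `0.0.0` version strings are hypothetical placeholders, and the real check lives in scripts/verify-release-versions.sh.

```rust
use std::collections::HashMap;

/// Parse `KEY=VALUE` lines, skipping blank lines and `#` comments.
/// Surrounding double quotes on the value are stripped.
fn parse_env(contents: &str) -> HashMap<String, String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(k, v)| (k.trim().to_string(), v.trim().trim_matches('"').to_string()))
        .collect()
}

/// Check that the pinned value for `key` equals the version an artifact reports.
fn check_pin(pins: &HashMap<String, String>, key: &str, actual: &str) -> Result<(), String> {
    match pins.get(key) {
        None => Err(format!("{} is not pinned", key)),
        Some(expected) if expected.as_str() == actual => Ok(()),
        Some(expected) => Err(format!("{}: pinned {}, found {}", key, expected, actual)),
    }
}

fn main() {
    // Placeholder contents; the real pins live in release/versions.env.
    let pins = parse_env("# toolchain pins\nCARTESI_MACHINE_VERSION=0.0.0\n");
    // Placeholder for the version reported by the emulator inside the image.
    let reported = "0.0.0";
    match check_pin(&pins, "CARTESI_MACHINE_VERSION", reported) {
        Ok(()) => println!("emulator version matches the pin"),
        Err(msg) => println!("version drift: {}", msg),
    }
}
```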

Historical note

Early commits introduced GET /get_state; later commits pivoted to /finalized_state + SSZ snapshot API (current design).

Test plan

  • lua watchdog/tests/run.lua
  • just test-watchdog-divergence-drill (exit 2 + watchdog_event)
  • cargo run -p rollups-e2e --bin rollups-e2e -- watchdog_genesis_compare_test --exact --nocapture
  • just test-rollups-e2e (includes deposit_transfer_withdrawal_test non-genesis compare)
  • bash scripts/verify-release-versions.sh
  • Staging compare daemon on Sepolia (operator drill 3)

Follow-ups (out of scope)

  • Enderson sign-off on exit-code / watchdog_event alerting contract
  • Port harness off machine_cli once in-process cartesi binding matches CLI archive format

@stephenctw stephenctw self-assigned this May 11, 2026
@stephenctw stephenctw force-pushed the feature/watch-dog branch from e424146 to f1796d2 May 11, 2026 14:31
@stephenctw stephenctw force-pushed the feature/watch-dog branch 2 times, most recently from bb4bbc2 to e609033 May 21, 2026 14:43
@stephenctw stephenctw marked this pull request as draft May 21, 2026 15:03
@stephenctw stephenctw force-pushed the feature/watch-dog branch 3 times, most recently from 45c89a5 to 0847b8c May 22, 2026 14:33
@stephenctw stephenctw requested a review from GCdePaula May 24, 2026 11:30
@stephenctw stephenctw marked this pull request as ready for review May 24, 2026 11:30
Wire sequencer safe-only state export, canonical CM inspect, and Lua
compare-mode E2E against Anvil devnet. Adds staging drill docs and
harness fixes (sequencer-devnet resolution, optional faketime).
@stephenctw stephenctw force-pushed the feature/watch-dog branch 2 times, most recently from a182532 to 71e7865 June 3, 2026 09:45
Replace GET /get_state with finalized snapshot routes for operator compare.
Unify wallet SSZ encoding across sequencer dumps, CM inspect, and watchdog;
share L1 partition test vector with Rust. Restructure Lua modules (lcurl
binding, machine_cartesi, sequencer_reader), add devnet-stack helper and
production-like operator runbooks.
@stephenctw stephenctw force-pushed the feature/watch-dog branch 4 times, most recently from 06fa4de to 73f7c19 June 8, 2026 12:40
@stephenctw stephenctw changed the title from "Implement watch dog" to "Implement watch dog + release packaging" Jun 8, 2026
@stephenctw stephenctw changed the title from "Implement watch dog + release packaging" to "Watchdog v1 (compare mode) + release bundle versioning" Jun 8, 2026
… e2e

Replace webhook alarms with watchdog_event JSON and exit codes 0/1/2.
Run genesis and non-genesis compare in rollups-e2e; build lcurl.so in CI
before the harness.
Track pinned Lua-cURLv3 under watchdog/third_party/lua-curl so CI and
watchdog-lua-deps can build lcurl.so without a network fetch.
Centralize toolchain pins in release/versions.env; load them in CI and
release workflows. Add release manifest generation, version verification,
watchdog Docker image build, and docker-save tarballs per arch.
@stephenctw stephenctw force-pushed the feature/watch-dog branch from 73f7c19 to 7eb4251 June 8, 2026 12:49
The staging drill intentionally exits like production on mismatch; wrap
the just target so local smoke runs succeed while direct invocation still
returns 2 for operators.
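
The wrapping described in this commit message can be pictured as a mapping from the drill's raw exit status to the wrapper's status. This is a sketch under assumptions: the actual wrapper is a just target, the `wrapper_exit` function is hypothetical, and treating a drill exit of 0 as a failure (the injected divergence went undetected) is an assumption, not something the PR states.

```rust
/// Map the drill's raw exit status to the status the local smoke wrapper reports.
fn wrapper_exit(drill_status: i32) -> i32 {
    match drill_status {
        2 => 0,         // expected: the watchdog flagged the deterministic mismatch
        0 => 1,         // assumption: a clean exit means the divergence was missed
        other => other, // pass transient or unexpected failures through unchanged
    }
}

fn main() {
    for status in [0, 1, 2] {
        println!("drill exit {} -> wrapper exit {}", status, wrapper_exit(status));
    }
}
```

Operators who invoke the drill directly still see the raw status 2, so the alerting contract is unchanged by the wrapper.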
