test(DON'T MERGE): develop-v2.0.0-rc.2 by shuklaayush · Pull Request #2846 · openvm-org/openvm

shuklaayush · 2026-06-05T17:41:54Z

No description provided.

resolves int-6773

I hope this works, but the idea is that stuff merged to develop-v2.0.0-beta will then rebase develop-v2.0.0-rc.1. On main, we do not include develop-v2.0.0-rc.1, so a merge to main should set off a chain of rebases: main -> develop-v2.0.0-beta -> develop-v2.0.0-rc.1

Resolves INT-6673. When any dimension has size of 1, the T6 odometer carry constraint degenerates because "wrap" and "stay" produce the same diff (0), breaking the completeness guarantee of lexicographic enumeration. Add an assert in the Chip constructor to reject this configuration, and a test to verify it panics. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
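The degenerate case described above can be sketched as follows. This is only an illustration: `OdometerChip`, `dims`, `wrap_diff`, and `stay_diff` are hypothetical names, not the real chip API; the only behavior taken from the description is that a dimension of size 1 is rejected in the constructor.

```rust
/// Hypothetical stand-in for the chip; the real type and fields may differ.
pub struct OdometerChip {
    dims: Vec<usize>,
}

impl OdometerChip {
    /// Panics if any dimension has size exactly 1, because the carry
    /// constraint cannot tell "wrap" from "stay" in that dimension.
    pub fn new(dims: Vec<usize>) -> Self {
        assert!(
            dims.iter().all(|&d| d != 1),
            "dimensions of size 1 are not supported, got {:?}",
            dims
        );
        Self { dims }
    }

    pub fn dims(&self) -> &[usize] {
        &self.dims
    }
}

/// Difference `next - current` of a digit in a dimension of size `d`
/// when the digit wraps from `d - 1` back to `0`.
pub fn wrap_diff(d: usize) -> i64 {
    -(d as i64 - 1)
}

/// A digit that stays in place always has difference 0.
pub fn stay_diff() -> i64 {
    0
}
```

For `d = 1` both helpers return 0, which is exactly the ambiguity the assert rules out; for any larger `d` the two diffs differ.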

This resolves INT-6777

This resolves INT-6774. `Eq3bAir` now maintains `n_logup` as well, constrains both it and `n_lift` to be constant across the AIR, and receives them once per AIR.

## Summary

- Removed `openvm/` prefix from source paths across 8 documentation files (21 occurrences)
- Updated stale `v2-proof-system` repository reference in `docs/crates/recursion/README.md` to reflect the current openvm repo
- Updated `stark-backend` path in `docs/crates/recursion/verifier-mapping.md` to link to the [stark-backend GitHub repo](https://github.com/openvm-org/stark-backend)

Fixes issues identified in #2553 (comment)

## Test plan

- [x] Verified no remaining `openvm/crates/recursion/src/` prefixes exist
- [x] Verified no remaining `v2-proof-system` references exist
- [x] Verified no remaining `stark-backend/crates/` local path references exist

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Jonathan Wang <jonathanpwang@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

This resolves INT-6814

This resolves INT-6828. Now `Eq3bAir` propagates the `row_idx` by the correct amount for each AIR. Note that there is no need for this air to know `air_idx` or anything other than the number of interactions and the `n_lift` for this air, so no new columns are added (but we now do the interaction with proof shape air on the last row of the air instead of the first one).

…2563)

## Summary

- Add a `changes` detection job using `dorny/paths-filter@v3` to skip the `lint-cuda` job when no CUDA-related files (`*.cu`, `*.cuh`, `**/cuda/**`, `**/cuda*.rs`, etc.) are modified
- Request at least 8 CPUs on the GPU runner (`/cpu=8`) for faster builds
- The workflow file and CUDA cache action are also included as triggers so changes to CI itself still run the CUDA lint

## Test plan

- [ ] Open a PR that doesn't touch any CUDA files and verify `lint-cuda` is skipped
- [ ] Open a PR that touches a `.cu` or `.cuh` file and verify `lint-cuda` runs

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

- rename `prove_unwrapped` to `prove_root`
- move the looping logic to root prover

`state` is a byte sub-slice of a record and is not guaranteed to be 8-byte aligned, so we cannot reinterpret it in place as `&mut [u64; 8]`. Copy through an aligned buffer (using native-endian bytes).

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
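The copy-through-an-aligned-buffer approach can be sketched like this. It is an illustration of the technique, not the code from the commit; `with_aligned_state` is a hypothetical helper name and the fixed 64-byte length is an assumption derived from `[u64; 8]`.

```rust
/// Runs `f` on the 64-byte `state` viewed as eight native-endian `u64` words.
///
/// `state` may start at any byte offset, so it is never reinterpreted in
/// place: the words are copied into an aligned array, mutated there, and
/// then written back byte by byte.
pub fn with_aligned_state(state: &mut [u8], f: impl FnOnce(&mut [u64; 8])) {
    assert_eq!(state.len(), 64, "state must be exactly 8 * 8 bytes");

    // Copy in: each 8-byte chunk becomes one native-endian word.
    let mut words = [0u64; 8];
    for (word, chunk) in words.iter_mut().zip(state.chunks_exact(8)) {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        *word = u64::from_ne_bytes(buf);
    }

    // Operate on the properly aligned copy.
    f(&mut words);

    // Copy out: write each word back with the same byte order.
    for (chunk, word) in state.chunks_exact_mut(8).zip(words.iter()) {
        chunk.copy_from_slice(&word.to_ne_bytes());
    }
}
```

Using native-endian conversions in both directions keeps the byte layout identical to what an in-place reinterpretation would have seen on an aligned slice, at the cost of two 64-byte copies.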

Fixes INT-5507
Fixes INT-5505
Fixes INT-5506

## Summary

Reject public-values accesses past the configured `num_public_values` limit instead of silently accepting them. Neither execution path rejected this before.

Non-AOT now gets the expected bounds check because `MmapMemory` exposes the configured size through `size()` and slices, rather than the page-rounded mmap length.

AOT public-values load/store instructions now fall back to the normal executor path, so they use the same checked memory access instead of emitting unchecked x86 memory access.

## Tests

Adds regressions for direct public-values memory writes and `reveal` past the configured public-values limit.
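To make the distinction between the configured size and the page-rounded length concrete, here is a toy sketch. `BoundedMemory`, `OutOfBounds`, and `PAGE_SIZE` are hypothetical names invented for this example; it is not the real `MmapMemory` and only mirrors the idea that `size()` and slices report the configured limit.

```rust
/// Hypothetical page size used only for this sketch.
const PAGE_SIZE: usize = 4096;

/// Toy memory region whose backing storage is page-rounded but whose
/// visible size is the configured limit (e.g. `num_public_values`).
pub struct BoundedMemory {
    backing: Vec<u8>, // length is a multiple of PAGE_SIZE
    size: usize,      // configured limit
}

#[derive(Debug, PartialEq, Eq)]
pub struct OutOfBounds {
    pub offset: usize,
    pub len: usize,
    pub size: usize,
}

impl BoundedMemory {
    pub fn new(size: usize) -> Self {
        let pages = (size + PAGE_SIZE - 1) / PAGE_SIZE;
        Self { backing: vec![0; pages * PAGE_SIZE], size }
    }

    /// Reports the configured size, not the page-rounded backing length.
    pub fn size(&self) -> usize {
        self.size
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.backing[..self.size]
    }

    /// Rejects any write that would extend past the configured size,
    /// even though the backing storage could physically hold it.
    pub fn write(&mut self, offset: usize, data: &[u8]) -> Result<(), OutOfBounds> {
        match offset.checked_add(data.len()) {
            Some(end) if end <= self.size => {
                self.backing[offset..end].copy_from_slice(data);
                Ok(())
            }
            _ => Err(OutOfBounds { offset, len: data.len(), size: self.size }),
        }
    }
}
```

With a limit of 32 bytes, a 4-byte write at offset 28 succeeds while the same write at offset 29 is rejected, although both would fit in the 4096-byte backing page.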

Resolves a batch of TODOs from the rc.2 TODO audit, one commit per ticket.

## Summary

- **Stale TODO cleanups**: dropped the boundary-memory-image TODO in `online.rs` (per discussion, the type change touches too much) and the stale error-handling TODO in `branch_eq` AOT execution (the flagged path already returns `Err(AotError::InvalidInstruction)`).
- **`sw_declare!` docs**: replaced the `[TODO]` placeholder with a real `Secp256k1Point` example.
- **API rename**: `get_*_step` → `get_*_executor` for the six algebra/ecc constructor functions, matching the `*Executor` types they return.
- **SDK examples**: re-enabled `sdk_app`, `sdk_stark`, and `sdk_evm` (gated on `evm-verify`); the sources were already ported to the v2 API, only the `[[example]]` wiring was stale.
- **Test util consolidation**: extracted `assert_vm_states_equivalent` (pc + Merkle-root memory equality) into `openvm_circuit::arch::testing` and replaced all six hand-rolled copies (jalr/mul/mulh tests, `check_aot_equivalence`, riscv test vectors, sha2/keccak256 guest-lib tests).
- **Docs**: removed the stale `guest-libs/ruint` entry from `layout.md`, updated README links from deprecated `book.openvm.dev` to `docs.openvm.dev/book`, and fixed a broken rustdoc intra-doc link in `docs/crates/vm.md`.

Resolves INT-8171
Resolves INT-8181
Resolves INT-8190
Resolves INT-8192
Resolves INT-8195
Resolves INT-8196
Resolves INT-8236
Resolves INT-8238
Resolves INT-8239

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>

## Summary

- Allocate the system Poseidon2 GPU record scratch buffer per segment from exact memory tracegen counts instead of `max_trace_height`.
- Drop the scratch buffer before Poseidon2 trace allocation and guard release builds against OOB writes.
- Save GPU memory by decoupling this scratch allocation from `max_segment_length` for memory-bound segments.

## Memory Impact

With `max_segment_length = 2^24`, the standalone reth benchmarks on block `23992138`, `g7e.2xlarge`, `prove-stark` show the sampled whole-process GPU peak dropping by 1.40 GB while keeping the same 85 memory-triggered segments:

| Run | Commit | Segments | Peak GPU memory (`nvidia-smi`) |
| --- | --- | ---: | ---: |
| [Before](https://github.com/axiom-crypto/openvm-eth/actions/runs/27349752147) | `d519aa6` | 85 | 17.30 GB |
| [After](https://github.com/axiom-crypto/openvm-eth/actions/runs/27349786356) | `d3a30b6` | 85 | 15.90 GB |
| Savings | | | -1.40 GB |

The same runs show the expected drop in OpenVM tracked GPU allocation peaks from removing the persistent `max_trace_height`-sized Poseidon2 scratch buffer:

| Module | Before | After | Delta |
| --- | ---: | ---: | ---: |
| `generate mem proving ctxs` | 7.09 GB | 5.20 GB | -1.89 GB |
| `set initial memory` | 6.87 GB | 4.87 GB | -2.00 GB |
| `prover.rs_code_matrix` | 10.69 GB | 8.70 GB | -1.99 GB |
| `prover.batch_constraints.before_round0` | 16.51 GB | 14.51 GB | -2.00 GB |

A separate [openvm-eth comparison run 27329017888](https://github.com/axiom-crypto/openvm-eth/actions/runs/27329017888) with a smaller resolved segment length also shows the affected tracked peaks dropping:

| Module | Before | After | Delta |
| --- | ---: | ---: | ---: |
| `generate mem proving ctxs` | 5.59 GB | 5.20 GB | -0.39 GB |
| `set initial memory` | 5.37 GB | 4.87 GB | -0.50 GB |
| `prover.rs_code_matrix` | 9.19 GB | 8.70 GB | -0.49 GB |
| `prover.batch_constraints.before_round0` | 14.70 GB | 14.25 GB | -0.45 GB |

## Testing

- `cargo build --profile fast -p openvm-circuit`
- `cargo +nightly fmt --all -- --check`
- `cargo build --profile fast -p openvm-circuit --features cuda`
- `cargo nextest run --cargo-profile=fast -p openvm-circuit --features cuda test_empty_touched_memory_uses_full_chunk_values test_touched_memory_updates_memory_address_space test_cuda_merkle_tree_cpu_gpu_root_equivalence`

Resolves int-8291

depends on #2875 resolves int-8290

… bigint examples (#2831)

## Problem

The `#[cfg(not(target_os = "zkvm"))]` fallback branches in three extension examples contain only placeholder comments instead of `unimplemented!()`, causing a compile error on non-zkvm targets.

## Changes

- keccak.mdx: `// Regular Keccak-256 implementation` → `unimplemented!("native keccak256 is only available on zkvm target")`
- sha-256.mdx: `// Regular SHA-256 implementation` → `unimplemented!("native sha256 is only available on zkvm target")`
- big-integer.mdx: `// Regular wrapping add implementation` → `unimplemented!("native bigint ops are only available on zkvm target")`

## Test plan

Documentation-only change; no library code paths affected.

Fixes #2830
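A small sketch of why the placeholder comment fails to compile, assuming the surrounding function returns a value: a body that holds only a comment evaluates to `()`, which does not match a non-unit return type, whereas `unimplemented!()` has type `!` and coerces to any return type. The function name and signature below are hypothetical and the guest-side definition is omitted; only the panic message is taken from the change list above.

```rust
/// Host-side fallback. With only `// Regular Keccak-256 implementation`
/// as the body, this function would evaluate to `()` and fail to
/// type-check against `[u8; 32]`; `unimplemented!` type-checks and
/// panics if the function is ever called off-target.
#[cfg(not(target_os = "zkvm"))]
pub fn keccak256(_input: &[u8]) -> [u8; 32] {
    unimplemented!("native keccak256 is only available on zkvm target")
}
```

The trade-off is that misuse on a host target now surfaces as a runtime panic with a clear message rather than a build failure in the documentation example.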

## Summary

- Derive test `SystemParams` from `segmentation_limits.max_trace_height` instead of the old hard-coded `2^22` cap.
- Keep the max trace-height power-of-two invariant at the `SegmentationLimits` config boundary.
- Verified the SHA2 CUDA guest-lib proving tests pass with the new `2^24` default.

## Testing

- `cargo +nightly fmt --all -- --check`
- `cargo build --profile fast -p openvm-circuit`
- `CUDA_OPT_LEVEL=3 OPENVM_SKIP_DEBUG=1 cargo nextest run --cargo-profile=fast --features=cuda --run-ignored=all --no-tests=pass --test-threads=1` in `guest-libs/sha2` on `ayush-gpu`

Resolves: INT-8261

---------

Co-authored-by: Allan Lin <allanl@intrinsictech.xyz>

## Summary

- Remove configurable segmentation trace-height and interaction-limit knobs.
- Derive segmentation trace-height limits from the engine stacked height and interaction limits from the proving field order.
- Keep only segmentation max memory configurable through `SystemConfig` / CLI.
- Remove stale max segment height plumbing from benchmarks, workflows, and tests.
- Add sccache startup-timeout configuration to reduce CI server startup races.

## Testing

- `cargo +nightly fmt --all -- --check`
- `cargo clippy --profile fast -p openvm-circuit --all-targets --tests -- -D warnings`
- `cargo check --profile fast -p cargo-openvm`
- `cargo check --profile fast --no-default-features -p openvm-benchmarks-prove --bin keccak_par --features metrics,parallel,jemalloc`
- `cargo clippy --profile fast --no-default-features -p openvm-benchmarks-prove --bin keccak_par --features metrics,parallel,jemalloc -- -D warnings`
- `python3 -m json.tool ci/benchmark-config.json >/dev/null && python3 -m json.tool ci/benchmark-config.example.json >/dev/null && bash -n ci/scripts/utils.sh && python3 -m py_compile ci/scripts/bench.py`
- `actionlint -ignore 'label ".*" is unknown' -ignore '"github.head_ref" is potentially untrusted' -ignore 'object, array, and null values should not be evaluated' .github/workflows/*.yml`
- `ruby -e 'require "yaml"; YAML.load_file(".github/actions/sccache/action.yml"); puts "ok"'`

This resolves INT-8372.

---

## Overview

This PR changes the `EqNegBaseRandBus` interface so `EqBaseAir` sends the sampled sumcheck challenge `r_0` directly to `EqNegAir`, instead of sending only `r_0^2`.

`EqBaseAir` already binds `local.r_pow` to `ConstraintSumcheckRandomness { idx: 0, challenge: r_0 }`. With this change, `EqNegAir` receives that exact value over the permutation bus. This removes the previous square-root sign ambiguity where `EqNegAir` could satisfy the bus receive with `-r_0` because `(-r_0)^2 == r_0^2`.

No trace columns are added or removed. The `EqNegBaseRandMessage` width is unchanged: one extension-field element for `u_0` and one extension-field element for the random challenge value.

## Review Map

| File | Change |
| --- | --- |
| `crates/recursion/src/bus.rs` | Renames `EqNegBaseRandMessage::r_squared` to `r` and updates the comment from `r_0^2` to `r_0`. |
| `crates/recursion/src/stacking/eq_base/air.rs` | Sends `local.r_pow` directly on `EqNegBaseRandBus` after receiving it from `ConstraintSumcheckRandomnessBus`. |
| `crates/recursion/src/batch_constraint/eq_airs/eq_neg/air.rs` | Receives `local.r_pow` directly from `EqNegBaseRandBus` instead of receiving a value constrained to `local.r_pow * local.r_pow`. |

## Bus Interaction Change

Before this PR, `EqBaseAir` and `EqNegAir` agreed only on `r_0^2`:

```text
ConstraintSumcheckRandomnessBus -> EqBaseAir
  receives (proof_idx, idx = 0, challenge = r_0)

EqBaseAir -> EqNegBaseRandBus
  sends (proof_idx, u_0, r_0^2)

EqNegAir <- EqNegBaseRandBus
  receives (proof_idx, u_0, local.r_pow^2)
```

After this PR, the bus carries the exact sampled challenge:

```text
ConstraintSumcheckRandomnessBus -> EqBaseAir
  receives (proof_idx, idx = 0, challenge = r_0)

EqBaseAir -> EqNegBaseRandBus
  sends (proof_idx, u_0, r_0)

EqNegAir <- EqNegBaseRandBus
  receives (proof_idx, u_0, local.r_pow)
```

This means a prover can no longer choose `local.r_pow = -r_0` in `EqNegAir` while still matching the base randomness message, except in the degenerate case `r_0 = 0`.

## AIR Changes

### `EqBaseAir`

`EqBaseAir` still receives `r_0` from `ConstraintSumcheckRandomnessBus` on the first row:

```text
(proof_idx, idx = 0, challenge = local.r_pow)
```

The changed part is the message sent to `EqNegAir`:

```text
EqNegBaseRandMessage {
    u: local.u_pow,
    r: local.r_pow,
}
```

The surrounding running-product constraints for `r_pow`, `r_omega_pow`, `eq_0(u, r)`, and negative-dimension lookup inputs are unchanged.

### `EqNegAir`

`EqNegAir` now receives the exact first-row `r_0` value:

```text
EqNegBaseRandMessage {
    u: local.u_pow,
    r: local.r_pow,
}
```

The existing constraints still derive `r_omega_pow = r_pow * omega` on the first row and still use squaring recurrences for later rows and later negative hypercubes. Those internal squaring recurrences remain valid, but the initial sign is now fixed by the bus message from `EqBaseAir`.

### Columns

No AIR columns changed, so there is no new trace shape to review.

## Validation

Ran:

```sh
cargo build --profile fast -p openvm-recursion-circuit
```

Result: passed.
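The sign ambiguity can be checked numerically with a short sketch. It works over a prime base field for simplicity, using the BabyBear modulus as a convenient concrete prime; the real bus messages carry extension-field elements, and `matches_squared` / `matches_exact` are hypothetical helpers that only model the old and new receive conditions.

```rust
/// BabyBear prime, p = 15 * 2^27 + 1.
pub const P: u64 = 2_013_265_921;

/// Additive inverse modulo P.
pub fn neg(x: u64) -> u64 {
    (P - x % P) % P
}

/// Square modulo P (inputs below 2^31, so the product fits in u64).
pub fn square(x: u64) -> u64 {
    let x = x % P;
    (x * x) % P
}

/// Old interface: the bus carries r^2, so the receiver's `local_r_pow`
/// only has to square to the same value.
pub fn matches_squared(sent_r: u64, local_r_pow: u64) -> bool {
    square(sent_r) == square(local_r_pow)
}

/// New interface: the bus carries r itself, so the values must be equal.
pub fn matches_exact(sent_r: u64, local_r_pow: u64) -> bool {
    sent_r % P == local_r_pow % P
}
```

For any nonzero `r0`, `matches_squared(r0, neg(r0))` holds while `matches_exact(r0, neg(r0))` does not; for `r0 = 0` the two values coincide, which is the degenerate case noted above.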

…otal interactions (#2904)

This resolves INT-8426. An absent AIR no longer contributes to calculation of the total number of interactions.

---

## Overview

This PR fixes `ProofShapeAir` so absent AIR rows cannot contribute to the accumulated interaction count used to derive `n_logup`.

`ProofShapeAir` computes:

```text
total_interactions = sum(num_interactions_per_row * lifted_height)
```

For present rows, `lifted_height = max(height, 2^l_skip)`. For absent rows, `height` and `log_height` are constrained to zero, and this PR makes `lifted_height` stay zero regardless of the unconstrained absent-row value of `n_sign_bit`.

The diff is intentionally narrow: one expression in `crates/recursion/src/proof_shape/proof_shape/air.rs` changes from selecting the minimum lifted height based only on `n_sign_bit` to selecting it based on `n_sign_bit && is_present`.

## Review Map

| File | Change |
| --- | --- |
| `crates/recursion/src/proof_shape/proof_shape/air.rs` | Gates the `2^l_skip` lifted-height branch by `local.is_present`, so absent rows use `local.height`, which is already constrained to zero. |

## `ProofShapeAir` Change

Before:

```rust
let lifted_height = select(
    local.n_sign_bit,
    AB::F::from_usize(1 << self.l_skip),
    local.height,
);
```

After:

```rust
let lifted_height = select(
    and(local.n_sign_bit, local.is_present),
    AB::F::from_usize(1 << self.l_skip),
    local.height,
);
```

The existing absent-row constraints remain unchanged:

```text
if !is_present && is_valid:
    height = 0
    log_height = 0
```

This means an absent row with `n_sign_bit = 1` no longer forces `lifted_height = 2^l_skip`; it now uses `height = 0`.

## Existing Column Behavior

No columns are added or removed. The table below illustrates the affected existing columns for valid non-summary rows:

| Case | `is_present` | `height` | `log_height` | `n_sign_bit` | `lifted_height` after this PR | Interaction contribution |
| --- | ---: | ---: | ---: | ---: | ---: | ---: |
| Present, `log_height < l_skip` | 1 | `2^log_height` | `< l_skip` | 1 | `2^l_skip` | `num_interactions_per_row * 2^l_skip` |
| Present, `log_height >= l_skip` | 1 | `2^log_height` | `>= l_skip` | 0 | `2^log_height` | `num_interactions_per_row * 2^log_height` |
| Absent, benign sign bit | 0 | 0 | 0 | 0 | 0 | 0 |
| Absent, malicious sign bit | 0 | 0 | 0 | 1 | 0 | 0 |

The last row is the fixed case. Previously, it produced `lifted_height = 2^l_skip` and could inflate `total_interactions`.

## Interaction Accumulation Flow

There are no bus interface changes in this PR. The affected internal flow is:

```text
ProofShapeAir row
  existing columns: is_present, height, log_height, n_sign_bit
  lifted_height = is_present && n_sign_bit ? 2^l_skip : height
  lifted_height_limbs constrained to decompose lifted_height
  num_interactions_limbs constrained to lifted_height * selected VK num_interactions_per_row
  next.total_interactions_limbs constrained to local.total_interactions_limbs + num_interactions_limbs

summary row
  derives n_logup from total_interactions_limbs
  sends n_logup/n_max to GKR and batch-constraint modules
```

Because absent rows now have `lifted_height = 0`, their `num_interactions_limbs` are forced to zero and the running total matches the reference verifier's present-trace-only sum.

## Validation

Ran:

```sh
cargo build --profile fast -p openvm-recursion-circuit
```

Result: passed.
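The four table cases can be replayed with a plain-integer model of the selection. This is a sketch under simplifying assumptions: it uses `u64` arithmetic instead of field elements, and `select`, `and`, `lifted_height_before`, and `lifted_height_after` are stand-alone helpers written for this example, not the AIR's own functions.

```rust
/// Arithmetic select for a boolean-valued `cond` (0 or 1).
pub fn select(cond: u64, if_true: u64, if_false: u64) -> u64 {
    cond * if_true + (1 - cond) * if_false
}

/// Boolean AND as a product of 0/1 values.
pub fn and(a: u64, b: u64) -> u64 {
    a * b
}

/// Pre-fix behavior: the `2^l_skip` branch depends on `n_sign_bit` alone,
/// so `is_present` is ignored.
pub fn lifted_height_before(n_sign_bit: u64, _is_present: u64, height: u64, l_skip: u32) -> u64 {
    select(n_sign_bit, 1 << l_skip, height)
}

/// Post-fix behavior: the `2^l_skip` branch also requires `is_present`.
pub fn lifted_height_after(n_sign_bit: u64, is_present: u64, height: u64, l_skip: u32) -> u64 {
    select(and(n_sign_bit, is_present), 1 << l_skip, height)
}
```

With `l_skip = 4`, an absent row with `n_sign_bit = 1` and `height = 0` yields 16 under the old expression and 0 under the new one, while present rows give the same result in both.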

- Remove unnecessary swaps that obscure the logic (esp. for sub)
- Clean up comments
- Don't instantiate BabyBearWire until we have checked bits

## Summary

- Add trace-gen observability spans around record arena allocation, initial memory transport, and RV32IM CUDA record H2D enqueue points.
- Emit per-AIR `trace_gen.record_arena_bytes` only for arenas that can report real byte counts.
- Surface `record_arena.alloc_time_ms` in `openvm-prof` aggregate output.

## Testing

- `cargo +nightly fmt --all -- --check`
- `cargo check --profile fast -p openvm-circuit --features metrics`
- `cargo check --profile fast -p openvm-rv32im-circuit`

github-actions · 2026-06-19T19:36:35Z

| group | app.proof_time_ms | app.cycles | leaf.proof_time_ms |
| --- | ---: | ---: | ---: |
| fibonacci | 3,076 | 12,000,265 | (-3813 [-85.0%]) 673 |
| keccak | 16,391 | 18,655,329 | 3,048 |
| sha2_bench | 9,298 | 14,793,960 | 1,140 |
| regex | 1,181 | 4,137,067 | (-11639 [-97.0%]) 358 |
| ecrecover | 606 | 123,583 | (-5572 [-95.2%]) 284 |
| pairing | 948 | 1,745,757 | (-6079 [-95.3%]) 301 |
| kitchen_sink | 4,085 | 2,579,903 | 875 |
| fibonacci_e2e | 1,388 | 12,000,265 | 286 |
| regex_e2e | 643 | 4,137,067 | 165 |
| ecrecover_e2e | 390 | 123,583 | 142 |
| pairing_e2e | 526 | 1,745,757 | 149 |
| kitchen_sink_e2e | 1,992 | 2,579,903 | 383 |

Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights.

Commit: 79ab492

Benchmark Workflow

resolves INT-7126

## Summary

- Relabel terminate-only dummy proofs used during root/Halo2 keygen as keygen groups so prove-evm reports do not get an extra one-instruction `app_proof` group.
- Stop emitting and reporting `record_arena.alloc_time_ms`.
- Hide keygen groups and skip empty groups in generated detailed markdown tables.

## Notes

- Comparing the linked prove-evm/prove-stark runs, the dummy proof pollution is limited to the tiny `app_proof` group. Aggregation counts line up with the real proof pipeline (`leaf` 22 and `internal_for_leaf` 8 in both runs).

## Testing

- `cargo +nightly fmt --all -- --check`
- `cargo check --profile fast -p openvm-sdk --features evm-prove,metrics`
- `cargo check --profile fast -p openvm-prof`

zlangley and others added 30 commits May 18, 2026 11:42

fix: WhirRoundAir missing constraints (#2540)

b6f13af

fix: constrain proof_idx to start from 0 (#2543)

dc20cfa

resolves int-6773

fix: various proof shape + stacking + transcript fixes (#2542)

ed42400

docs: update stale NestedForLoopSubAir docs (#2539)

fb6c505

fix: zero boundary condition in ConstraintsFoldingAir (#2546)

3dbae02

This resolves INT-6777

fix: make Eq3bAir receive n_logup and n_lift (#2547)

115a956

This resolves INT-6774. `Eq3bAir` now maintains `n_logup` as well, constrains both it and `n_lift` to be constant across the AIR, and receives them once per AIR.

chore: Remove the now redundant DagCommitBus (#2551)

27ad873

docs: migrate recursion docs to OpenVM (#2553)

c765f75

fix: constrain is_last in ProofShapeAir (#2557)

25a9a80

feat: verify-stark guest library and SDK integration (#2555)

275397b

fix: Make it impossible for a row to be first and second (#2556)

fc79f34

This resolves INT-6814

fix: assert there is one row when flag in {0, 2} (#2561)

e76003f

fix: deduplicate MerkleTreeSubAir and UserPvsCommitSubAir (#2559)

fddc85a

fix: constrain row_idx_flags to row_idx in UserPvsCommitAir (#2562)

400c0b1

fix: constrain is_valid start value in DeferralCircuitCountAir (#2567)

2346723

fix: constrain initial + final merkle proofs are equal up to address_height (#2564)

6e82759

fix: add num_def_circuit_proofs to DeferralAggregationPvs (#2571)

c91f796

fix: final deferral input commit should be hash slice of commits (#2568)

e250585

fix: verify-stark guest lib findings (#2569)

a7cd790

docs: explain the correspondence between verifier circuit and verify() (#2570)

ef04f05

feat: deferral support for SDK root prover (#2573)

26fcf67

fix: deferral output_commit should be a true hash (#2574)

0861cb4

fix: constrain F values in rv32 memory are canonically decomposed (#2576)

c4c9ba5

fix: range check shifted top byte of deferral heap pointers + output_len (#2579)

5cd0377

chore: differentiate def_hook_cached_commit and def_hook_vk_commit (#2580)

ecbb48e

gdmlcjs and others added 2 commits June 11, 2026 11:01

chore: cleanup safety in keccak guest library (#2873)

3a85a1b

chore: some sdk refactor for distributed use case (#2871)

3784813

- rename `prove_unwrapped` to `prove_root`
- move the looping logic to root prover