
refactor: remove unused u32 functions#2916

Draft
mansur20478 wants to merge 131 commits into
develop-v2.1.0-rv64from
refactor/rvr-remove-unused-code

Conversation

@mansur20478


Removes the u32 ranged reads/writes and related functions, which are no longer relevant. RV64 RVR writes ranges at 8-byte granularity, not 4-byte.

shuklaayush and others added 30 commits June 19, 2026 12:24
makes the basic framework for using rvr extensions in openvm

- adds an rvr feature flag
- defines `RvrExtensionCtx` struct to provide mappings between
opcode/executor/air indices
- defines `VmRvrExtension` trait that extensions can implement to be
registered
- updated macro so that rvr `ExtensionRegistry` can be auto-generated in
`SdkVmConfig`

closes INT-7474, INT-7475, INT-7479
#2730)

Moves the rvr files related to compiling and execution into
openvm-circuit. Those rvr files previously depended on openvm-circuit
and in order to enable rvr execution through the openvm pipeline, they
had to be made a part of openvm-circuit to prevent circular
dependencies.

closes INT-7537
- Vm execution instance is made to use rvr execution, depending on the
feature. Helper functions to convert between the existing `VmState` and
the rvr state are also added.
- The `VmConfig` macro now has a `create_rvr_extensions` method
implementation, but instead of defining a new `VmRvrConfig` trait, the
`create_rvr_extensions` method piggybacks on the existing
`VmExecutionConfig`. This is to avoid complex feature-gated trait
bounds.

closes INT-6810, INT-7476
Enables running benchmarks through the rvr extension. Benchmark tests do not check execution correctness, and execution involving extensions other than RV32IM currently fails because the `VmRvrExtension` trait implementation is not properly wired.

closes INT-7480
Previously rvr execution had to use `executor_idx_to_air_idx` information in order to construct `ExtensionRegistry`. This was a problem for pure execution which didn't need air indices so the interface diverged between rvr and aot/interpreted. Now for rvr pure execution, dummy index values of `NO_CHIP` are used instead to keep the interface consistent.

towards INT-7611
Removes the rvr tests and instead adds rvr comparison steps in existing openvm tests in a similar way to aot. Unlike aot, metered cost execution is also run and compared for rvr and interpreted modes.

closes INT-7627
- Introduces a new `Rv32IoExtension` in rvr that handles the rv32io
instructions (hint_storew, hint_buffer, reveal). This is mainly to have
a struct managing the hint_store chip index.
- Adds rvr tests to the CI file in the same way as aot.

closes INT-7466
Implements the rvr feature for the keccak256 extension and also adds rvr tests to CI. Extensions no longer take a `staticlib_path` argument manually and instead use the auto-built staticlib produced by a build.rs file.

closes INT-7468
Implements the rvr feature for the Algebra extension. The rvr side of the Algebra extension is now also split into `ModularRvrExtension` and `Fp2RvrExtension`. A notable change is to have the C code for the Algebra extension which uses `libsecp256k1` to also unconditionally contain the C code needed in the ECC extension, since they are closely related and doing so would avoid configuration dependencies.

closes INT-7470, INT-7704
Implements the rvr feature for all remaining extensions: BigInt, Sha2, ECC, Pairing, Deferral. Code for tests and CI is also updated. Changes for the Deferral extension include additions to the VM state used in rvr execution.

closes INT-7465
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Changed rvr execution to use the existing openvm `VmState` instead of
defining a new state struct and copying data between the two forms. The
references and pointers to the fields in `VmState` are passed to the rvr
execution functions so they can be used in C code.
- The Deferral extension now uses a callback registration system to
expose the Deferral related data to C code instead of piggybacking on
the same mechanism of `OpenVmHostCallbacks`. This is enabled for each
extension so that they can have per-extension data and state that is
maintained separately.

closes INT-7572
- Makes rvr metered execution use the existing `SegmentationCtx` of openvm instead of its own new structs and code. This fixes the discrepancy between rvr and interpreted/aot segmentation logic and resolves the issue of rvr making too many segments. (https://github.com/axiom-crypto/openvm-eth/actions/runs/26112543464)
- Fixes the calculation of `num_insns` in segments that are used as segment boundaries. The instruction counts were recorded as multiples of `segment_check_insns` (1000) that didn't map to the actual basic block boundaries. Addresses the problem of overflowing GPU memory. (https://github.com/axiom-crypto/openvm-eth/actions/runs/26127167098)

closes INT-7835
moves the rvr compilation stage for metered and metered cost out of execution and into instance construction

closes INT-7626
Air indices are now represented as an enum in rvr code. The `AirIndex`
enum has `Uninitialized` and `NoChip` variants that replace the previous
`NO_CHIP = u32::MAX`. `AirIndex::Uninitialized` is only used in pure
execution where air indices don't matter and causes a panic in rvr
metered and metered cost execution.
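A minimal sketch of what replacing a `u32::MAX` sentinel with an enum can look like. Only the `Uninitialized` and `NoChip` variant names come from the description above; the `Chip` variant and the `for_metered` method are hypothetical names chosen for illustration.

```rust
/// Sketch of an air index that replaces a `u32::MAX` sentinel.
/// `Chip` and `for_metered` are assumptions, not the actual rvr API.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AirIndex {
    /// Placeholder for pure execution, where air indices do not matter.
    Uninitialized,
    /// The instruction is not backed by any chip.
    NoChip,
    /// A concrete air index (hypothetical variant).
    Chip(u32),
}

impl AirIndex {
    /// Resolve the index for metered or metered cost execution.
    /// Panics on `Uninitialized`, mirroring the behavior described above.
    pub fn for_metered(self) -> Option<u32> {
        match self {
            AirIndex::Uninitialized => panic!("air index used before initialization"),
            AirIndex::NoChip => None,
            AirIndex::Chip(idx) => Some(idx),
        }
    }
}
```

Compared with a sentinel value, the enum makes the "never initialized" and "no chip" cases impossible to confuse with a real index at the type level.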

closes INT-7611
…#2807)

- **RVR metered execution can now suspend at segment boundaries.**
Previously only the interpreter and AOT backends supported
segment-by-segment metered runs; RVR ran metered execution straight to
termination. This branch adds a parallel `RvrMeteredSegmentInstance`
(`RvrMeteredInstanceWith<F, SegmentBoundary>`) whose
`execute_metered_until_segment_boundary` returns after the metered
segmentation callback creates a segment, mirroring the suspend/resume
shape the other two backends already expose. The tracer countdown is
carried across calls by checkpointing `tracer.check_counter` into
`segmentation_ctx.instrets_until_check` on suspend and restoring it on
entry; both values are `try_from`-validated against u32 at the entry
point (new `ExecuteError::InvalidMeteredContext`) and the hot
C-callback's matching cast is guarded by `debug_assert_eq!`. Mid-segment
suspension is out of scope: `initialize_segment_memory` resets the
per-segment page-indices checkpoint buffer assuming the page buffers
have already been flushed at a segment boundary.

- **Generated-C surface reorganized by policy.** Block-begin and
suspender helpers move into
`c/block/{instret,metered,metered_segment}.h` and
`c/suspender/{none,instret_limit,segment_boundary}.h`; tracer headers
move under `c/tracer/`. A new `SuspendPolicy` enum drives which pair is
included, with `compile_impl` rejecting incoherent combinations
(`Metered` × `InstretLimit`, `Pure|MeteredCost` × `SegmentBoundary`) at
compile time. Compile-time selection without preprocessor directives in
the generated C, per the AGENTS.md guidance.

- **`MeteredCtx` round-trip via `MeteredCtxParts`.** `SegmentationState`
now carries the full `MemoryCtx` and `suspend_on_segment` flag, so a
suspended metered run can be converted back to a `MeteredCtx`
(`into_metered_ctx`) and resumed without losing page-tracking or
segmentation state. A new test exercises the field-by-field round-trip.

- **All RVR codegen inputs embedded at compile time.** Removes every
`CARGO_MANIFEST_DIR` runtime dependency from the RVR project-emit
pipeline so binaries (Docker images, etc.) no longer need the source
tree to invoke `compile_impl`. Core C files (`openvm_io.{c,h}`,
`rvr_ext_wrappers.c`) switch from `fs::copy` to
`fs::write(include_str!(…))`. Extension `.a` staticlibs migrate from
`staticlib_path() -> &Path` to `staticlib_file() -> (&'static str,
&'static [u8])` via `include_bytes!(env!("RVR_*_FFI_STATICLIB"))`, with
a new `write_extension_staticlibs` helper writing them to the temp
project for `make` to link. Modular's libsecp256k1 amalgamation include
(~85 `.c`/`.h` files, with test/bench/ctime/valgrind files filtered out)
is collected by `extensions/algebra/rvr/build.rs` into a generated
`SECP256K1_C_FILES` const and returned via the new
`RvrExtension::extra_c_include_files()` hook (for files written but not
compiled as their own TUs); `extra_cflags` switches to relative
`-Isecp256k1/src` / `-Isecp256k1` against the temp project root. Trait
return types are tightened from `&str` to `&'static str`.

- **Up-front toolchain detection.** `compile_impl` probes the C
compiler, linker, and `make` in PATH before building and reports all
missing tools at once via `RuntimeToolchainError`. Adds `RVR_MAKE`
override, forwards `HOST_OS` to the Makefile (replacing its `uname -s`
shell-out), and threads path context into `CompileError` I/O variants.

- **Metrics consolidation.** The four near-identical `Instant::now() …
counter!().absolute() … gauge!().set()` blocks across interpreter / AOT
/ RVR are replaced by a single `ExecutionMetricTimer` helper in
`arch::execution_metrics` (guarded against div-by-zero on
sub-microsecond runs). A complementary `CompilationTimer`
(`arch::compilation_metrics`) wraps every `*_instance` constructor and
emits a `compile_{pure,metered,metered_cost,metered_segment}_ms` gauge
labeled by backend (`interpreter` / `aot` / `rvr`).

- **E1/E2/E3 jargon dropped.** `execute_e1` span/metric names become
`execute_pure`; `terminate_execute_e12_*` → `terminate_execute_*`; const
generic `E1` → `PURE_EXECUTION`. Comment references to "(E1)/(E2)/(E3)"
are removed in favor of "pure/metered/preflight".

- **Metric names.** `execute_e1_insns` → `execute_pure_insns`,
`execute_e1_insn_mi/s` → `execute_pure_insn_mi/s`. Dashboards or
alerting keyed on the old names need to be updated.
- **`RvrExtension` trait surface.** `extra_c_source_paths() ->
Vec<PathBuf>` → `extra_c_sources() -> Vec<(&'static str, &'static
str)>`; `staticlib_path/paths()` → `staticlib_file/files()` returning
embedded bytes; new optional `extra_c_include_files()` for files written
but not compiled as TUs. Existing impls need a one-time conversion to
`include_str!` / `include_bytes!`.
- **`ExecutorInventory` generic param renames** (`E1`/`E2`/`E3` →
`CombinedE`/`NewE`/`TargetE`) are visible in error messages but
compatible.
- **`CompileError` shape.** `CProject(io::Error)` → `CProject { path,
source }`; `Toolchain(String)` → `Toolchain(#[from]
RuntimeToolchainError)`; new `ToolchainCommand { command, source }`.
Callers matching on these variants need to update.
- **Binary size.** Embedded `.a` staticlibs and libsecp256k1 sources
grow the binary by roughly 5–30 MB depending on enabled extensions.

resolves int-7917

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Splits the `GenericSdk::execute_*` methods into `GenericSdk::compile_*`
and `GenericSdk::execute_compiled_*` methods for pure, metered and
metered cost. This is to be able to reuse the "compiled" instance which,
for rvr, takes a long time to create.

closes INT-7842
Three correctness fixes in the RVR backend so that its per-segment VM
state is byte-identical to the Rust preflight executor's, plus a
test-coverage fix that closes the gap that hid all three bugs from
`check_rvr_equivalence`.


- `check_rvr_equivalence` (and its AOT sibling) only walked AS=1
(registers) byte-by-byte, silently missing any divergence in RV32 main
memory (AS=2), public values (AS=3), and deferral (AS=4). All three
correctness fixes below live in AS=2 or AS=4 and would have surfaced on
the first `air_test`-style run if the check had walked every address
space.

Extracted the closure into a `check_vm_state_eq(lhs, rhs) ->
eyre::Result<()>` free function shared by both the RVR and AOT
equivalence checks, replacing the AS=1-only loop with a slice-level diff
over every `LinearMemory`. Short-circuits at the first mismatch and
reports `(AS, byte offset, lhs, rhs)`. Microseconds on typical test VM
configs.
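The diff described above can be sketched as follows. This is an illustration under simplifying assumptions: each address space is modeled as a plain `Vec<u8>` rather than a `LinearMemory`, the error is a `String` rather than an `eyre` report, and `check_state_eq` is a hypothetical name. The reported `AS` value here is simply the position in the slice.

```rust
/// Compare every address space and stop at the first differing byte.
/// Address spaces are modeled as byte vectors (an assumption for this sketch).
pub fn check_state_eq(lhs: &[Vec<u8>], rhs: &[Vec<u8>]) -> Result<(), String> {
    if lhs.len() != rhs.len() {
        return Err(format!(
            "address space count differs: {} vs {}",
            lhs.len(),
            rhs.len()
        ));
    }
    for (addr_space, (a, b)) in lhs.iter().zip(rhs).enumerate() {
        // Whole-slice equality is the fast path for the common, matching case.
        if a == b {
            continue;
        }
        if a.len() != b.len() {
            return Err(format!(
                "AS={addr_space}: length differs: {} vs {}",
                a.len(),
                b.len()
            ));
        }
        // Locate the first differing byte only once a mismatch is known.
        let offset = a
            .iter()
            .zip(b)
            .position(|(x, y)| x != y)
            .expect("slices of equal length that differ have a differing byte");
        return Err(format!(
            "AS={addr_space} offset={offset}: lhs={:#04x} rhs={:#04x}",
            a[offset], b[offset]
        ));
    }
    Ok(())
}
```

Because no address space is skipped, a divergence in main memory or the deferral space is reported just like one in the registers.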

- `SegmentationState::on_periodic_check` was bumping
`segmentation_ctx.instret` by a full `segment_check_insns` interval
up-front, then incrementing `tracer.check_counter` by the same delta on
the way out. The anchor and the countdown ended up ahead of the actual
VM by exactly `remaining_counter`, so the next interval inherited an
inconsistent baseline.

In termination paths this could let the segmenter seal a non-terminal
block as the final segment. The callback now:
- computes the actual block-boundary instret directly: `prev_anchor +
(segment_check_insns - remaining_counter)`,
- writes that back as the new anchor,
- resets `check_counter` to a full fresh interval rather than
incrementing.

This matches the Rust metered executor's behavior at the same point.
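The corrected bookkeeping can be sketched in a few lines. The `CheckState` struct and its field names are hypothetical simplifications of the real segmentation and tracer state; only the formula `prev_anchor + (segment_check_insns - remaining_counter)` and the counter reset are taken from the description above.

```rust
/// Hypothetical, simplified stand-in for the state touched by the periodic check.
pub struct CheckState {
    /// Instret at the last block boundary where a check ran (the "anchor").
    pub anchor_instret: u64,
    /// Instructions remaining before the next check is due.
    pub check_counter: u32,
}

/// Advance the anchor to the actual block-boundary instret and start a fresh interval.
/// Returns the new anchor.
pub fn on_periodic_check(state: &mut CheckState, segment_check_insns: u32) -> u64 {
    let remaining_counter = state.check_counter;
    // Assumption: the countdown never exceeds one full interval.
    debug_assert!(remaining_counter <= segment_check_insns);
    // Instructions actually executed since the previous anchor.
    let executed = u64::from(segment_check_insns - remaining_counter);
    state.anchor_instret += executed;
    // Reset to a full interval instead of incrementing the old counter.
    state.check_counter = segment_check_insns;
    state.anchor_instret
}
```

For example, with an anchor of 5000, an interval of 1000, and 37 instructions left on the countdown, the new anchor is 5963 rather than 6000.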

Mod-builder evaluates `SymbolicExpr` inputs **modulo the configured
prime**. For `SETUP_ADDSUB` / `SETUP_MULDIV` and their Fp2 counterparts,
the compute formula resolves to `Input(0)`, which during setup is the
modulus `p` itself — so the variable is `p % p = 0`. The VM writes 32
zero bytes (64 for Fp2) to `rd`.

`rvr_ext_mod_setup` and `rvr_ext_fp2_setup` were copying `rs1`'s bytes
(the modulus) to `rd`. Those bytes then leaked into the guest's stack as
register-loaded values, propagating downstream as a memory divergence
between RVR and preflight at later segment boundaries.

The FFI now traces the `rs1`/`rs2` reads (still required for chip
metering) but writes zero bytes to `rd`.

The deferral CALL FFI in RVR only traced AS=4 access for metering and
never updated the `(input_acc, output_acc)` accumulator bytes. The Rust
preflight executor (`DeferralCallExecutor::execute_e12_impl`) hashes
each `(old_acc, commit)` pair via poseidon2 and writes the new
accumulator F's to DEFERRAL_AS. Every deferral CALL therefore left RVR's
AS=4 a hash-round behind preflight, producing a memory divergence that
cascaded through subsequent CALLs.

Plumbed a `(*mut F, len_in_F_units)` alias of DEFERRAL_AS through
`OpenVmIoState` (via a new `deferral_memory_ptr` helper in `bridge.rs`
with a debug-mode alignment check on the `u8 → F` cast) and registered a
`DeferralCompressFn` poseidon2 closure on the host side.
`host_deferral_call_lookup` now hashes the accumulators and writes the
new F bytes into AS=4 in F-element units that exactly match preflight's
`vm_write::<F, BLOCK_SIZE>` layout. `F::from_u32` is bijective with the
perm output for `MontyField31`, so the stored bytes are byte-identical
to what the preflight executor writes.

resolves int-7974
Memory read and write functions now have an optional `check_bounds`
invocation before accessing the memory. `check_bounds` checks that the
access lies within the VM's addressable memory region and aborts
otherwise. The same is applied for `openvm_io.h` functions that work
with the user IO address space in data memory.

To turn off protected mode, add the `openvm-cli/unprotected` feature.

Mirrors the interpreter's `check_bounds` and `panic_oob` functions, and
`unprotected` Cargo feature.
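A rough sketch of an optional bounds check in front of a memory read. The function names and signatures are hypothetical, and protected mode is modeled as a runtime `bool` for brevity, whereas the change described above selects it at build time through a Cargo feature.

```rust
/// True if `[addr, addr + len)` lies inside a memory of `mem_size` bytes.
pub fn in_bounds(addr: u64, len: u64, mem_size: u64) -> bool {
    // `checked_add` guards against wrap-around for addresses near `u64::MAX`.
    matches!(addr.checked_add(len), Some(end) if end <= mem_size)
}

/// Read a little-endian u64, validating the access first when `protected` is set.
pub fn read_u64(mem: &[u8], addr: u64, protected: bool) -> u64 {
    if protected && !in_bounds(addr, 8, mem.len() as u64) {
        // The real code aborts; a panic stands in for that here.
        panic!("out-of-bounds memory access at {addr:#x}");
    }
    let start = addr as usize;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&mem[start..start + 8]);
    u64::from_le_bytes(buf)
}
```

Rust slices are bounds-checked anyway, so the explicit check in this sketch mainly shows where the validation sits relative to the access; in generated C the check is the only thing standing between a bad guest address and host memory.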

closes INT-7702

---------

Co-authored-by: Ayush Shukla <ayush@axiom.xyz>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…2820)

A perf pass on RVR metered execution. The main change isolates the rare
segment-check callback into a cold per-block helper so the hot block
stays frameless.

## Cold per-block segment-check helper

Hot metered RVR blocks were paying a stack-frame cost because the rare
`on_check` callback could fire from the same C function. Generated asm
showed every metered block — even single-instruction ones — getting an
entry prologue and stack spills to preserve guest-register parameters
across the possible callback.

Hot blocks now only test `check_counter < block_insn_count` inline. On
underflow they musttail-jump to a cold per-block
`block_0xPC_checkpoint(...)` helper that runs the segment-check
callback, suspends/exits if needed, or musttails back to the hot block
with a refreshed counter. Same semantics; the hot path is frameless
again.
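The hot/cold split can be illustrated in Rust, with the caveat that the real change is in generated C and relies on `musttail` jumps, which this sketch does not model. The function names, the closure-based callback, the use of `None` to signal "suspend or exit", and the decrement of the counter by the block size are all assumptions made for illustration.

```rust
/// Cold path: run the rare segment check and hand back a refreshed counter.
/// Returning `None` stands for "suspend or exit" (an assumption of this sketch).
#[cold]
#[inline(never)]
fn block_checkpoint(on_check: &mut dyn FnMut() -> Option<u32>) -> Option<u32> {
    on_check()
}

/// Hot path: one comparison, then the block's instructions are accounted for.
/// Returns the counter after the block, or `None` if execution should stop.
pub fn enter_block(
    check_counter: u32,
    block_insn_count: u32,
    on_check: &mut dyn FnMut() -> Option<u32>,
) -> Option<u32> {
    let counter = if check_counter < block_insn_count {
        // Rare: the countdown cannot cover this block, so take the cold path.
        block_checkpoint(on_check)?
    } else {
        check_counter
    };
    // Assumes a refreshed counter always covers at least one whole block.
    Some(counter - block_insn_count)
}
```

Keeping the callback out of the hot function is what lets the compiler avoid spilling live values around a call that almost never happens.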

## Cleanup landing in the same patch

- **`uses_page_tracking()` IR predicate.** Only blocks that can touch
memory emit AS_MEMORY page-tracking locals. ~59% of blocks in the reth
benchmark didn't need them. Extension emitters default to `true`;
`HintNonQrInstr` opts out, plain host-only phantoms (`HintInput`,
`HintRandom`, `PrintStr`) don't trigger page locals.
- **Metered block ABI hoist.** `check_counter` (`_cc`) and
`trace_heights` (`_th`) are passed as block parameters, removing
`state->tracer` loads from every block. Metered mode uses 8 hot guest
registers instead of 10 to fit the new parameters.
- **`CompileOptions::keep_artifacts`.** Retains the generated RVR C
tempdir on success and logs the path. Useful for codegen / asm audits.
- Per-width fast traced memory helpers and clang-format / formatting
cleanups across the FFI C/Rust crates.

results in a modest ~100ms (out of 1.7s) improvement in metered
execution of [reth
benchmark](https://github.com/axiom-crypto/openvm-eth/actions/runs/26514539244)

on my laptop, the improvement is much more significant (~20%)
In `rvr`, some constants are redefined or set as variables. Resolved
some of the dependency issues (e.g. circular dependency) to import the
constants instead.

Related constants:
1) `WORD_SIZE`: imported from `openvm_platform::WORD_SIZE`

2) `AS_MEMORY`: imported from
`openvm_instructions::riscv::RV32_MEMORY_AS`

3) `AS_REGISTER`: imported from
`openvm_instructions::riscv::RV32_REGISTER_AS`

4) `AS_PUBLIC_VALUES`: imported from
`openvm_instructions::PUBLIC_VALUES_AS` (moved to `openvm_instructions`
from `openvm-circuit`; is this the right choice?)

5) `DEFERRAL_AS`: imported from `openvm_instructions::DEFERRAL_AS`

6) `MAX_BLOCK_INSNS` (`rvr-openvm-lift/src/cfg.rs`): was `let`, now
`const`

The following ones are kept redefined:
1) `CHUNK`, `DEFERRAL_DIGEST_SIZE`: logically are from
`openvm-stark-sdk`. `openvm-circuit` and `openvm-recursion-circuit`
already redefine CHUNK. (can not import from them due to circular
dependency).

2) `DEFAULT_PAGE_BITS`, `DEFAULT_SEGMENT_CHECK_INSNS`: logically are
from `openvm-circuit::arch::execution_mode::metered::{ctx,
segment_ctx}`. These are host-side metered-execution defaults. (can not
import from them due to circular dependency)

towards INT-7571
`MAX_MEM_PAGES_PER_INSN` is the worst-case number of pages a single
instruction can touch. The worst-case instruction (`HINT_BUFFER`) touches
`MAX_HINT_BUFFER_WORDS * WORD_SIZE` bytes, and one page covers
`CHUNK * 2^PAGE_BITS` bytes.

So the formula is:
`MAX_MEM_PAGES_PER_INSN = div_ceil(MAX_HINT_BUFFER_WORDS * WORD_SIZE,
CHUNK * 2^PAGE_BITS) + 1`

The `+1` accounts for misalignment: an access that does not start on a
page boundary can spill into one additional page.
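The formula can be written as a small `const fn`. The function name and the parameter values used below are hypothetical; the actual constants live in the openvm crates and may differ.

```rust
/// Worst-case number of pages one instruction can touch, following
/// div_ceil(max_bytes, page_bytes) + 1 from the description above.
pub const fn max_mem_pages_per_insn(
    max_hint_buffer_words: usize,
    word_size: usize,
    chunk: usize,
    page_bits: u32,
) -> usize {
    // Largest number of bytes a single HINT_BUFFER can write.
    let max_bytes = max_hint_buffer_words * word_size;
    // One page covers CHUNK * 2^PAGE_BITS bytes.
    let page_bytes = chunk << page_bits;
    // Ceiling division, plus one page for an unaligned start address.
    (max_bytes + page_bytes - 1) / page_bytes + 1
}
```

With illustrative values of 1024 words, a 4-byte word, `CHUNK = 8`, and `PAGE_BITS = 6`, the buffer spans 4096 bytes over 512-byte pages, giving 8 pages plus 1 for misalignment, so 9.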

closes INT-7462
Adds the ability to save and load compiled artifacts in `rvr` mode. The
feature has two parts: saving compilation artifacts to disk and loading
them into the SDK for execution (part 1), and reusing persisted artifacts
instead of recompiling whenever metadata allows (part 2), which will be
done in a separate PR.

This PR covers part 1. The following methods were added:
1) `Sdk::load_compiled_pure`, `Sdk::load_compiled_metered`,
`Sdk::load_compiled_metered_cost` and related methods for loading pure,
metered and metered cost `.so` files

2) `RvrPureInstance::save`, `RvrMeteredInstance::save`,
`RvrMeteredCostInstance::save` and related methods for saving `.so` file

towards INT-7843
shuklaayush and others added 26 commits June 19, 2026 12:24
The verify-stark deferral path test merged on rc2 constructs an
SdkVmConfig::riscv32, which does not exist on the rv64 branch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rebuild with the rv64 openvm toolchain so the prebuilt elf picks up the
dense user public values change merged on rc2 (#2844).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
RVR extension for RV64 RISCV. The current implementation was tested on
`openvm-riscv-integration-tests` tests.

The RVR extension required the following changes on top of `rvr-rv32`:
1) Adding new `W` ALU and Mult/Div instructions, new `lwu` instruction.
2) Updating lifter to lift new instructions.
3) Updating instruction simulation done in block generation phase to
64bit.
4) Updating `C` code emission to emit 64bit instructions (`MULH`
instruction uses `__int128` to avoid overflow of `64bit` multiplication
to get upper 64bit of the result). Adding 64bit and 32bit sign extended
reads in rvr.
5) Updating `RvState` registers to 64bit.
6) Updating `bridge.rs` - the buffer access between `VmState` and
`rvr:ffi`, to 64bit.
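The 64-bit semantics in points 1 and 4 can be sketched in Rust, where `i128`/`u128` play the role that `__int128` plays in the emitted C. The function names below are illustrative and are not the names used by the emitter.

```rust
/// MULH: upper 64 bits of the signed 64x64-bit product, via a 128-bit intermediate.
pub fn mulh(a: i64, b: i64) -> i64 {
    ((a as i128 * b as i128) >> 64) as i64
}

/// MULHU: upper 64 bits of the unsigned 64x64-bit product.
pub fn mulhu(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) >> 64) as u64
}

/// LW: a loaded 32-bit value is sign-extended to 64 bits.
pub fn lw(word: u32) -> u64 {
    word as i32 as i64 as u64
}

/// LWU: a loaded 32-bit value is zero-extended to 64 bits.
pub fn lwu(word: u32) -> u64 {
    u64::from(word)
}

/// ADDW: add the low 32 bits, wrap, then sign-extend the 32-bit result.
pub fn addw(a: u64, b: u64) -> u64 {
    (a as u32).wrapping_add(b as u32) as i32 as i64 as u64
}
```

The widening multiply avoids the overflow a plain 64-bit multiplication would hit, and the `W` instructions differ from their 64-bit counterparts only in the final truncate-and-sign-extend step.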

towards INT-8107
removes unused `reservation_addr`, `reservation_valid`, `brk`,
`start_brk`, `csrs` fields from RvState.
rvr support for the rv64 bigint extension

`rd/wr_mem_words_traced` in `rvr-openvm-ffi-common` was updated to
operate at OpenVM's native word size (8 bytes, matching
`MEMORY_BLOCK_BYTES`). Previously it used 4-byte (u32) granularity,
which was logically incorrect for rv64. The bigint extension uses the
corrected API.
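A sketch of a traced range read at 8-byte granularity. The function name, signature, and callback shape are hypothetical and do not reproduce `rd_mem_words_traced`; only the 8-byte block size comes from the description above.

```rust
/// Block size used for traced accesses (8 bytes, per the description above).
pub const MEMORY_BLOCK_BYTES: usize = 8;

/// Read `num_blocks` consecutive 8-byte blocks starting at `addr`,
/// calling `trace` once per block with that block's address.
pub fn rd_mem_blocks_traced(
    mem: &[u8],
    addr: usize,
    num_blocks: usize,
    mut trace: impl FnMut(usize),
) -> Vec<u64> {
    // Assumption: accesses are aligned to the block size.
    assert!(addr % MEMORY_BLOCK_BYTES == 0, "unaligned block access");
    (0..num_blocks)
        .map(|i| {
            let start = addr + i * MEMORY_BLOCK_BYTES;
            trace(start);
            let mut buf = [0u8; MEMORY_BLOCK_BYTES];
            buf.copy_from_slice(&mem[start..start + MEMORY_BLOCK_BYTES]);
            u64::from_le_bytes(buf)
        })
        .collect()
}
```

At 4-byte granularity the same 16-byte range would produce four trace events instead of two, which is the mismatch the updated API removes.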

This change broke compilation for other extensions (sha2, ecc, algebra)
that depended on the old u32 signature. As a temporary fix, those
extensions bypass `rd/wr_mem_words_traced` and call the underlying u32
range wrappers directly. Each call site is marked with a TODO to migrate
properly when rvr support is added for that extension.

The bigint MULT operation follows the interpreter and uses a u32/u64
combination (8 × u32 limbs, accumulating into u64). RISC-V's native
MULW/MUL uses u64/u128. Should we align to the u64/u128 approach, or
keep it consistent with the interpreter?

towards INT-8108
…le tree (#2840)

Motivation: The GPU Merkle tree needs a memory optimization once we go
to a 2^32 address space size. Currently, for N bytes of memory the
MemoryMerkleTree has N/16 leaves (N/8 nodes in total), and each leaf is
stored as 8 field elements. This means we need 4N bytes of VRAM to store
the memory Merkle tree on the GPU. With N = 2^32, 4N is roughly 17 GB,
which will OOM the GPU (our current limit is ~15 G).

Optimization idea of this PR: Don't store the last OMITTED_BOTTOM_LEVELS
levels of the MemoryMerkleSubTree in GPU memory; compute them only when
needed, for address spaces with large sizes (AS2 and deferral AS). This
is done by recomputing the poseidon2 hashes from the raw memory. This
reduces the GPU memory required for the Merkle tree itself by a factor
of 8.
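The arithmetic behind these figures can be checked with a short sketch. It assumes every tree node is stored as 8 field elements of 4 bytes each, which is what the 4N figure implies, and that omitting `k` bottom levels of a binary tree shrinks the stored node count by a factor of `2^k`. The value 3 for the omitted levels is inferred from the stated factor of 8 and is not confirmed by the description.

```rust
/// Bytes of memory covered by one leaf (N/16 leaves for N bytes).
pub const BYTES_PER_LEAF: u64 = 16;
/// Field elements stored per node.
pub const DIGEST_ELEMS: u64 = 8;
/// Assumed storage size of one field element.
pub const FIELD_ELEM_BYTES: u64 = 4;

/// VRAM needed to store the full tree for `memory_bytes` of memory.
pub fn merkle_tree_bytes(memory_bytes: u64) -> u64 {
    let leaves = memory_bytes / BYTES_PER_LEAF;
    // A binary tree with L leaves has about 2L nodes, i.e. N/8.
    let nodes = 2 * leaves;
    nodes * DIGEST_ELEMS * FIELD_ELEM_BYTES
}

/// VRAM needed when the bottom `omitted_levels` levels are not stored.
pub fn merkle_tree_bytes_omitting(memory_bytes: u64, omitted_levels: u32) -> u64 {
    // Each omitted level roughly halves the number of stored nodes.
    merkle_tree_bytes(memory_bytes) >> omitted_levels
}
```

For N = 2^32 this gives 2^34 bytes (about 17.2 GB) for the full tree and 2^31 bytes (about 2.1 GB) with three levels omitted.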

Reth benchmark results:
https://github.com/axiom-crypto/openvm-eth/actions/runs/26959247863

In summary, proving time increased by 0.77s, but generate mem proving
ctxs went down from 3.84 G to 2.09 G and set initial memory went down
from 3.63 G to 1.88 G. Note that memory did not go down by a factor of 8
because other things also use it, including the boundarychipgpu and the
initial_memory buffer (see the MemoryInventoryGPU struct for details).

Closes INT-8079
this PR resolves issues revealed during reth benchmarking. One of the
issues was forgetting to update `unprotected` mode related code, such as
`openvm_check_mem_bounds_unprotected.h`
Re-do of PR #2777 (base_alu part only), now on top of the u16 memory-bus
limbs change. Summary of the changes:
- Split base_alu chip into add_sub and xor_or_and chips.
- New xor_or_and chip is the old base_alu minus ADD/SUB.
- New add_sub chip handles the add and sub opcodes and stores 2 bytes per
field element in its columns.
- This allows us to remove the interactions, present in the previous
base_alu chip, that range checked that each individual field element is
a byte.
- Core width of the add_sub chip drops to 14 columns compared to the 29
columns of the base_alu chip.
- Rewrite tests.rs of add_sub chip for the new u16 columns layout.

Improves perf by 6% on the reth benchmark:
https://github.com/axiom-crypto/openvm-eth/actions/runs/27436476879

Closes INT-8102

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Co-authored-by: Ayush Shukla <ayush@axiom.xyz>
## Summary

Clarifies the SDK execution API by separating compile, execute, and
compile-and-execute entry points for pure and metered flows. This
renames the previous compiled-execution helpers to the shorter
`execute*` forms and updates SDK examples, tests, and CLI call sites
accordingly.

## Testing

- `cargo check --profile fast -p openvm-sdk`
- `cargo check --profile fast -p openvm-sdk --tests`
- `cargo check --profile fast -p openvm-sdk --examples`
- `cargo check --profile fast -p cargo-openvm`
- `cargo check --profile fast -p openvm-prof`
rvr support for deferral extension.

The PR required changing how the rvr deferral extension reads and writes
memory, from 4-byte to 8-byte granularity. This also meant that the
formula for `output length`, the u64 LE stored right after the commit,
changed to accommodate 8-byte granularity.

Also, `DEFERRAL_COMMIT_NUM_BYTES` had to be updated to
`DEFERRAL_DIGEST_SIZE * F_NUM_BYTE` from `DEFERRAL_DIGEST_SIZE *
WORD_SIZE`. This is because `DEFERRAL_DIGEST_SIZE` is digest size in
field elements, not in `WORDS`.

resolves INT-8113
@mansur20478 mansur20478 self-assigned this Jun 19, 2026
@github-actions

| group | app.proof_time_ms | app.cycles | leaf.proof_time_ms |
| --- | --- | --- | --- |
| fibonacci | 1,052 | 4,000,051 | 395 |
| keccak | 16,218 | 14,365,133 | 3,008 |
| sha2_bench | 8,187 | 11,167,961 | 995 |
| regex | 1,204 | 4,090,656 | 359 |
| ecrecover | 433 | 112,210 | 274 |
| pairing | 605 | 592,827 | 298 |
| kitchen_sink | 3,894 | 1,979,971 | 865 |

Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights.

Commit: 612457a

Benchmark Workflow

@shuklaayush shuklaayush force-pushed the develop-v2.1.0-rv64 branch 2 times, most recently from 5e8a1fc to 46277f7 Compare June 19, 2026 22:17

6 participants