refactor(stark): unify & clean up the commitment layer by diegokingston · Pull Request #735 · yetanotherco/lambda_vm

diegokingston · 2026-06-29T16:20:28Z

No description provided.

Unifies the two near-identical bit-reversed leaf hashers (per-row + per-row-pair) into one keccak_leaves_bit_reversed_grouped(columns, rows_per_leaf), and the two near-identical commit fns (commit_columns_bit_reversed + commit_composition_polynomial) into commit_bit_reversed(columns, rows_per_leaf), in a new crypto/stark/src/commitment.rs. Removes them from the IsStarkProver trait (they used no self). prover.rs re-exports the named leaf hashers for the math-cuda GPU parity test. Byte-identical (stark 128/128).

Every FRI query opens a value and its symmetric counterpart (LDE positions 2*iota, 2*iota+1). The composition-poly commitment already grouped that pair into one leaf; the trace (main/aux/precomputed) committed one row per leaf and opened TWO leaves (proof + proof_sym) per query. Commit the trace with rows_per_leaf=2 too, so each query opens ONE leaf/path; drop the now-redundant proof_sym from PolynomialOpenings (also shrinks the composition opening, which stored the path twice). verify_opening_pair now reconstructs the paired leaf and verifies once (mirrors verify_composition_poly_opening); the dead verify_opening is removed. Halves trace Merkle authentication-path data per query (smaller proofs + less verifier hashing). stark 128/128 (prove+verify). NOTE: proof FORMAT change (not byte-identical). cuda follow-up: the GPU trace leaf+tree builders (gpu_lde::try_expand_leaf_and_tree_batched_keep/_ext3_keep + math-cuda kernels) still build 1-row leaves and must switch to the row-pair pattern (the GPU composition builder already pairs) or cuda proofs will fail verification.

…ropagation, doc/log fixes

The trace commitment now uses the row-pair leaf layout (ROWS_PER_LEAF=2), same as composition; rows_per_leaf=1 is only kept for the GPU parity test. Update the module/const/wrapper docs that still described the pre-pairing per-row trace.

Extract two helpers on IsStarkProver, collapsing the near-duplicate main-trace and aux-trace commit code: - spill_tree<C>: the identical-except-label disk-spill block, shared by the main / preprocessed-split / aux commit sites (4 call sites). - commit_plain<C>: commit_bit_reversed + spill_tree + TableCommit::plain, shared by the main-trace (non-preprocessed None arm) and aux-trace plain-commit paths. Proof output is byte-identical (stark 128/128, +disk-spill 133/133). The only behavioral delta is in the instruments profiling feature: the aux commit's timing bucket now includes its (tiny) disk spill, matching what the main path already measured. clippy clean on default / disk-spill / instruments; builds on the combined feature set.

The trace Merkle commitment moved to a row-pair leaf layout (ROWS_PER_LEAF=2), but the 5 preprocessed-table commitment computers (bitwise/keccak_rc/page/decode/register) still built a 1-row-per-leaf tree manually, so the prover's row-pair precomputed root no longer matched the computed/hardcoded one -> PrecomputedCommitmentMismatch at prove time. - Route all 5 compute_*commitment fns through the shared stark::commitment::commit_bit_reversed(.., ROWS_PER_LEAF), dropping the manual bit-reverse + columns2rows + 1-row BatchedMerkleTree::build. - Regenerate the hardcoded bitwise/keccak_rc/zero_page static commitments for the row-pair layout (via compute_static_commitments). - cargo fmt (prover.rs + touched tables). Verified: static_commitments drift tests 5/5, stark 128/128, clippy + fmt clean, no PrecomputedCommitmentMismatch. Full ELF prove/verify runs in CI (guest artifacts absent locally).

…layout The compile-time decode-commitment const for sub.elf shifted with the row-pair preprocessed commitment; regenerated via commitment_from_elf (the print_decode_commitment_for_sub regen path).

… from CI

…air value Regenerated from CI's sub.elf (local riscv toolchain unavailable); removed the temporary print instrumentation.

… perf/commitment-row-pair # Conflicts: # crypto/stark/src/prover.rs

…after merge)

… GPU column-LDE dead_code The R2 parts-LDE / comp-poly-tree GPU dispatches no longer fire since #699/#700 route degree-3 tables through the 2-part fused coset_lde_full path (no AIR has number_of_parts > 2). try_expand_columns_batched* are debug-checks-only after the #650 row-major LDE became production.

diegokingston · 2026-06-29T19:31:33Z

/bench

github-actions · 2026-06-29T19:35:09Z

Benchmark — ethrex 20 transfers (median of 3)

_{Table parallelism: auto (cores / 3)}

Metric	main	PR	Δ
Peak heap	82174 MB	72040 MB	-10134 MB (-12.3%) 🟢
Prove time	42.488s	39.775s	-2.713s (-6.4%) 🟢

🎉 Improvement detected — heap or time decreased by more than 5%.

✅ Low variance (time: 2.0%, heap: 1.6%)

_{Commit: 1a166e0 · Baseline: cached · Runner: self-hosted bench}

MauroToscano · 2026-06-29T19:43:04Z

/bench-gpu

github-actions · 2026-06-29T19:43:13Z

GPU Benchmark (ABBA) — `1a166e0343` vs `main` (6 pairs)

_{RTX 5090 · Vast.ai datacenter @ $0.8575/hr · prover/cuda · drift-free A/B/B/A}

❌ Run failed. Last log lines:

| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
==> Refs
   A (PR)       1a166e0343d5dd9a9826c89d9640107585d24bea  -> 1a166e0343
   B (baseline) origin/main  -> 7974e458f2
   pairs=6  (=12 prove runs)
==> Building ethrex guest ELF (missing)
Sysroot already exists at /opt/lambda-vm-sysroot
mkdir -p executor/program_artifacts/rust
cd ./executor/programs/rust/ethrex && \
	CARGO_TARGET_DIR=/workspace/lambda_vm/executor/shared_target \
	CFLAGS_riscv64im_lambda_vm_elf="--target=riscv64 -march=rv64im -mabi=lp64 --sysroot=/opt/lambda-vm-sysroot" \
	rustup run nightly-2026-02-01 cargo build --release \
		--target /workspace/lambda_vm/executor/programs/riscv64im-lambda-vm-elf.json \
		-Z build-std=core,alloc,std,compiler_builtins,panic_abort \
		-Z build-std-features=compiler-builtins-mem \
		-Z json-target-spec
cp ./executor/shared_target/riscv64im-lambda-vm-elf/release/ethrex executor/program_artifacts/rust/ethrex.elf
==> Generating ethrex 20-transfer fixture (missing)
wrote executor/tests/ethrex_bench_20.bin (32766 bytes): block #1 with 20/20 transfer(s) [N senders -> N recipients]
==> Building both prover binaries in isolated worktree /tmp/abba_wt
==> Building cli @ 7974e458f2 -> cli_B  (features: jemalloc-stats,prover/cuda)
    cudarc pinned to cuda-12080
==> Building cli @ 1a166e0343 -> cli_A  (features: jemalloc-stats,prover/cuda)
    cudarc pinned to cuda-12080
==> Running 6 interleaved pairs  (improvement: + = PR faster)

MauroToscano · 2026-06-29T19:47:01Z

/bench-gpu 5

Follow-up to the row-pair commitment PR. Excludes the intentional proof-format break (proof_sym removal / leaf-count change), which is by design. Test coverage (in their own files under crypto/stark/src/tests/, per the crate's test structure): - tests/commitment_tests.rs: direct unit tests pinning the row-pair leaf layout (R=1 and R=2) against an independent reference, wrapper agreement, commit-root consistency, empty-input short-circuit, and a release-mode divisibility-assert test. Previously the leaf layout was only covered transitively via full prove->verify, and GPU parity tests compared against an inline reimpl rather than this module. - tests/row_pair_opening_tests.rs: two negative tests for the row-pair verify_opening_pair — a tampered symmetric trace evaluation and a corrupted Merkle authentication path must both be rejected. Removing proof_sym deleted the old "symmetric opening mismatch" rejection class; these restore it (an impl ignoring evaluations_sym / the auth path would otherwise pass every existing test). Reuses the now pub(crate) make_valid_simple_proof helper. - cuda_path_integration.rs: restore assert!(gpu_comp_poly_tree_calls() > 0). try_build_comp_poly_tree_gpu is dispatched unconditionally (round 2, after the parts-count branch), so it fires for the common number_of_parts == 2 (degree-3) case — it was NOT obsolete. Keep the genuinely-dead parts-LDE assertion dropped. Cleanups: - Delete dead fn commit_plain (zero callers; main/aux commit inline commit_rows_bit_reversed + spill_tree, which are row-major and incompatible with its column-major signature — the dedup never landed). - Delete orphaned pub fn columns2rows (all callers removed by #735). - Promote the commitment.rs precondition checks from debug_assert! to hard assert! — they run once per commit (negligible) and the release failure mode on this pub API is a silently wrong Merkle root. - Add ProvingError::Fft and map FFTError to it instead of WrongParameter (internal FFT failure is not a caller-supplied-parameter error). - Fix stale profiler label commit_composition_poly -> commit_bit_reversed. - Drop the rot-prone "// = 2" comment on the local ROWS_PER_LEAF alias.

Follow-up to the row-pair commitment PR. Excludes the intentional proof-format break (proof_sym removal / leaf-count change), which is by design. Test coverage (in their own files under crypto/stark/src/tests/, per the crate's test structure): - tests/commitment_tests.rs: direct unit tests pinning the row-pair leaf layout (R=1 and R=2) against an independent reference, wrapper agreement, commit-root consistency, and empty-input short-circuit. Previously the leaf layout was only covered transitively via full prove->verify, and GPU parity tests compared against an inline reimpl rather than this module. - tests/row_pair_opening_tests.rs: two negative tests for the row-pair verify_opening_pair — a tampered symmetric trace evaluation and a corrupted Merkle authentication path must both be rejected. Removing proof_sym deleted the old "symmetric opening mismatch" rejection class; these restore it (an impl ignoring evaluations_sym / the auth path would otherwise pass every existing test). Reuses the now pub(crate) make_valid_simple_proof helper. - cuda_path_integration.rs: restore assert!(gpu_comp_poly_tree_calls() > 0). try_build_comp_poly_tree_gpu is dispatched unconditionally (round 2, after the parts-count branch), so it fires for the common number_of_parts == 2 (degree-3) case — it was NOT obsolete. Keep the genuinely-dead parts-LDE assertion dropped. Cleanups: - Delete dead fn commit_plain (zero callers; main/aux commit inline commit_rows_bit_reversed + spill_tree, which are row-major and incompatible with its column-major signature — the dedup never landed). - Delete orphaned pub fn columns2rows (all callers removed by #735). - Add ProvingError::Fft and map FFTError to it instead of WrongParameter (internal FFT failure is not a caller-supplied-parameter error). - Fix stale profiler label commit_composition_poly -> commit_bit_reversed. - Drop the rot-prone "// = 2" comment on the local ROWS_PER_LEAF alias.

…#740) Follow-up to the row-pair commitment PR. Excludes the intentional proof-format break (proof_sym removal / leaf-count change), which is by design. Test coverage (in their own files under crypto/stark/src/tests/, per the crate's test structure): - tests/commitment_tests.rs: direct unit tests pinning the row-pair leaf layout (R=1 and R=2) against an independent reference, wrapper agreement, commit-root consistency, and empty-input short-circuit. Previously the leaf layout was only covered transitively via full prove->verify, and GPU parity tests compared against an inline reimpl rather than this module. - tests/row_pair_opening_tests.rs: two negative tests for the row-pair verify_opening_pair — a tampered symmetric trace evaluation and a corrupted Merkle authentication path must both be rejected. Removing proof_sym deleted the old "symmetric opening mismatch" rejection class; these restore it (an impl ignoring evaluations_sym / the auth path would otherwise pass every existing test). Reuses the now pub(crate) make_valid_simple_proof helper. - cuda_path_integration.rs: restore assert!(gpu_comp_poly_tree_calls() > 0). try_build_comp_poly_tree_gpu is dispatched unconditionally (round 2, after the parts-count branch), so it fires for the common number_of_parts == 2 (degree-3) case — it was NOT obsolete. Keep the genuinely-dead parts-LDE assertion dropped. Cleanups: - Delete dead fn commit_plain (zero callers; main/aux commit inline commit_rows_bit_reversed + spill_tree, which are row-major and incompatible with its column-major signature — the dedup never landed). - Delete orphaned pub fn columns2rows (all callers removed by #735). - Add ProvingError::Fft and map FFTError to it instead of WrongParameter (internal FFT failure is not a caller-supplied-parameter error). - Fix stale profiler label commit_composition_poly -> commit_bit_reversed. - Drop the rot-prone "// = 2" comment on the local ROWS_PER_LEAF alias.

#744) Two items that landed too late for #740: - Move the shared make_valid_simple_proof helper out of small_trace_tests.rs into tests/trace_test_helpers.rs, where the crate keeps shared test helpers (matching how prover_tests sources get_trace_evaluations). row_pair_opening_tests.rs and small_trace_tests.rs now both import it from there instead of one test file reaching sideways into another. - Delete AGENTS.md (added by #735).

Resolves the conflict between #715 ("perf/row-major trace LDE for GPU path") and this PR's row-pair commitment. The two reworked the same GPU trace-LDE/commit functions for orthogonal goals: #715 made the LDE consume a row-major input buffer (memory-layout win, kept a per-row commitment); this PR groups two bit-reversed rows per Merkle leaf (one path per query, proof_sym removed). Resolution keeps BOTH: #715's row-major LDE input, now emitting a row-pair tree. CPU side (verifier, proof format, commitment.rs) merged cleanly to the row-pair form — it is the fixed target the GPU must match. GPU side: - New CUDA kernel `keccak256_leaves_base_row_major_row_pair` (+ launcher `launch_keccak_base_row_major_row_pair`, device registration): the row-major analog of `keccak256_leaves_base_row_pair_batched`. Each leaf hashes two consecutive bit-reversed rows of `m` u64 lanes (base m cols; ext3 m*3, since components c0,c1,c2 are consecutive). Byte layout matches the CPU `commit_bit_reversed(.., 2)` and `verify_opening_pair`. - `coset_lde_row_major_inner` now builds `lde_size/2` row-pair leaves via the new launcher (was per-row). This is the only producer for the GPU base + ext3 trace commit (the two `_keep` wrappers). - gpu_lde.rs trace fast paths take #715's row-major dispatch unchanged (row-pair-ness is internal to the math-cuda function above). Deleted as now-unused (superseded): - The per-row row-major path: kernel `keccak256_leaves_base_row_major`, its launcher, and device field (replaced by the row-pair version). - `alloc_merkle_nodes` and the column-based `_keep` wrappers in math-cuda lde.rs (this PR's earlier GPU path; #715's row-major path replaces them). - Dead `columns_to_row_major` in prover.rs (round 1 reads the trace row-major directly via `main_data_row_major`). Fixed main-side tests for this PR's API changes: - prover_tests: `commit_rows_bit_reversed_matches_commit_bit_reversed` now compares the row-major commit against `commit_bit_reversed(.., 2)` (commit_columns_bit_reversed was removed). Validates row-major == column-major row-pair roots on CPU. - merkle_root_parity: pass `rows_per_leaf = 1` to keccak_leaves_base/ext3 (generic per-row primitive parity test). Verified locally: cargo build/check (default + `--features cuda`, incl. tests), `cargo test -p stark` (137 pass), `make lint` (all combos). The new CUDA kernel's byte-level correctness is NOT runtime-tested here (no GPU); the `cuda_path_integration` + math-cuda GPU tests must be run on a GPU host to confirm GPU↔CPU↔verifier root parity.

MauroToscano · 2026-06-29T22:25:25Z

/bench

MauroToscano · 2026-06-29T22:25:29Z

/bench-gpu

MauroToscano · 2026-06-29T22:25:34Z

/ai-review

MauroToscano · 2026-06-29T22:26:13Z

/bench-gpu 6

github-actions · 2026-06-29T22:28:12Z

Codex Code Review

No findings.

I reviewed the PR diff statically, focusing on the row-pair commitment change, verifier opening semantics, CUDA leaf/tree sizing, and changed preprocessed commitment paths. I did not run builds or tests per the review constraints.

MauroToscano · 2026-06-29T22:33:59Z

/benchmark

github-actions · 2026-06-29T22:36:45Z

AI Review

PR #735 · 30 changed files

Warning: the diff was truncated before review.

Findings

Status	Sev	Location	Finding	Found by
candidate	high	crypto/stark/src/proof/stark.rs:19	Breaking change: PolynomialOpenings.proof_sym field removed (wire-format breaking)	moonmath zro/minimax-m3
candidate	high	crypto/stark/src/prover.rs	Breaking change: IsStarkProver::commit_columns_bit_reversed removed	moonmath zro/minimax-m3
candidate	high	prover/src/tables/bitwise.rs	Static preprocessed commitments updated — verifier must recompile against new bytes	moonmath zro/minimax-m3
candidate	medium	crypto/stark/src/prover.rs:732	GPU row-pair path skipped for preprocessed tables — small proof failures possible	moonmath zro/minimax-m3
candidate	low	crypto/stark/src/commitment.rs:115	Dead code: keccak_leaves_bit_reversed (rows_per_leaf=1) wrapper kept but only used by tests	moonmath zro/minimax-m3
candidate	low	crypto/stark/src/prover.rs:776	commit_main_trace Vec::with_capacity + extend_from_slice relies on coset_lde_full_expand_row_major resize to zero-pad — works but is fragile	moonmath zro/minimax-m3
candidate	low	crypto/stark/src/prover.rs:2347	Log level downgraded from info! to log::debug! in prove_rounds_2_to_4	moonmath zro/minimax-m3
candidate	low	crypto/stark/src/trace.rs	Newly-added helper alloc_merkle_nodes / columns2rows removals leave no dead references, but verify no external use remains	moonmath zro/minimax-m3
candidate	low	crypto/stark/src/verifier.rs:438	Step_3 FRI in verifier uses BatchedMerkleTreeBackend (Vector backend) but FRI commit uses FriLayerMerkleTreeBackend (Pair backend)	moonmath zro/minimax-m3

Status column reflects the verdict from the verifier: deepseek-verifier (openrouter/deepseek/deepseek-v4-pro).

AI-001: Breaking change: PolynomialOpenings.proof_sym field removed (wire-format breaking)

Status: candidate
Severity: high
Location: crypto/stark/src/proof/stark.rs:19
Found by: moonmath:zro/minimax-m3
Verified by: -
Rejected by: -

Claim

The PolynomialOpenings struct loses its proof_sym field, making existing serialized proofs incompatible with the new verifier. Any persisted proof, downstream consumer, or external verifier that deserializes proofs from before this PR will fail to parse.

Evidence

The struct changed from { proof, proof_sym, evaluations, evaluations_sym } to { proof, evaluations, evaluations_sym } in crypto/stark/src/proof/stark.rs lines 19-23. The new verify_opening_pair reconstructs a single leaf as evaluations ++ evaluations_sym (verifier.rs lines 320-324) and the prover opens at the un-doubled index iota (prover.rs lines 1670-1676).

Suggested fix

Document the wire-format break clearly in release notes / migration guide. Consider a serde-compat shim or versioned envelope if backward-compat is required; otherwise bump a major version / add #[serde(rename = ...)] aliases for proof_sym.

AI-002: Breaking change: IsStarkProver::commit_columns_bit_reversed removed

Status: candidate
Severity: high
Location: crypto/stark/src/prover.rs
Found by: moonmath:zro/minimax-m3
Verified by: -
Rejected by: -

Claim

IsStarkProver::commit_columns_bit_reversed was a public default trait method; removing it breaks any downstream implementer or caller without notice. The replacement is crate::commitment::commit_bit_reversed but no migration note is provided.

Evidence

The diff deletes the default trait method body (prover.rs diff lines ~4404-4420). compute_precomputed_commitment_for_testing now calls crate::commitment::commit_bit_reversed(.., ROWS_PER_LEAF) (prover.rs line 619-621) but the change is undocumented for external consumers.

Suggested fix

Either keep a deprecated wrapper method that calls commit_bit_reversed (with #[deprecated]), or add an explicit migration note in the module docs.

AI-003: Static preprocessed commitments updated — verifier must recompile against new bytes

Status: candidate
Severity: high
Location: prover/src/tables/bitwise.rs
Found by: moonmath:zro/minimax-m3
Verified by: -
Rejected by: -

Claim

Hard-coded static_commitment byte arrays in bitwise, keccak_rc, and page were replaced with new values that reflect the row-pair leaf layout. Any verifier binary compiled against the prior bytes will mismatch these constants and reject valid proofs (or vice-versa if recomputation falls back).

Evidence

The diff in prover/src/tables/bitwise.rs, prover/src/tables/keccak_rc.rs, prover/src/tables/page.rs replaces 32-byte commitment literals across blowup_factor = 2/4/8 arms. There is no explicit note that downstream distributions must rebuild these or how to regenerate. The compute_preprocessed_commitment path uses commit_bit_reversed(.., ROWS_PER_LEAF) matching the new layout, so the new bytes are internally consistent — but external verifiers pinned to the old bytes are now wrong.

Suggested fix

Add a CHANGELOG entry explaining that hardcoded preprocessed commitments changed because the leaf layout changed, with instructions to rerun cargo run --bin compute_static_commitments --release. The drift tests should already catch this but the changelog entry ensures downstream distros are aware.

AI-004: GPU row-pair path skipped for preprocessed tables — small proof failures possible

Status: candidate
Severity: medium
Location: crypto/stark/src/prover.rs:732
Found by: moonmath:zro/minimax-m3
Verified by: -
Rejected by: -

Claim

The fused GPU try_expand_leaf_and_tree_row_major_keep path is only attempted when precomputed.is_none(). For preprocessed tables (which need a split into precomputed + multiplicity Merkle trees from the same row-major buffer), the prover silently falls back to CPU even when GPU would be cheaper, because the GPU helper builds a single tree for all columns.

Evidence

prover.rs lines 731-732: #[cfg(feature = "cuda")] if precomputed.is_none() { ... try_expand_leaf_and_tree_row_major_keep(...) ... }. The else branch (CPU path) handles the split via commit_rows_bit_reversed_subset twice (prover.rs lines 811-826). This is a correctness-correct but performance-relevant gate that the PR adds without explanation.

Suggested fix

Add a comment explaining the gating (GPU helper has no subset variant yet) so future readers don't try to enable it without adding the missing split path. Optionally extend the GPU helper to take a column subset.

AI-005: Dead code: keccak_leaves_bit_reversed (rows_per_leaf=1) wrapper kept but only used by tests

Status: candidate
Severity: low
Location: crypto/stark/src/commitment.rs:115
Found by: moonmath:zro/minimax-m3
Verified by: -
Rejected by: -

Claim

The keccak_leaves_bit_reversed public wrapper (rows_per_leaf=1) is kept solely for GPU parity tests; the prover no longer commits per-row for any path. This adds a tiny amount of dead production API surface.

Evidence

commitment.rs lines 109-121 keep the rows_per_leaf=1 wrapper with a comment stating it's retained only for GPU parity tests. Both commit_main_trace and the round-2 composition commit now use ROWS_PER_LEAF=2 exclusively (prover.rs lines 803, 1213-1216).

Suggested fix

Mark keccak_leaves_bit_reversed as #[cfg(any(test, feature = "test-utils"))] (or with #[doc(hidden)]) so it doesn't appear in the production public API.

AI-006: commit_main_trace Vec::with_capacity + extend_from_slice relies on coset_lde_full_expand_row_major resize to zero-pad — works but is fragile

Status: candidate
Severity: low
Location: crypto/stark/src/prover.rs:776
Found by: moonmath:zro/minimax-m3
Verified by: -
Rejected by: -

Claim

prover.rs allocates Vec::with_capacity(lde_size * total_cols) and extend_from_slice(trace_data) of length n * total_cols, leaving (lde_size - n) * total_cols slots uninitialized, then depends on coset_lde_full_expand_row_major calling buffer.resize(lde_n * num_cols, zero) to fill them. This is correct today but a future change to the LDE routine (e.g., skipping the resize) would silently read uninit memory.

Evidence

prover.rs lines 776-777: let mut main_data: Vec<FieldElement<Field>> = Vec::with_capacity(lde_size * total_cols); main_data.extend_from_slice(trace_data); — note the len is trace_data.len() = n * total_cols, not lde_size * total_cols. polynomial.rs line 590 in coset_lde_full_expand_row_major does the resize.

Suggested fix

Either explicitly call main_data.resize(lde_size * total_cols, FieldElement::zero()) after extend_from_slice (and drop the resize in coset_lde_full_expand_row_major), or add a debug_assert that buffer.len() == lde_n * num_cols inside the LDE routine.

AI-007: Log level downgraded from info! to log::debug! in prove_rounds_2_to_4

Status: candidate
Severity: low
Location: crypto/stark/src/prover.rs:2347
Found by: moonmath:zro/minimax-m3
Verified by: -
Rejected by: -

Claim

The "Started proof generation..." / "End proof generation" messages in prove_rounds_2_to_4 were changed from info! to log::debug!. For users relying on info-level logs to track prove progress, this is a silent behavior change.

Evidence

prover.rs lines 2347 and 2461 now use log::debug! instead of the prior info!. The outer multi_prove keeps info!("Started proof generation...") (line 1775) but no longer has a corresponding info-level "End" message — only the inner debug-level one.

Suggested fix

Restore info! at the end of multi_prove to keep the high-level start/end markers symmetric, since they are useful operational signals. Keep the per-table inner messages at debug.

AI-008: Newly-added helper alloc_merkle_nodes / columns2rows removals leave no dead references, but verify no external use remains

Status: candidate
Severity: low
Location: crypto/stark/src/trace.rs
Found by: moonmath:zro/minimax-m3
Verified by: -
Rejected by: -

Claim

columns2rows was removed from crypto/stark/src/trace.rs and alloc_merkle_nodes from crypto/stark/src/gpu_lde.rs. The diff updates all internal callers, but downstream crates could still import these helpers.

Evidence

Removed per the diff (trace.rs delete block, gpu_lde.rs delete block). All in-tree callers I could grep are gone. No external crate is in scope to check, but this is a public-API removal worth flagging.

Suggested fix

Same migration note as the other breaking changes. Optionally re-export a shim that emits a deprecation warning for one release.

AI-009: Step_3 FRI in verifier uses BatchedMerkleTreeBackend (Vector backend) but FRI commit uses FriLayerMerkleTreeBackend (Pair backend)

Status: candidate
Severity: low
Location: crypto/stark/src/verifier.rs:438
Found by: moonmath:zro/minimax-m3
Verified by: -
Rejected by: -

Claim

The FRI verifier uses BatchedMerkleTreeBackend (= BatchKeccak256Backend) which is the vector backend. The FRI prover uses FriLayerMerkleTreeBackend (= PairKeccak256Backend). They produce the same hash for a 2-element Vec/[T;2] but the type discrepancy is a code-smell that could mask future maintenance bugs (e.g., if the backends diverge).

Evidence

verifier.rs line 438: auth_path_sym.verify::<BatchedMerkleTreeBackend<FieldExtension>>(merkle_root, iota >> 1, &evaluations); — evaluations is a Vec of 2 elements. fri/mod.rs line 84: let merkle_tree = FriLayerMerkleTree::build(&leaves); where FriLayerMerkleTree = MerkleTree<PairKeccak256Backend<F>> and leaves are [FieldElement<F>; 2]. Both hash_data impls call hasher.update(elem.as_bytes()) per element so the digests match, but this is implicit.

Suggested fix

Use FriLayerMerkleTreeBackend (or a type alias for the FRI proof backend) consistently in the FRI verifier. The current choice works but is brittle.

Reviewer Lanes

Lane	Model	Prompt	Status	Findings
glm	openrouter/z-ai/glm-5.2	general	error: opencode failed (provider/auth/runtime error) and no findings were submitted	0
kimi	openrouter/moonshotai/kimi-k2.7-code	general	error: opencode failed (provider/auth/runtime error) and no findings were submitted	0
minimax	minimax/MiniMax-M3	general	success	0
moonmath	zro/minimax-m3	general	success	9
nemotron	openrouter/nvidia/nemotron-3-ultra-550b-a55b	general	error: opencode failed (provider/auth/runtime error) and no findings were submitted	0

Verification Lanes

Lane	Model	Status	Confirmed	Rejected	Uncertain
deepseek-verifier	openrouter/deepseek/deepseek-v4-pro	error: opencode failed (provider/auth/runtime error) and no verifications were submitted	0	0	0

Native Codex and Claude reviews run separately and post their own comments. They are not included in this structured provenance report.

Raw lane outputs, candidates, final issues, and model metrics are uploaded as workflow artifacts.

diegokingston and others added 19 commits June 19, 2026 16:28

refactor(stark): prover cleanup — par helpers, ROWS_PER_LEAF, error p…

f26a13a

…ropagation, doc/log fixes

test(prover): regenerate SUB_DECODE_COMMITMENT_BLOWUP_2 for row-pair …

c6b9832

…layout The compile-time decode-commitment const for sub.elf shifted with the row-pair preprocessed commitment; regenerated via commitment_from_elf (the print_decode_commitment_for_sub regen path).

test(prover): TEMP print actual decode commitment to regenerate const…

acd06e6

… from CI

test(prover): set SUB_DECODE_COMMITMENT_BLOWUP_2 to CI-computed row-p…

c1ba23f

…air value Regenerated from CI's sub.elf (local riscv toolchain unavailable); removed the temporary print instrumentation.

mplement changes in GPU

a1afb67

refactor

7446a9b

Merge origin/main into refactor/commitment-module

fb46406

Fix CUDA LDE clippy lint

ade38c1

fix(cuda): align review cleanup with row-pair commits (#723)

316dda9

Merge remote-tracking branch 'origin/refactor/commitment-module' into…

38b5858

… perf/commitment-row-pair # Conflicts: # crypto/stark/src/prover.rs

fix(stark): silence dead_code on par_for_each_mut (debug-checks-only …

8ddadf4

…after merge)

style(stark): cargo fmt after merge resolution

9655852

test(cuda): focused GPU row-pair commitment prove+verify test

04e1fae

diegokingston changed the title ~~Perf/commitment row pair~~ refactor(stark): unify & clean up the commitment layer Jun 29, 2026

Merge branch 'main' into perf/commitment-row-pair

1d8fa2e

Merge branch 'main' into perf/commitment-row-pair

0847a19

MauroToscano mentioned this pull request Jun 29, 2026

review(stark): address PR #735 review — coverage, dead code, cleanups #740

Merged

MauroToscano mentioned this pull request Jun 29, 2026

review(stark): tidy test layout + remove AGENTS.md (follow-up to #740) #744

Merged

MauroToscano and others added 2 commits June 29, 2026 18:34

MauroToscano approved these changes Jun 29, 2026

View reviewed changes

MauroToscano added this pull request to the merge queue Jun 29, 2026

Merged via the queue into main with commit 7f6b85e Jun 29, 2026
13 checks passed

MauroToscano deleted the perf/commitment-row-pair branch June 29, 2026 22:50

Uh oh!

Conversation

diegokingston commented Jun 29, 2026

Uh oh!

diegokingston commented Jun 29, 2026

Uh oh!

github-actions Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark — ethrex 20 transfers (median of 3)

Uh oh!

MauroToscano commented Jun 29, 2026

Uh oh!

github-actions Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

GPU Benchmark (ABBA) — 1a166e0343 vs main (6 pairs)

Uh oh!

MauroToscano commented Jun 29, 2026

Uh oh!

MauroToscano commented Jun 29, 2026

Uh oh!

MauroToscano commented Jun 29, 2026

Uh oh!

MauroToscano commented Jun 29, 2026

Uh oh!

MauroToscano commented Jun 29, 2026

Uh oh!

github-actions Bot commented Jun 29, 2026

Codex Code Review

Uh oh!

MauroToscano commented Jun 29, 2026

Uh oh!

github-actions Bot commented Jun 29, 2026

AI Review

Findings

Reviewer Lanes

Verification Lanes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

github-actions Bot commented Jun 29, 2026 •

edited

Loading

github-actions Bot commented Jun 29, 2026 •

edited

Loading

GPU Benchmark (ABBA) — `1a166e0343` vs `main` (6 pairs)