spike(stark): symbolic-field capture of transition constraints → flat IR (CPU)#737
Draft
MauroToscano wants to merge 5 commits into
Draft
spike(stark): symbolic-field capture of transition constraints → flat IR (CPU)#737MauroToscano wants to merge 5 commits into
MauroToscano wants to merge 5 commits into
Conversation
… flat IR CPU-only proof-of-concept toward evaluating constraints on the GPU (to keep the prove pipeline data-resident). Adds crypto/stark/src/symbolic/: SymField/SymExt recording fields (impl IsField + IsSubFieldOf via a thread-local hash-consing arena), a typed Dim1/Dim3 op-DAG IR, a capture front-end, and a CPU interpreter. Reuses constraint bodies unchanged: running evaluate::<SymField,SymExt> records the IR. SymField needs only IsField + IsSubFieldOf (capture never builds an AIR<Field=SymField>); inv/div/ByteConversion are unreachable stubs and eq returns false. Diff test (prover) captures IsBit/Add/ProductZero and asserts the IR interpreter matches the real evaluate bit-for-bit over 1000 random rows. Out of scope (follow-ons): GPU kernel, LogUp constraints, wiring into evaluate_transitions. Design notes in thoughts/gpu-constraint-eval/.
…checkbox continuation plan
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
A CPU-only, minimal spike for capturing algebraic transition constraints into a flat,
single-field Goldilocks IR and interpreting it — the first step toward evaluating
constraints on the GPU.
The motivation is not to speed up constraint evaluation (it isn't the CPU bottleneck).
It's data residency: once LDE/Merkle/FRI run on the GPU, leaving constraint evaluation
on the CPU forces a round-trip of the (large) LDE trace across the bus, which dominates. To
keep the pipeline on-device we need the constraint logic in a GPU-executable form — which
means capturing it into an IR a kernel can interpret (the approach SP1, OpenVM, and zisk all
use; OpenVM is our closest match — FRI-STARK / LDE quotient).
This PR proves the capture half works, on the CPU, without touching any constraint body.
What's in it
New module
crypto/stark/src/symbolic/:sym_field.rs—SymField/SymExtrecording fields: zero-size markers implementingIsField(+IsSubFieldOf<SymExt>) whose ops record nodes into a thread-localhash-consing arena instead of computing.
ir.rs— the IR: a typedDim1/Dim3op-DAG (Add/Sub/Mul/Neg/Embed+Const/Varleaves).capture.rs— runs a constraint's existingevaluate::<SymField, SymExt>and snapshotsthe recorded arena into a
ConstraintProgram. Zero edits to constraint bodies.interp.rs— a CPU interpreter: forward pass over the nodes reusing realFieldElementarithmetic (so per-op results are bit-identical to the boxed path).
Test (
prover/src/tests/symbolic_ir_tests.rs): capturesIsBit(cond + uncond),AddConstraint(both carries), andProductZero, then asserts the IR interpreter matchesthe real
evaluatebit-for-bit over 1000 random rows each.Results
starkandlambda-vm-proverbuild;cargo fmt/clippyclean.AddConstraint, which exercisesfrom(i64)coefficients and the
inv_2_32constant.SymFieldneeds onlyIsField+IsSubFieldOf<SymExt>—not
IsFFTField/IsPrimeField— because capture never instantiates anAIR<Field=SymField>. The methods that can't be symbolic (inv,div, realByteConversion) are provably unreachable and left asunimplemented!();eqreturnsfalseso runtime zero-skip optimizations aren't taken during capture.(a spike artifact of building all
NUM_COLSleaves); the real per-constraint logic is ahandful of nodes (e.g.
product_zero= 64 leaves + zero + 1Mul). A leaf-on-demand /DCE pass removes the padding.
Out of scope (follow-ons)
quotient.cu+ transpiler/codec —three-address code, register allocation, fused
Σ αⁱ·Cᵢ·Zᵢ⁻¹accumulation on-device).LookupBatchedTerm/LookupAccumulated) — capturable thesame way (their inner fns are already field-generic) but need the challenge/alpha-power
context; left out of the minimal set.
ConstraintEvaluator::evaluate_transitions.Notes
Design docs (motivation, the two capture approaches considered, OpenVM/SP1/zisk findings,
recommendation) are in
thoughts/gpu-constraint-eval/. This is a draft spike, notintended to merge as-is — it de-risks the
SymField-as-IsFieldquestion and thecapture→IR→interpret equivalence before investing in the GPU kernel.