Simplification lemmas to improve Haskell booster coverage by ehildenb · Pull Request #2859 · runtimeverification/evm-semantics

ehildenb · 2026-06-04T20:36:42Z

This PR adds a curated first tier of simplification lemmas, plus the supporting attribute annotations, so that common arithmetic, Bool, bytes, and #asWord goals discharge directly in the Haskell booster instead of falling back to kore-rpc. Each new rule is paired with a focused coverage claim in the functional specs, and the change set includes the SMT support needed to keep the MCD/DSS proofs passing once the new lemmas simplify their keccak side-conditions away.

Changes:

Add simplification lemmas across int-simplification.k, evm-int-simplification.k, bytes-simplification.k, lemmas.k, and buf.md: ==Int commutativity normalization (concrete argument on the left), arithmetic and Bool tautologies, A <=Int A +Int B short-circuit when 0 <=Int B, concrete bytes-concat identities, #asWord comparison-to-false cases, minInt/maxInt base comparisons, eq-false-lt, #widthOpCode bounds, and a concrete #powByteLen comparison rule.
Mark #widthOpCode [total, smtlib] and #asAccount [total]; add [preserves-definedness] to the rules that preserve definedness.
tests/specs/mcd/verification.k: mark the keccak disequality rule [smt-lemma] so Z3 can discharge keccak(_) =/=Int <small constant> independently of the path condition. This is required once the new lemmas simplify that constraint away, and fixes the heal/fold/frob DSS proofs and the flopper-tick/end-pack/flopper-dent rules proofs.
Add coverage claims for the new rules in lemmas-no-smt-spec.k, int-simplifications-spec.k, and abi-spec.k.

Validation:

Full prove suite (functional, rules, optimizations, DSS) on K v7.1.330.
Per-spec timing comparison against master: the lemma set is performance-neutral on the large majority of specs; the heaviest MCD/DSS proofs see roughly +20–25% from the cumulative per-step cost of the larger rule set, and the keccak [smt-lemma] itself adds ~1%.
Confirmed that all slowdown disappers when coupled with this Haskell backend PR: Booster: tunable in-place equation evaluation at rewritten subterms (--equation-max-local-steps) haskell-backend#4153
Confirmed that this PR reduces fallbacks to Kore dramatically on the KEVM test-suite (and on Kontrol).

…-asWord Without this attribute Booster refused to apply the rule because the LHS contains the non-total hook Lbl_[_]_BYTES-HOOKE. After the fix Booster correctly applies it (8× succeeded in the execute-node simplify request). Analysis: scripts/bench-prove.py --analyse-fallbacks shows the "Uncertain about definedness of rule due to: non-total symbol Lbl_[_]_BYTES-HOOKE" error gone, replaced by successful Booster applications. The rule-index gap and indeterminate-match cases for other equations remain and will be addressed in follow-up commits. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit 25e00d935ac1f9ce25394e40865dbb36c0cbd80b) (cherry picked from commit 6ae0def0a1de63aa5ffb13d1cdb2abe7d5c3651d)

bytes-simplification.k: add preserves-definedness to lengthBytes-buf and lengthBytes-range. evm-int-simplification.k: add asWord-buf-inversion-rangeUInt256. (cherry picked from commit d4dad3335e7e98d76057c9c0b07387a21a20c0b2) (cherry picked from commit c60634bc589f05a98aa30b06409df4ed478d407f)

…n-left Kore's execute-fallback applies INT-KORE.eq-int-true-rigth which can swap ==Int argument order in path conditions (e.g. lengthBytes(BA) ==Int 32 → 32 ==Int lengthBytes(BA)). This makes Booster's implies check fail due to a syntactic mismatch even when the logical content is identical. The new rule [int-eq-comm-concrete] at priority 30 fires before the +Int cancellation rules and normalises concrete-on-left to concrete-on-right, restoring the form the target spec expects. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit 1d384970bda6669b479959a31f152c0e8aedde35) (cherry picked from commit 2cf8e8c620fee897758f02f5b2b512595428fbec)

…ation lemmas Adds to INT-SIMPLIFICATION-COMMON: - A <=Int A => true (reflexivity; fixes chop-overflow-02 under --booster-only-simplify) - A <=Int A +Int B => 0 <=Int B - A +Int ((B -Int A) +Int C) => B +Int C (4-term cancellation; fixes cancellation-02) - A <Int B andBool B <Int A => false (strict order contradiction; fixes b2w-03) Adds to INT-SIMPLIFICATION-HASKELL: - A <Int B andBool C <=Int A => false requires B <=Int C [concrete(B,C)] (range contradiction) - C <=Int A andBool A <Int B => false requires B <=Int C [concrete(B,C)] (range contradiction) (requires is purely concrete, evaluated by LLVM hook — no circular self-application risk) Adds to LEMMAS-HASKELL: - B orBool notBool B => true (and commuted variant; fixes b2w-05, chop-additional-knowledge) - B andBool notBool B => false (and commuted variant) Validation: full functional test suite passes in normal mode (slot-updates-spec.k failure is pre-existing, unrelated to these changes). Under --booster-only-simplify, lemmas-spec.k drops from 22 to 18 failing claims. No loops detected (checked via kevm-analyse hot-rules). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit 3bb01297db74cacf75fa138f0550440882df8de1) (cherry picked from commit 4c13678fdb82d593bfc495cf0418804a3e505c90)

…en 0 <=Int B Add a higher-priority (40) variant of `A <=Int A+B => 0 <=Int B` that short-circuits directly to `true` when `0 <=Int B` is already in the path condition, fixing `chop-overflow-01` in --booster-only-simplify mode where the intermediate `0 <=Int Y` form cannot be discharged at the implies check. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit bfb0225f96a28cbf42ddd520dd49773b234944e5)

In K, .Bytes (the constructor) and b"" (LLVM-evaluated concrete empty bytes, a Kore domain value \dv{SortBytes}("")) are different Kore terms. The existing bytes-concat-empty-right/left rules use .Bytes and do not fire when LLVM hooks (e.g. #encodeArgs base case) return b"". New rules: [bytes-concat-empty-right-concrete]: B +Bytes b"" => B [bytes-concat-empty-left-concrete]: b"" +Bytes B => B Fixes booster-only failure for encodepacked-keccak01: the contract evaluates keccak(1 : #encodeArgs(#uint256(A0))) where the #encodeArgs base case produces b"", leaving keccak(X +Bytes b"") vs keccak(X) as the implies-check mismatch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit 829d293906d6efb769649002b2cea2ebf90235fc) (cherry picked from commit a4669b0942edb8c974750490affdad4e1ed1d54d)

…ication, add coverage claim The rule `2 ^Int (8 *Int SIZE) => #powByteLen(SIZE)` was [mixed] in booster: it sometimes matched successfully but sometimes failed because booster rejected the rewrite without proof that #powByteLen(SIZE) is defined. Since #powByteLen is a function symbol with no-evaluators (always yields a defined term), adding [preserves-definedness] is safe and fixes the gap. This was the root cause of the kore-vs-booster gap in ecrecoverloop02-sig1-invalid and ecrecoverloop02-sigs-valid benchmarks specs. The new abi-spec.k claim pow-byte-len-simplify exercises this rule directly in booster-only mode and passes after the fix. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit efa0e111609946d7c3506e3a7ae2c6d52cef16e4) (cherry picked from commit fcab3c84c70d9034bee5d6a3b3152c306b6e9d00)

…nedness] to 46 comparison normalization rules, add coverage claims Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit ccce6b5dc1442f51df8939c71b3e378cf74c5894) (cherry picked from commit 2b7099cf814c7ed27e8bc5af007cd52591e7b869)

@b

… BOOL-KORE @-variable patterns, add coverage claim The BOOL-KORE module in K's domains.md defines 8 rules using @-prefixed "here variables" (e.g., {true #Equals notBool @b}) which booster cannot match. Add equivalent rules with regular B:Bool variables to LEMMAS-HASKELL so booster can apply the same normalizations. Add a functional spec claim exercising the notBool case, which passes in booster-only mode. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit 80ed6938ac79faf1a98ef55ddc2a0226a3ae39b2) (cherry picked from commit c498024497ff2a65fd6cc58916db745813f5acf4)

… add preserves-definedness to range-empty The range-empty rule (#range(_,S,W) => .Bytes when S<0 or W<=0) previously lacked [preserves-definedness], so booster refused to apply it during the implies check. Tag it accordingly and add two coverage claims to lemmas-no-smt-spec.k to verify it fires in booster. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit 666ba06c5956d8b9b540446089f73b6a45fba000) (cherry picked from commit f30c2ff9ac9e7bdf0af6f10b47f716f5e0f1728e)

…parison rule After commit 5d0d8083c added [preserves-definedness] to the 2^Int(8*SIZE) => #powByteLen(SIZE) rule, booster fires it consistently even when SIZE is concrete (the rule fires before path-condition lookup makes the argument concrete). Because #powByteLen has no-evaluators, the result #powByteLen(N) cannot self-evaluate, leaving CONST <Int Add [powByteLen-lt-concrete] to bridge that gap: when both CONST and N are concrete the requires-clause delegates back to 2^Int arithmetic, which can be evaluated directly. Add a coverage claim in lemmas-no-smt-spec.k to verify the rule fires. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit 8b4972c4cc410995c3ec2360ca6dd50c189d374b) (cherry picked from commit 11b7050dfc88d903d1e1c1cb62802b6db25b9187)

#asAccount is total over its Bytes domain — the two rules (lengthBytes == 0 → .Account; [owise] → #asWord(BS)) are jointly exhaustive. Marking it [total] lets booster's definedness checker clear the RHS-definedness gate on rules that contain #asAccount in their RHS — notably [call.delegatedAuthority] at evm.md:1633, where the symbolic argument #range(CODE, 3, 20) prevents concrete evaluation and previously caused booster to abort with "Uncertain about definedness of rule due to: non-total symbol Lbl#asAccount", yielding to kore for the rewrite step. The recover-mode sweep across functional + examples + erc20 + benchmarks identified 12 kore-execute handoffs (all benchmarks ecrecover variants) that traced to exactly this definedness abort. Validated locally on benchmarks/ecrecover00-siginvalid-spec.k: 2 handoffs → 0, proof still passes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> (cherry picked from commit 33c04ff6558c339ffc06759dcaa49cd139222ef8) (cherry picked from commit 4ce956c09bb03a7d06fdf533f95b2cfaa40b0dc7)

…comparison rules Tier-1 subset of upstream 258a68c71: keep the two base rules (sound since minInt(A,_) <= A and C <= maxInt(C,_)); drop the non-linear shift/factored variants pending review. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…cations Tier-1 subset of upstream 584462824: the asWord-{lt,le,eq}-false rules (contrapositives of path-condition facts; RHS is false, so trivially definedness-preserving). Drops that commit's slot-updates-spec.k edits (claims regressed PASS->FAIL and need investigation). Coverage claims for these rules are already in lemmas-no-smt-spec.k. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Tier-1 subset of upstream d0f4bdec3: keep eq-false-lt (A ==Int B => false when A <Int B, B concrete). Drops le-1073741824-maxUInt64, whose 2^30 bound is a non-general magic constant pending review. Coverage claim is already in lemmas-no-smt-spec.k. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…und + concat-lookup simplifications Consolidates the Tier-1 #widthOpCode work from upstream Rounds 2-5 (which iteratively rewrote the same lines): #widthOpCode is total (PUSH ops -> 2..33, owise -> 1) and gets an smtlib symbol; widthOpCode-lb/ub encode its 0..33 range as smt-lemmas; widthOpCode-concat-{left,right} push a concrete-prefix byte lookup through +Bytes. Drops the Round-4 asWord-range-buf and the #computeValidJumpDests preserves-definedness riders pending review. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ication rules For a [simplification] rule the booster's definedness obligation is the set of non-total, non-constructor symbols on the LHS and RHS (Internalise.hs); when that set is empty the rule is determined definedness-preserving automatically and the attribute is a no-op. Removed it from the 66 rules added here whose every symbol is total (`+Int -Int *Int` / comparisons / `minInt maxInt lengthBytes +Bytes notBool` / the `total` functions `#asWord #buf #range #widthOpCode`), i.e. the comparison-normalization block, eq-false-lt, the minInt<maxInt base rules, asWord-{lt,le,eq}-false, asWord-buf-inversion, range-empty, lengthBytes-buf/range, b"" concat identities, and the Bool #Equals distributions. Kept on the 5 rules that root a partial symbol (where the attribute does real work): the two #powByteLen rules (no-evaluators), lookup-as-asWord and widthOpCode-concat-{left,right} (partial BYTES.get `_[_]`). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Per review: N need not be syntactically concrete -- once N is concrete the `requires CONST <Int 2^Int(8*N)` reduces to a linear inequality, and if N is symbolic the requires is undischargeable so the rule simply does not fire. (It does not subsume `SIZE <Int #powByteLen(SIZE)`, which is the diagonal same-variable case.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Per review: bytes-simplification.k already provides bytes-concat-lookup-{left, right}, so these were only a workaround for booster not descending into #widthOpCode(...) to apply them. Remove rather than duplicate per-function; if the non-descent persists it should be addressed in the backend. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…used bool-eq #Equals rules Whole-prove-suite rule-firing measurement (booster Simplify + kore SimplifyKore logging over functional/benchmarks/erc20/examples/kontrol/mcd) shows the eight bool-eq-* ML-#Equals distribution rules are never applied by the booster: five fire in no engine at all, and three (bool-eq-not-true-l/r, bool-eq-and-true-l) fire only in the kore-rpc implies/subsumption path. Since this PR targets booster performance and these proofs already discharge those goals under kore without the rules, all eight are removed. The bool-*-self rules are kept (they fire in the booster). The bool-eq-not-false coverage claim, which was discharged by bool-eq-not-true-l rather than by the not-false rules, is removed with them. Verified: lemmas-no-smt-spec proves all 22 remaining claims under --use-booster. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…o fix DSS/MCD implies failures The new tier1 simplification lemmas simplify away the keccak path-condition constraint during EVM execution, so the final subsumption check can no longer prove `keccak(_) =/=Int smallInt` from the path condition alone. Encoding the rule as an SMT lemma lets Z3 discharge it independently of the path condition. This matches the treatment already present in mcd-structured/verification.k, whose sibling proofs pass; the plain mcd suite lacked the annotation. Fixes the heal/fold/frob-diff-zero-dart DSS proofs and the mcd flopper-tick/end-pack rule proofs. Verified mcd/end-pack-pass-rough-spec.k passes with the change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…(review)" This reverts commit 6cd5f0d.

…under booster-dev Verified by running the booster-dev binary over the full skip-list on this branch: mcd-structured/dsvalue-peek-pass-rough now passes under tier1's lemmas (failed on master), and 4 entries (functional/bitwise-{asword,simplifications}, mcd[-structured]/dsvalue-read-pass-summarize) were stale (already passing on master). bitwise-mask-shift-spec.k is intentionally kept: it regresses under booster-dev on this branch. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

ehildenb · 2026-06-09T14:03:05Z

(i) Performance changes

Caveat: no clean perf benchmark was run. Numbers below are recover-mode wall-clock under concurrent np (≈2× RPC, no caching, cross-run contention) — directionally fine, not authoritative. The dedicated isolated normal-mode methodology (CLAUDE.md "Comparing haskell-backend builds") is the right tool for real perf and is a recommended follow-up.

scope	master	tier1	delta
functional suite wall-clock	1583 s	1601 s	+1%
rules suite wall-clock	3255 s	4019 s	+23%*
rules, sum over 124 both-passed specs	11949 s	12319 s	+3% (noise)
dss (vat-spec, capped)	10800 s (timeout)	10800 s (timeout)	—

*The +23% rules wall-clock is not a slowdown: tier1 completes 14 more proofs that master fails/crashes out of quickly. The fair per-spec view (both-passed specs) is +3%, i.e. flat within contention noise. Per-spec movers are mixed and small (largest: mcd/flopper-dent-guy-same +105 s, ecrecoverloop02-sigs-valid −56 s) — consistent with noise, not a systematic regression.

Takeaway: no evidence of a performance regression; a proper isolated perf run is needed to claim any speedup.

ehildenb · 2026-06-09T14:03:27Z

(ii) Fallback changes (the headline — clear reduction)

suite	metric	master	tier1	delta
functional	total handoffs (488 common claims)	21	11	−48%
functional	flavour	implies 21	implies 11	implies-only
rules	total handoffs (124 both-passed specs)	73	58	−21%
rules	execute / implies	30 / 43	16 / 42	execute ~halved
rules	all 264 common claims	105	81	−23%
dss (vat-spec, partial)	total handoffs (36 common claims)	120	33	−72%
dss	execute / implies	109 / 11	21 / 12	execute ~−80%

The dominant, consistent effect is a large reduction in execute-flavour fallback (the booster aborting a symbolic-execution rewrite and handing the step to Kore): ~½ on rules, ~⅕ on DSS. implies-flavour fallback is flat on rules/dss and is the entire (smaller) functional effect. Reductions concentrate where tier1 added lemmas: the ecrecover/ecrecoverloop benchmarks, MCD flapper/flopper/end/pot proofs, and the DSS vat proofs (helped further by tier1's keccak-disequality [smt-lemma] fix).

Even stronger than the counts — recover-mode provability (does the proof close booster-first at all):

suite	master pass/fail	tier1 pass/fail
rules	124 / 18	138 / 4

14 rules specs that master can only close with Kore now close booster-side on tier1; 0 regressions. (The 4 fail-on-both are recover-mode limitations — see anomalies.)

ehildenb · 2026-06-09T14:03:38Z

(iii) Booster-dev changes (skip-list)

Verified by running the standalone booster-dev binary on both branches over the full 137-entry failing-symbolic.haskell-booster-dev list (neutralised, then restored). Methodological note: recover-mode "0 handoffs" is not a proxy for booster-dev provability — of 82 specs that looked booster-only in recover-mode, only 5 actually pass booster-dev. The verification was essential.

Removed from the skip-list on this branch (5; all pass booster-dev on tier1):

spec	master booster-dev	tier1 booster-dev	attribution
`mcd-structured/dsvalue-peek-pass-rough-spec.k`	fail	pass	tier1-specific
`functional/bitwise-asword-spec.k`	pass	pass	stale
`functional/bitwise-simplifications-spec.k`	pass	pass	stale
`mcd/dsvalue-read-pass-summarize-spec.k`	pass	pass	stale
`mcd-structured/dsvalue-read-pass-summarize-spec.k`	pass	pass	stale

1 removal is tier1's doing; 4 are stale entries the list had drifted on (already passing on master). Skip-list: 137 → 132 lines.

ehildenb · 2026-06-09T14:03:46Z

(iv) Anomalies / non-passing proofs

#	anomaly	detail
1	booster-dev regression	`functional/bitwise-mask-shift-spec.k` passes `booster-dev` on master, fails on tier1 (stable recheck, 1 failing node — not a timeout). Still passes in normal & recover mode; skip-listed so no CI break. Kept on the list. A tier1 lemma edit weakened the standalone-booster path for this claim — worth the author's look.
2	recover-mode fail-on-both	4 specs fail under recover-mode on both branches — `examples/sum-to-n-foundry`, `kontrol/test-{allowchangestest-testfailallowchangestostorage, countertest-testincrement, owneruponlytest-testincrementasowner}-0`. These are recover-mode limitations (stuck/indeterminate), independent of the lemmas; all pass in normal mode.
3	minor handoff regression	`examples/erc721-spec.md` rises 1→2 implies-handoffs on tier1 (single spec, implies-flavour).
4	implies-count artifact (rules)	Over all common claims, implies-handoffs appear to rise 45→53 on tier1 — an artifact of tier1 completing 14 extra proofs that master never reached. On the both-passed set, implies is flat (43→42).
5	DSS incomplete	Both branches hit the 3 h cap on the 2-spec dss test; the −72% is over the identical first tranche (`mcd/vat-spec.k`, 36 claims), not an end-to-end completion. recover-mode is too slow for full DSS; a normal-mode `count-aborts.py` pass is the right end-to-end instrument.

…s-summarize specs failing booster-dev in CI PR #2859 CI 'Proofs: Rules (booster-dev)' failed on mcd/dsvalue-read-pass-summarize and mcd-structured/dsvalue-read-pass-summarize (SystemExit: 1); local verification diverged from CI. Re-add them; the other three drops (mcd-structured/dsvalue-peek-pass-rough and functional/bitwise-{asword,simplifications}) pass in CI and stay removed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

ehildenb · 2026-06-11T05:47:22Z

Overall, these lemmas alone are a mild performance hit (~6%). But coupled with runtimeverification/haskell-backend#4153 with the optional local simplification recursion set to 20, become performance neutral overall (and performance positive in many simplification-heavy cases). These lemmas alone eliminate many fallbacks to Kore though, so this gets us closer ot our goal of eliminating the Kore backend altogether.

…--equation-max-local-steps) (#4153) This PR generalizes booster's equation evaluation strategy. Today, when an equation rewrites a subterm during the bottom-up simplification traversal, re-normalization happens by restarting whole-term passes from the top, so a rewrite chain advances one step per pass. This PR adds a per-pass budget, `--equation-max-local-steps N`: after a rewrite, the result is re-simplified in place (LLVM evaluation plus re-entering the bottom-up traversal) for up to N steps per pass before deferring back to the global restart loop. N=0 — the default — is exactly the existing restart-only behavior as the degenerate case of the same code path, so this changes nothing out of the box; the knob lets users tune how deep in-place simplification chains may run. **Changes:** - New server option `--equation-max-local-steps` (default 0, identical to current behavior); the budget resets every global pass. - `Booster.Pattern.ApplyEquations`: on a rewrite in the bottom-up pass, `localFixpointEval` re-enters the cached traversal in place, counting re-entries against the budget. - Termination stays fully bounded: at most `--equation-max-iterations` passes times (budget + 1) applications per chain, with two-layer loop detection (per-step chain check inside the budget, whole-term snapshots across passes, which catch cycles longer than the budget). - Unit tests pin the default-0 boundary (identical to the old iteration-limit behavior) and the budgeted behaviors (deeper chains, per-step loop detection, combined bound) in an explicit options window. **Validation:** - At the default, the unit suite and rpc-integration tests are unchanged, including the `test-log-simplify-json` log-trace golden — the default is trace-identical to master. - KEVM dose-response sweep (15 biggest-moving specs, N ∈ {0,1,2,3,4,5,20,100,1000}, all specs passing in every configuration): budgets 1–3 give about −30% total wall-clock (1530 s at N=0 → 1042 s at N=2), with per-spec wins of −30…−50% saturating at N=1–2. - The sweep also shows why this is a knob and not yet a new default: at N≥4 one spec family (`ecrecoverloop02`) regresses +40–50%, because an in-place chain completes a second nested memory write before the restart-pass concat rules flatten the first, and booster hands the resulting shape to Kore (Kore-side equation applications jump 2 980 → 26 477). The regression is an indicator of missing simplification lemmas / `preserves-definedness` annotations, not of the strategy. - Re-running the sweep with an extended lemma set that fills those gaps removes the cliff entirely: improvement becomes monotone in the budget, with the best total at N=20 (−35% vs N=0). Once the lemma set keeps simplification in-house, re-simplifying at the same subterm until done is a net performance win. This is the extended lemmas set: runtimeverification/evm-semantics#2859. On that PR, without this change KEVM has some proofs that have major regressions. Coupled with this PR, and setting N=20, we get only performance wins there. --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>

ehildenb and others added 18 commits June 4, 2026 20:15

evm.md: formatting

ee9a0d2

ehildenb commented Jun 4, 2026

View reviewed changes

Comment thread kevm-pyk/src/kevm_pyk/kproj/evm-semantics/lemmas/evm-int-simplification.k

ehildenb commented Jun 4, 2026

View reviewed changes

Comment thread kevm-pyk/src/kevm_pyk/kproj/evm-semantics/lemmas/lemmas.k Outdated

ehildenb commented Jun 4, 2026

View reviewed changes

Comment thread kevm-pyk/src/kevm_pyk/kproj/evm-semantics/buf.md

ehildenb and others added 2 commits June 4, 2026 21:09

ehildenb commented Jun 5, 2026

View reviewed changes

Comment thread kevm-pyk/src/kevm_pyk/kproj/evm-semantics/lemmas/evm-int-simplification.k

ehildenb and others added 4 commits June 5, 2026 14:01

Revert "buf.md: relax powByteLen-lt-concrete to concrete(CONST) only …

293a1a4

…(review)" This reverts commit 6cd5f0d.

Merge remote-tracking branch 'upstream/master' into lemmas-tier1

d7232a8

ehildenb changed the title ~~Lemmas tier1~~ Simplification lemmas to improve Haskell booster coverage Jun 8, 2026

ehildenb and others added 3 commits June 9, 2026 15:49

Merge remote-tracking branch 'upstream/master' into lemmas-tier1

bcb4061

Merge remote-tracking branch 'upstream/master' into lemmas-tier1

a864119

ehildenb mentioned this pull request Jun 10, 2026

Booster: tunable in-place equation evaluation at rewritten subterms (--equation-max-local-steps) runtimeverification/haskell-backend#4153

Merged

ehildenb commented Jun 11, 2026

View reviewed changes

Comment thread kevm-pyk/src/kevm_pyk/kproj/evm-semantics/buf.md

ehildenb marked this pull request as ready for review June 11, 2026 05:57

ehildenb requested review from anvacaru and juliankuners June 11, 2026 05:57

anvacaru approved these changes Jun 11, 2026

View reviewed changes

ehildenb merged commit b460c09 into master Jun 11, 2026
12 checks passed

ehildenb deleted the lemmas-tier1 branch June 11, 2026 12:38

palinatolmach mentioned this pull request Jun 11, 2026

Update dependency: deps/kevm_release runtimeverification/kontrol#1153

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplification lemmas to improve Haskell booster coverage#2859

Simplification lemmas to improve Haskell booster coverage#2859
ehildenb merged 28 commits into
masterfrom
lemmas-tier1

ehildenb commented Jun 4, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ehildenb commented Jun 9, 2026

Uh oh!

ehildenb commented Jun 9, 2026

Uh oh!

ehildenb commented Jun 9, 2026

Uh oh!

ehildenb commented Jun 9, 2026

Uh oh!

Uh oh!

ehildenb commented Jun 11, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ehildenb commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ehildenb commented Jun 9, 2026

(i) Performance changes

Uh oh!

ehildenb commented Jun 9, 2026

(ii) Fallback changes (the headline — clear reduction)

Uh oh!

ehildenb commented Jun 9, 2026

(iii) Booster-dev changes (skip-list)

Uh oh!

ehildenb commented Jun 9, 2026

(iv) Anomalies / non-passing proofs

Uh oh!

Uh oh!

ehildenb commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ehildenb commented Jun 4, 2026 •

edited

Loading

ehildenb commented Jun 11, 2026 •

edited

Loading