Add formally verified natural logarithm#585
Open
duncancmt wants to merge 94 commits into
Open
Conversation
`Ln.lnWad` returns `floor(L)` or `floor(L) - 1`, never more, where L = 10**18 * ln(x / 10**18) is the exact result; it reverts with `LnWadUndefined()` for x <= 0. The atanh substitution is centered at s = round(sqrt(2) * 2**96), so |z| <= 3 - 2*sqrt(2) and a (3,3)-degree rational minimax with Q96 coefficients (certified sup-norm error 2.532e-19) suffices. ln(s) folds into the constant term, which is biased down by a 3e22 margin: certified upward errors total <= 2.63e22, and margin plus downward errors stay below the final ulp of 2**78, so the Q78 accumulator always lies in ((L - 1) * 2**78, L * 2**78] and `sar(78)` lands on floor(L) or floor(L) - 1. Gas: ~304 per call, measured through an external wrapper net of call overhead. Verified on 718k+ structured, adversarial, and random inputs against 1500-bit references using an exact EVM-semantics mirror of the implementation (formal/python/ln/check_ln_counterexample.py), with pinned-vector, revert, and fuzz coverage in test/0.8.34/Ln.t.sol. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <codex@openai.com>
`Ln.lnWad` takes a wad (10**18 basis) input and returns `floor(L)` or `floor(L) - 1` for L = 10**27 * ln(x / 10**18), never more; it reverts with `LnWadUndefined()` for x <= 0. The ray ulp (1e-27 of ln) requires ~96 significant bits, so the rational is a (4,5)-degree minimax of atanh(sqrt(u))/sqrt(u) fit under the sqrt(u) weight the error carries into ln, with q monic and p(0) = q(0) constrained so the constant-term literal is shared. Certified weighted sup error of the integer-rounded rational: 0.325 ulp. Every quantity lives at the fixed-point basis its consumer needs and is rounded exactly once: the mantissa at Q103, z at Q100, u at Q96, and the Horner stages on a basis staircase (p: Q68/Q72/Q78/Q85/Q94; q: Q96/Q71/Q77/Q85/Q94) whose sar immediates absorb every basis change. A coefficient followed by j more multiplies by u tolerates ~5.1 fewer fractional bits per unit of j; each literal takes the widest basis that fits its minimal PUSH width, so quantization stays negligible at the minimal code size. The quotient p*z/q is taken once at Q100 and rescaled by exactly 5**27 onto the 10**27 * 2**72 output grid shared by the k*ln(2) term and the bias, leaving the closing `sar(72)` as the single output rounding. Certified budget: upward error 0.343 ulp <= 1.7e21-unit margin (0.360 ulp); margin plus downward errors 0.703 ulp < 1, so the accumulator always lies in ((L-1)*2**72, L*2**72]. Gas: ~351 per call measured through an external wrapper net of call overhead; wrapper code 437 bytes (the uniform-basis layout of the same rational measures ~362 gas and 463 bytes). Verified on 718k+ structured, adversarial, and random inputs against 1600-bit references using the exact EVM-semantics mirror in formal/python/ln/check_ln_counterexample.py, with pinned-vector, revert, and fuzz coverage in test/0.8.34/Ln.t.sol. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
`Ln.lnWad` now returns exactly 0 for x = 10**18 -- the one input whose exact result is an integer -- via a branchless `eq` correction added to the floored accumulator (the uncorrected accumulator deterministically floors to -1 there; pinned by test, by the Python mirror, and by the Lean theorem `model_ln_wad_one_wad`). `Ln.lnWadToWad` wraps `lnWad` with floor division by 10**9 (truncating `sdiv` alone would mis-round negative results), inheriting the floor-or-floor-minus-1 contract, monotonicity, and the zero point at wad basis. `lnWad` is monotone: x1 < x2 implies lnWad(x1) <= lnWad(x2). Ensuring this required raising the Horner stage truncation bases (p: Q68/Q80/ Q86/Q93/Q94, q: Q96/Q87/Q85/Q93/Q94, +7 literal bytes, op count unchanged): the within-octave step inequality needs the stage truncation jitter bound z_max*2J < R_min, which these stage bases satisfy with the certificate margin. Verification coverage: - Proven in Lean (formal/ln/LnProof, kernel `decide`, zero axioms, against a model generated from the Yul IR by formal/python/ln/generate_ln_model.py): monotonicity across all 254 clz seams for both functions, lnWad(10**18) = 0 and lnWadToWad(10**18) = 0, and order at the corrected point. The model generator strips the `LnWadUndefined()` revert guard (the memory model requires straight-line stores); all theorems quantify over the non-reverting domain. - Covered by exact rational arithmetic in formal/python/ln/check_ln_monotone.py: the within-octave leg of the monotonicity argument, using integer interval propagation of the Horner stages, truncation-jitter bound with step margin 0.829, polynomial-derivative positivity by exact interval bisection, and mantissa-to-z antitonicity. - Differentially tested: the Lean model evaluator (`ln-model`) against the deployed Solidity for 2x20k fuzz runs plus the full vector suite via FFI (test/0.8.25/formal-model/LnModel.t.sol); the Python mirror against 1600-bit references on 718k+ structured, adversarial, and random inputs; 2.4M adjacent-mantissa pairs and all seams scanned with no monotonicity violation. `sdiv`, `sar`, `slt`, and `sgt` are added to the modeled builtins in formal/python/evm_builtins.py with two's-complement Lean semantics. .github/workflows/ln-formal.yml runs the certificates, regenerates the model, builds the proofs, diffs the axiom report, and runs the FFI fuzz on every change to the relevant files. Gas: ~377 per `lnWad` call, ~376 per `lnWadToWad` call, measured through external wrappers net of call overhead. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <codex@openai.com>
The monotonicity argument is now machine-checked end to end against the generated model: `model_ln_wad_mono` proves that for all 0 < x <= y < 2**255, the signed output words satisfy lnWad(x) <= lnWad(y), and `model_ln_wad_to_wad_mono` carries the result through the wad helper's floor division. Both depend only on the standard Lean axioms (propext, Classical.choice, Quot.sound). The proof chain: - Poly.lean: dense integer polynomials, interval-Horner evaluation, and a fuel-bounded adaptive bisection checker whose soundness theorem turns a kernel-`decide`d `true` into nonnegativity over an integer interval. - Bridge.lean/BridgeDiv.lean: two's-complement transport lemmas taking each modeled EVM opcode (add/sub/mul/shl/shr/sar/sdiv) to plain `Int` arithmetic under explicit range hypotheses. - Stages.lean: each Horner stage of the p/q evaluation sandwiched against an exact integer polynomial (scales 2**358 and 2**386) with one-sided truncation slop. - Certs.lean: the four analytic certificates (p-range, q-range, and the two step inequalities G1/G2 covering positive and nonpositive z) checked by the bisection checker via kernel `decide`. - StepMono.lean: the within-octave unit step -- the truncated quotient pipeline X1 = sdiv(p(u)*z, q(u)) is antitone over consecutive integer z-values in [-Zc, Zc]. - ZOctave.lean: the mantissa-to-z map is antitone with unit steps, z stays in [-Zc, Zc], and the chained X1-vs-mantissa monotonicity and magnitude bound. - OctaveMono.lean: the generated model re-expressed through the (eq, exponent, mantissa) triple, the ln2*k term bracketed by kernel evaluation over all 256 clz values, and the fixed-exponent affine tail (multiply by K, add ln2*k and the bias, sar 72) shown monotone. - TopMono.lean: clz/log2 plumbing, extraction of the decided seam and 10**18 theorems, the unit step over the full domain, the induction that chains it, and the floor-division corollary for lnWadToWad. The CI axiom check now imports LnProof.TopMono and pins the axiom sets of the two monotonicity theorems. Ln.sol's NatSpec and the Python certificate's docstring now describe the Lean proof as the authority, with check_ln_monotone.py as its executable exact-rational counterpart. Verified: lake build (all 14 jobs, no warnings in the changed files); the CI axiom-check diff matches; check_ln_counterexample.py and check_ln_monotone.py pass; the Lean-model FFI fuzz suite passes 10/10; test/0.8.34/Ln.t.sol passes 8/8; forge fmt --check clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <codex@openai.com>
Set the one-sided bias margin to exactly 0.500 ulp by changing only the BIAS constant (same 21-byte literal width; gas and code size unchanged). The output contract is unchanged: the result is still floor(L) or floor(L) - 1, never an overestimate, and lnWad(10**18) == 0 still lands via the branchless correction (the uncorrected accumulator at 10**18 is exactly -1). The certified upward error of the approximation pipeline is 0.329 ulp and the analytic legs of the floor specification need the gap between that error and the margin on one side, and between the margin plus downward errors and one whole ulp on the other. The 0.500 setting splits the available room evenly, leaving ~0.08 ulp of assembly slack per side. Re-validated: check_ln_counterexample.py and check_ln_monotone.py pass; the Lean model regenerates and the full proof library rebuilds, including the seam/10**18 kernel decides and the whole-domain monotonicity theorems; test/0.8.34/Ln.t.sol passes with 4 vector expectations updated to the exact outputs (their floor references are unchanged); the Lean-model FFI fuzz suite passes 10/10; a 30k-sample high-precision floor-band check shows exact/minus-one split 15188/14812 with no violations. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <codex@openai.com>
LnProof/ExpSum.lean arithmetizes comparisons against the exponential
function using only Nat arithmetic, as groundwork for proving the floor
specification of `lnWad` (r = floor(L) or floor(L) - 1) without any
external dependency: an upper bound on e^(p/q) is a bound on every
partial sum S_N = sum_{j<=N} (p/q)^j / j! (the `capUB` relation), and a
lower bound is a single witnessed partial sum (`capLB`), the same way the
cbrt proof states its spec through r^3 <= x < (r+1)^3.
Contents, all kernel-checkable and sorry-free over `Init`:
- `expNum N p q = sum (N!/j!) p^j q^(N-j)`, the integer-scaled partial
sum, with a finite-sum library (`tsum`), a sum transpose, and the
box-into-triangle / triangle-into-box index inclusions;
- Pascal-recursive binomial coefficients, `cho_fact`, and the binomial
theorem, giving the two convolution coefficient identities;
- `prod_le_sum` and `sum_le_prod`: S_N(a) S_M(b) <= S_{N+M}(a+b) and
S_K(a+b) <= S_K(a) S_K(b), the surrogates for e^(a+b) = e^a e^b;
- monotonicity in N and in the argument, and a geometric tail bound
turning one evaluated partial sum into a bound on all of them
(`tail_bound`, via a decreasing potential);
- the cap algebra: `capUB`/`capLB` with product, power, quotient-mover,
argument-transport, and partial-sum-plus-tail introduction rules.
Axioms: propext and Quot.sound only.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Three more floor-specification layers, all sorry-free on standard axioms: - LnProof/ShiftCert.lean: an exact fuel-based Taylor shift (`polyShift`, kernel-computable structural recursion) with its evaluation lemma, and `checkCover`, which certifies `0 ≤ P(x)` for every integer `x` in a range by walking caller-supplied cell widths and recentering the polynomial at each cell (the certificate polynomials have ~1e-28 relative slack, which plain interval bisection cannot see past their huge monomials). Also `expPolyNum`, the polynomial-level partial-sum numerator with its evaluation lemma down to `expNum`, and crude box bounds (`polyHi`, `polyDiffHi`) for the bracket layer. - LnProof/FloorConsts.lean: kernel-checked two-sided exponential caps for the scaled `ln 2` and bias constants over the common denominator `10^27 2^99` (`cap2U/cap2L/capBU/capBL`, via `capUB_of_partial` so one evaluated partial sum carries the geometric tail), a lower cap for one output ulp (`capEL`), and the exact signed value of the `ln2 * k` word for every clz (`ln2k_exact`, 256-case kernel decide). - LnProof/FloorModel.lean: `model_floor_bracket` — for `x ≠ 10^18` the model output word `r` satisfies `r 2^72 ≤ V < (r+1) 2^72` with `V = X1 Kc + ln2k(clz x) + BIASc` the exact pre-shift accumulator, assembled from the existing transports, sandwiches, and mantissa facts. Also `capUB_weaken`/`capLB_weaken` target-transport rules in ExpSum. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…rackets
The bracket layer connecting the EVM pipeline to the floor-specification
exponential caps, built directly over the Stages coefficient lists:
- FloorCertDefs.lean: the bracket rationals for X1/2^99 as integer
polynomials in the mantissa, homogenized from PPc/QQc with denominators
8B^2 (at the wlo interval ends) and B^2 (at w*), stage truncation slops
entering exactly as SLOPPc (B^2)^4 and SLOPQc (8B^2)^5, and the
certificate polynomials (degree ~251, capUB_of_partial-shaped with the
per-point tail term, eps = 0.42e-27 both sides).
- FloorCert{GeUp,GeLo,LtUp,LtLo,Aux}.lean: kernel-checked `checkCover`
certificates; the four main inequalities need 14/13/11/13 recentred
cells, the ten auxiliary sign/positivity/tail-side-condition
certificates one cell each. Cell widths come from an offline walk and
are re-verified in-kernel; the Python generator cross-validates every
bracket rational against the EVM pipeline mirror on 300 samples.
- ShiftCert.lean: `homPoly`/`homEvalI` with the evaluation lemma and the
collapse identity `homEvalI cs (u d) d = d^deg P(u)`, which ties the
certificate polynomials to the Stages sandwiches with no coefficient
rescaling.
- FloorBracket.lean: exact division brackets for the z word on both
branches (`z_bracket_ge`/`z_bracket_lt`), and the two ordering lemmas
(`wlo_lt_un`, `un_le_dsq`) that place the pipeline u-hat between the
certificate interval ends.
Axioms: propext and Quot.sound only.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Groundwork for the divided-difference monotonicity of the bracket polynomials: rfl-checked quartic/quintic unfoldings of `homEvalI` at the `PPc`/`QQc` coefficient lists, nested-Horner distribution (`quartic_expand`), the difference-of-powers factorings (`df2/df3/df4`), and two-sided square monotonicity (`sq_le_of_abs_le`). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The four main certificate checks are split into ~3-cell `checkCover` groups with an interval-glue lemma per certificate, so the kernel frees each Taylor-shift cascade between declarations. Series depth reduced to K = 22 (tail 2.6e-33, far below the 0.08-ulp pads; cell counts unchanged at 14/13/11/13), with the factorial constants and powers in the certificate constructions updated to match. The GeUp kernel check runs within ~5.5 GB. Also `sq_nonneg'` and `mul_bound` (interval bound for products) in FloorBracket, feeding the divided-difference monotonicity lemmas. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <codex@openai.com>
homEvalI_PPc_anti and homEvalI_QQc_mono: the homogenized p-polynomial is decreasing and the q-polynomial increasing in the point argument over the box |n| ≤ Uc D. The divided difference of each column is factored (df2/df3/df4 and an asymmetric quintic factorization df5), every higher-order cofactor is crudely bounded through the box radius (squares via sq_le_of_abs_le, the quartic cofactor via an interval product bound, the quintic via an asymmetric split with a nonnegative y^4 remainder), and the linear coefficient dominates: the certificate constants PPc1 + 2 PPc2 Uc + 4 PPc4 Uc^3 < 0 QQc1 - 2|QQc2| Uc - 4|QQc4| Uc^3 - 4 Uc^4 > 0 are kernel-checked. These are the analytic facts that let the bracket cross-theorems compare pipeline stage values at the integer u-hat against the certificate polynomials' rational interval ends. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
At m = 2^104 - 1 the homogenized w* argument 2^96 A^2 slightly exceeds Uc B^2 (the u-hat domain bound does not apply to the exact w* point), so the monotonicity boxes now use the round bound 2.333e27, strictly above the supremum on both branches; all derived certificate constants are recomputed and the dominance decides still hold with their ~25x margins. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… rational bracket_ge_up: on the m ≥ S branch the pipeline quotient satisfies X1 · TD(m) ≤ TN(m) · 2^99 against the certificate bracket rationals. The chain runs: exact z and u-hat division brackets; the Stages sandwiches at u-hat (made opaque before every omega); the X1 sdiv bracket via evmSdiv_neg_neg; the wlo/u-hat/w* orderings; homEvalI collapse plus the divided-difference monotonicity lemmas comparing the stage values at u-hat against the certificate interval ends; and a multiplied chain that closes with exact equality at A · PHV · 2^486 · (B²)^5 before cancelling the positive multiplier. Also: certGeWS/certLtWS auxiliary certificates (2^96 A² ≤ UB B², one recentred cell each, kernel-checked) feeding the w*-side box bound, the certificate-side evaluation lemmas (evalTN_ge/evalTD_ge/...), and pow_pos/pow_nonneg helpers. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ional On the m >= S branch, TN2b(m) * 2^99 <= X1 * TD2b(m) for Sc+46 <= m < MHI. Mirrors bracket_ge_up with the lower division brackets (Nat.div_add_mod remainder side), the swapped divided-difference argument roles (P decreasing evaluated at the w* point, Q increasing at the wlo point), and the exact scale closure 2^442 * (8B^2)^5 = 2^457 * (B^2)^5. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Adds the lt-branch evaluation lemmas (A := S - m mirrors of the ge set) and bracket_lt_up: (-X1) * TD(m) <= TN(m) * 2^99 for MLO <= m <= Sc - 46. The branch uses z_bracket_lt (z = +q), the evmSdiv_pos_neg transport (nonnegative numerator over negative denominator), and the same divided-difference and collapse machinery as the ge side. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
TN2b(m) * 2^99 <= (-X1) * TD2b(m) for MLO <= m <= Sc - 46, completing the four cross-theorems that sandwich the pipeline X1 between the certificate rationals on both branches. All four check on propext + Classical.choice + Quot.sound only. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
certGeH/certLtH now certify 2*TN <= 24*TD, matching the (K+2)*q bound that capUB_of_partial requires at K = 22. The one-cell kernel checks pass unchanged. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Converts kernel-checked certificate nonnegativity into exponential caps at the pipeline value. Per branch: evaluation decompositions of the certificate polynomials, Int-to-Nat bridges at K = 22 (capUB via the tail bound with fact 23, capLB via the direct witness at fact 22), caps at the certificate rationals, X1 sign lemmas from the lower brackets, and capUB/capLB at (|X1| * 10^27) / (10^27 * 2^99) via the cross-theorems and capUB_arg/ capLB_arg. The main certificate facts enter as hypotheses so this file builds independently of the heavy cert decides. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
capLB_cancel mirrors capUB_cancel for moving a factor across the quotient on the lower side. The four budget families (upper/lower, binade shift k >= 0 over [0,151] and k < 0 over [1,103]) are certified by kernel evaluation over the whole range; they encode the weakening step from the multiplied caps to the x/10^18 targets, closed by the bias margin with ~7.7e-29 to spare. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
v_scale_pos/v_scale_neg split the pre-shift accumulator times 2^27 into the three cap exponents over QS = 10^27 * 2^99, with the ln2 term on the matching side of the equation for each clz regime. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <codex@openai.com>
synthDivM/polyShiftAuxM/checkCoverM bind recursive results through a match so the kernel computes each step once; checkCoverM_sound transfers soundness. The four branch files verify the certificate constructions against generated literal coefficient lists (FloorCertLit.lean, gitignored; regenerate with python3 formal/python/ln/gen_cert_literals.py -- the generator mirrors the Lean polynomial operations exactly, including homPoly's trailing zeros) and run the cell walks on the literals. The literal bridge keeps the checked certificate files tied to the generated coefficient lists without trusting the generator. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <codex@openai.com>
evalPoly_ext: two polynomials whose l1 norms are below 2^(B-1) agree everywhere once they agree at the single point 2^B (uniqueness of the balanced radix-2^B expansion, by induction with the bounded-multiple helper). polyL1 bounds propagate symbolically through polyAdd, polyScale, polyMulX, polyNeg, polyMul, polyPow, and expPolyNum (whose l1 recurrence is exactly LnExp.expNum). This lets the certificate-vs-literal equalities be proven by one closed integer comparison at 2^B plus a cheap symbolic bound, instead of evaluating the construction through the kernel coefficient by coefficient. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The cell checker (checkCoverK) computes each cell's shifted polynomial inside the decide as untrusted packed arithmetic - divide-and-conquer splitting with a precomputed square ladder for (x+a)^m, sign-split digit strings at B = 38000 (sized from the measured worst-case 37768 bits across all 57 cells), and bit-op digit extraction (shiftRight/land, which the kernel accelerates; generic big division dominates that path). Correctness needs no lemmas about the packed code: each witness is certified through evalPoly_ext by one evaluation identity at 2^B plus polyL1 bounds, with polyL1_polyShift <= aeval proven by induction over the synthetic-division remainder cascade. Each cell lives in its own module so lake builds them in parallel: one cell decides in ~7-10 s, and the full four-branch cold build is ~18 min wall / 44 min CPU on four cores, with branch eval_eq equalities at ~1 s and the remaining per-branch cost in the sub-literal equality decides. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <codex@openai.com>
up_ge_pos multiplies the X1 cap, the binade-shift power of the ln2 cap, and the bias cap; moves the exponent down to r * 2^99 through the floor bracket and v_scale_pos; and weakens to the x/10^18 target through budgetU_le and the mantissa window. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
budgetL_fold moves the worst-case lower budget from m = 2^103 to any mantissa via the cross-multiplication (m+1)*2^103 <= m*(2^103+1); lo_ge_pos multiplies the four lower caps (X1, ln2 power, bias, one output ulp), raises the exponent to (r+2)*2^99 through the strict side of the floor bracket, and weakens to the strict x target carrying one part in 10^30 of slack. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
up_ge_neg moves the ln2 power across the quotient with capUB_cancel (exponent split via truncated subtraction justified by V >= 0), brings the exponent down to r * 2^99, and weakens through budgetUn_le using the exact mantissa relation m = x * 2^(c-152). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
lo_ge_neg mirrors up_ge_neg with capLB_cancel; the V >= 0 side condition for the exponent split comes from the lower floor bracket, which the caller has from model_floor_bracket. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
up_lt_pos moves the X1 magnitude across the quotient with capUB_cancel against the lower lt-branch cap, mirroring the structure of the ge-side masters. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Refine both certificate granularities to push the margin to the hard analytic floor (the rational's ±0.34-ulp worst-case error, below which the implementation would overshoot): - Certificate tolerance EUD: 10^29 -> 10^31, EUN=3401 (ε=0.3401 ulp), just above the 0.340-ulp cover floor where the cert touches zero. - Cap denominator: 10^30 -> 10^31, so the margin resolves in 1e-4-ulp steps; X_BU=3402, X_BL=3404 give M=0.3403 ulp. lnWadToRay now lands on exactly floor(L) for ~66% of inputs (the analytic ceiling), up from ~64% at the prior M=0.362. Cell count stays flat (59). Scale-shift across FloorBudget/FloorConsts/FloorAssembly/FloorCaps/ FloorWindow/FloorSpec/FloorCertDefs (10^29->10^31, 10^30->10^31, packed exponents 10^77->10^80 and 10^137->10^142, capEL 10^30+999->10^31+9990, strictness 10^30-1->10^31-10), bias in Stages/FloorModel/FloorSpec/ OctaveMono, Ln.sol _BIAS + NatSpec, and 59 regenerated cell modules. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The generator imports check_ln_counterexample as a package module, so it prepends the repo root to sys.path; derive that from __file__ (parents[3]) like the other model generators rather than a fixed absolute path, so `python3 formal/python/ln/gen_cert_literals.py` runs from any checkout (CI was failing with ModuleNotFoundError: No module named 'formal'). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
- Bridge: drop the unused ipow358/ipow386 helpers (dead code, and the only literals with exponent >256 in the file). - FloorBracket: raise exponentiation.threshold to 512 so the bracket polynomials' opaque 2^k factors (up to 2^486) are evaluated during elaboration rather than left symbolic with a warning; the proofs treat them as opaque integers, so evaluation does not change them. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Each `*_eval_eq` proof spent ~570s (the bulk of an ~11-min/branch build that timed out CI) in kernel type-checking, despite all three `decide +kernel` calls together taking under 6s. Profiling isolated two algorithmic costs: 1. Goal 1 (the `polyL1 cert * 2 < 2^kB` bound) opened with `unfold certX; rw [..._eq_lit]`. The `unfold` tactic forces the kernel to reduce the degree-276 construction as a coefficient list (~565s). Replace it with the ℓ1 homomorphism `have`-chain applied to the literal summands and close by `exact Nat.lt_of_le_of_lt (Nat.mul_le_mul_right 2 (Nat.le_trans h1 (Nat.add_le_add hA hB))) hfin`, which discharges the goal through a lazy-congruence defeq bottoming out at `geTD ≡ geTDLit` (degree 12, milliseconds). No `unfold`, no `rw` of the construction. 2. Goal 3 (the Kronecker eval-identity) evaluated at `(2 : Int) ^ kB`, whose `Monoid.npow` is a non-accelerated linear chain recomputed at every use (~100s). The new `LnProof.int_two_pow` rewrites it to `((2 ^ kB : Nat) : Int)` so the base goes through the kernel's GMP-backed `Nat.pow` (~5s). Per-branch build drops from ~11 min to under 25s; the axiom footprint is unchanged ([propext, Classical.choice, Quot.sound]). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
7a2bfc7 to
12f832e
Compare
The mixed fixed-point staircase carried surplus precision: six coefficients sat one byte above the minimum needed to satisfy the floor specification. This re-quantizes the same optimal (4,5) weighted-minimax rational at the byte- minimal bases (each at the widest basis fitting its minimal PUSH width), riding the edge of the never-overshoot guarantee. p: Q68,Q80,Q86,Q93,Q94 -> Q68,Q72,Q78,Q85,Q94 q: Q96,Q87,Q85,Q93,Q94 -> Q96,Q71,Q77,Q85,Q94 Coefficient literals drop from 108 to 101 bytes (P3,P2,P1,Q3,Q2,Q1 shrink by a byte each; Q3 by two). The rational, the centering constant s, the bias and its margin, ln(2) and the ray rescale are unchanged. The weighted sup-norm error rises from 0.326 to 0.334 ulp -- still under the 0.3403-ulp margin -- so the result remains floor(L) or floor(L)-1, monotonic, with lnWadToRay(1e18)=0. Verified by an exact integer fixed-point model of the Horner/SDIV evaluation (sup error <= margin and strictly antitone quotient over the domain) and end-to- end against floor(1e27*ln(x/1e18)) at the domain extremes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The centering constant s = round(√2·2^W) is a PUSHed literal whose width tracks the mantissa basis W. W=103 made s a 104-bit (13-byte) literal; the never- overshoot floor only requires W ≥ 92 (the mantissa truncation is a downward-only error < 2^-W·10^27, and margin + minimax + truncations + m_trunc must stay below 1 ulp). W=95 is the widest basis whose s fits in 12 bytes, so s drops from 0xb504f333f9de6484597d89b375 to 0xb504f333f9de6484597d89b3 -- one byte. The mantissa truncation rises to 2^-95·10^27 ≈ 0.026 ulp (still far under the 0.3403 margin), lowering the exact-⌊L⌋ hit rate ~2% but never violating the floor spec. The extraction shift becomes 0xa0 (=256-96), k ∈ [-95, 159], and the bias is recomputed for the new s. Coefficients, rational, ln(2) and the margin are unchanged. Within an octave z = sdiv((s-m)·2^100, m+s) now steps by 8 to 16 per unit of m (|∂z/∂m| ∈ [8,16] vs <1 at Q103); monotonicity still holds because z stays strictly decreasing in m and the quotient is antitone in z, telescoping across the multi-unit step. Verified end-to-end against floor(1e27·ln(x/1e18)) and for monotonicity at the domain extremes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The within-octave monotonicity proof bounds the per-z-step descent of the quotient by Rₘᵢₙ - zₘₐₓ·2J, where the truncation jitter J grows with each Horner stage's renormalizing shift. Only p₁ (Q93→Q85) and q₃ (Q87→Q79) can drop a byte while keeping that step margin positive (0.296 quotient units); the other coefficients stay at bases where shortening them would drive the margin below zero. Monotonicity -- not the rounding budget -- is the binding constraint on these two stages. Verified by check_ln_monotone (step_margin 0.296, all 254 clz seams) and check_ln_counterexample (witness floors correctly). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Update the Lean proof to match src/vendor/Ln.sol after the Q95 mantissa and the p1@Q85 / q3@Q79 coefficient bases: the stage sandwich constants, cert exponents (PPc p1 at 2^271, QQc q3 at 2^113) and truncation slops, the binade-shift width (160 - clz), the centering constant Sc = round(√2·2^95), the bias, and the certificate cell cover are all regenerated for the new constants. The mantissa now steps z by 8-16 per unit of m; monotonicity telescopes that through the per-z-step antitone quotient unchanged. model_ln_wad_floor, model_ln_wad_to_wad_floor, model_ln_wad_mono and model_ln_wad_to_wad_mono build against the regenerated model and depend only on [propext, Classical.choice, Quot.sound]. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
evmSar_sandwich_89/_98/_105 and evmShr_eq_div_152 were the divisors for the old p3/q3/q2 stage shifts and the Q103 mantissa extraction; the Q95 build uses 97/90/113 and 160 instead, leaving these four unreferenced. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The constant term c0 and the q coefficients Q4, Q1 shipped values that were a few ulp off the weighted-minimax quantization their first-principles derivation yields (c0 +14, Q4 +2, Q1 +14 in the low byte). Move them to the exact minimax quantization and resync the floor proof: the Stages stage-bound literals and suffix polynomials for those coefficients, the regenerated certificate, the re-run cell cover, and FloorBracket's divided-difference combined coefficients. Bases, shifts, slops, the exponent staircase, and the monotonicity argument are unchanged. model_ln_wad_floor / _to_wad_floor / _mono / _to_wad_mono build and depend only on [propext, Classical.choice, Quot.sound]; check_ln_counterexample and check_ln_monotone pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The x=10^18 correction now detects the floored output (`add(iszero(not(r)), r)`, i.e. `r + (r == -1)`) instead of the input (`eq(x, 10^18)`); the two agree because the floored result is -1 exactly at x = 10^18. Regenerate the model and update the proof to match: - OctaveMono: `lnTail` folds in the self-correction `s + (s == -1)`; add `corr_eq`/`corr_toInt`/`corr_mono` (the map sends -1 to 0 and is order-preserving); `affine_tail_mono` bounds the bare `sar` and `tail_mono` composes it with `corr_mono`. - TopMono: `lnTail_bound`/`model_bound` drop the `one` parameter. - FloorModel: `model_ne_zero` (the corrected model is nonzero off 10^18) lets `model_floor_bracket` peel the correction (`r != -1` there). - FloorSpec: generalize the model word in the case analysis, and mark the model defs `local irreducible` so the linter does not whnf the (doubled) term. - Remove the now-unused `evmEq_zero`/`evmEq_le_one`/`evmEq_comm`. Update the Python mirrors (`check_ln_counterexample`, `check_ln_monotone`) to the same correction. Full `lake build` is clean (93 jobs); the four top theorems depend only on [propext, Classical.choice, Quot.sound]; both Python checks pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🛡️ Immunefi PR ReviewsWe noticed that your project isn't set up for automatic code reviews. If you'd like this PR reviewed by the Immunefi team, you can request it manually using the link below: Once submitted, we'll take care of assigning a reviewer and follow up here. |
e1Ru1o
reviewed
Jun 26, 2026
Member
There was a problem hiding this comment.
Consider also adding a diff test for the python implementation in check_ln_counterexample.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.