Skip to content

Add formally verified natural logarithm#585

Open
duncancmt wants to merge 94 commits into
dcmt/codex-prove-sqrt-cbrtfrom
dcmt/ln-atarpara
Open

Add formally verified natural logarithm#585
duncancmt wants to merge 94 commits into
dcmt/codex-prove-sqrt-cbrtfrom
dcmt/ln-atarpara

Conversation

@duncancmt

Copy link
Copy Markdown
Collaborator

No description provided.

duncancmt and others added 30 commits June 8, 2026 11:56
`Ln.lnWad` returns `floor(L)` or `floor(L) - 1`, never more, where
L = 10**18 * ln(x / 10**18) is the exact result; it reverts with
`LnWadUndefined()` for x <= 0.

The atanh substitution is centered at s = round(sqrt(2) * 2**96), so
|z| <= 3 - 2*sqrt(2) and a (3,3)-degree rational minimax with Q96
coefficients (certified sup-norm error 2.532e-19) suffices. ln(s) folds
into the constant term, which is biased down by a 3e22 margin: certified
upward errors total <= 2.63e22, and margin plus downward errors stay
below the final ulp of 2**78, so the Q78 accumulator always lies in
((L - 1) * 2**78, L * 2**78] and `sar(78)` lands on floor(L) or
floor(L) - 1.

Gas: ~304 per call, measured through an external wrapper net of call
overhead.

Verified on 718k+ structured, adversarial, and random inputs against
1500-bit references using an exact EVM-semantics mirror of the
implementation (formal/python/ln/check_ln_counterexample.py), with
pinned-vector, revert, and fuzz coverage in test/0.8.34/Ln.t.sol.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
`Ln.lnWad` takes a wad (10**18 basis) input and returns
`floor(L)` or `floor(L) - 1` for L = 10**27 * ln(x / 10**18), never
more; it reverts with `LnWadUndefined()` for x <= 0.

The ray ulp (1e-27 of ln) requires ~96 significant bits, so the
rational is a (4,5)-degree minimax of atanh(sqrt(u))/sqrt(u) fit under
the sqrt(u) weight the error carries into ln, with q monic and
p(0) = q(0) constrained so the constant-term literal is shared.
Certified weighted sup error of the integer-rounded rational:
0.325 ulp.

Every quantity lives at the fixed-point basis its consumer needs and
is rounded exactly once: the mantissa at Q103, z at Q100, u at Q96,
and the Horner stages on a basis staircase (p: Q68/Q72/Q78/Q85/Q94;
q: Q96/Q71/Q77/Q85/Q94) whose sar immediates absorb every basis
change. A coefficient followed by j more multiplies by u tolerates
~5.1 fewer fractional bits per unit of j; each literal takes the
widest basis that fits its minimal PUSH width, so quantization stays
negligible at the minimal code size. The quotient p*z/q is taken once
at Q100 and rescaled by exactly 5**27 onto the 10**27 * 2**72 output
grid shared by the k*ln(2) term and the bias, leaving the closing
`sar(72)` as the single output rounding.

Certified budget: upward error 0.343 ulp <= 1.7e21-unit margin
(0.360 ulp); margin plus downward errors 0.703 ulp < 1, so the
accumulator always lies in ((L-1)*2**72, L*2**72].

Gas: ~351 per call measured through an external wrapper net of call
overhead; wrapper code 437 bytes (the uniform-basis layout of the
same rational measures ~362 gas and 463 bytes).

Verified on 718k+ structured, adversarial, and random inputs against
1600-bit references using the exact EVM-semantics mirror in
formal/python/ln/check_ln_counterexample.py, with pinned-vector,
revert, and fuzz coverage in test/0.8.34/Ln.t.sol.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
`Ln.lnWad` now returns exactly 0 for x = 10**18 -- the one input whose
exact result is an integer -- via a branchless `eq` correction added to
the floored accumulator (the uncorrected accumulator deterministically
floors to -1 there; pinned by test, by the Python mirror, and by the
Lean theorem `model_ln_wad_one_wad`). `Ln.lnWadToWad` wraps `lnWad`
with floor division by 10**9 (truncating `sdiv` alone would mis-round
negative results), inheriting the floor-or-floor-minus-1 contract,
monotonicity, and the zero point at wad basis.

`lnWad` is monotone: x1 < x2 implies lnWad(x1) <= lnWad(x2). Ensuring
this required raising the Horner stage truncation bases (p: Q68/Q80/
Q86/Q93/Q94, q: Q96/Q87/Q85/Q93/Q94, +7 literal bytes, op count
unchanged): the within-octave step inequality needs the stage
truncation jitter bound z_max*2J < R_min, which these stage bases satisfy with the certificate margin.

Verification coverage:

- Proven in Lean (formal/ln/LnProof, kernel `decide`, zero axioms,
  against a model generated from the Yul IR by
  formal/python/ln/generate_ln_model.py): monotonicity across all 254
  clz seams for both functions, lnWad(10**18) = 0 and
  lnWadToWad(10**18) = 0, and order at the corrected point. The model
  generator strips the `LnWadUndefined()` revert guard (the memory
  model requires straight-line stores); all theorems quantify over the
  non-reverting domain.
- Covered by exact rational arithmetic in formal/python/ln/check_ln_monotone.py:
  the within-octave leg of the monotonicity argument, using integer
  interval propagation of the Horner stages, truncation-jitter bound
  with step margin 0.829, polynomial-derivative positivity by exact
  interval bisection, and mantissa-to-z antitonicity.
- Differentially tested: the Lean model evaluator (`ln-model`) against
  the deployed Solidity for 2x20k fuzz runs plus the full vector suite
  via FFI (test/0.8.25/formal-model/LnModel.t.sol); the Python mirror
  against 1600-bit references on 718k+ structured, adversarial, and
  random inputs; 2.4M adjacent-mantissa pairs and all seams scanned
  with no monotonicity violation.

`sdiv`, `sar`, `slt`, and `sgt` are added to the modeled builtins in
formal/python/evm_builtins.py with two's-complement Lean semantics.
.github/workflows/ln-formal.yml runs the certificates, regenerates the
model, builds the proofs, diffs the axiom report, and runs the FFI
fuzz on every change to the relevant files.

Gas: ~377 per `lnWad` call, ~376 per `lnWadToWad` call, measured
through external wrappers net of call overhead.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
The monotonicity argument is now machine-checked end to end against the
generated model: `model_ln_wad_mono` proves that for all 0 < x <= y <
2**255, the signed output words satisfy lnWad(x) <= lnWad(y), and
`model_ln_wad_to_wad_mono` carries the result through the wad helper's
floor division. Both depend only on the standard Lean axioms (propext,
Classical.choice, Quot.sound).

The proof chain:

- Poly.lean: dense integer polynomials, interval-Horner evaluation, and a
  fuel-bounded adaptive bisection checker whose soundness theorem turns a
  kernel-`decide`d `true` into nonnegativity over an integer interval.
- Bridge.lean/BridgeDiv.lean: two's-complement transport lemmas taking
  each modeled EVM opcode (add/sub/mul/shl/shr/sar/sdiv) to plain `Int`
  arithmetic under explicit range hypotheses.
- Stages.lean: each Horner stage of the p/q evaluation sandwiched against
  an exact integer polynomial (scales 2**358 and 2**386) with one-sided
  truncation slop.
- Certs.lean: the four analytic certificates (p-range, q-range, and the
  two step inequalities G1/G2 covering positive and nonpositive z) checked
  by the bisection checker via kernel `decide`.
- StepMono.lean: the within-octave unit step -- the truncated quotient
  pipeline X1 = sdiv(p(u)*z, q(u)) is antitone over consecutive integer
  z-values in [-Zc, Zc].
- ZOctave.lean: the mantissa-to-z map is antitone with unit steps, z stays
  in [-Zc, Zc], and the chained X1-vs-mantissa monotonicity and magnitude
  bound.
- OctaveMono.lean: the generated model re-expressed through the
  (eq, exponent, mantissa) triple, the ln2*k term bracketed by kernel
  evaluation over all 256 clz values, and the fixed-exponent affine tail
  (multiply by K, add ln2*k and the bias, sar 72) shown monotone.
- TopMono.lean: clz/log2 plumbing, extraction of the decided seam and
  10**18 theorems, the unit step over the full domain, the induction that
  chains it, and the floor-division corollary for lnWadToWad.

The CI axiom check now imports LnProof.TopMono and pins the axiom sets of
the two monotonicity theorems. Ln.sol's NatSpec and the Python certificate's
docstring now describe the Lean proof as the authority, with
check_ln_monotone.py as its executable exact-rational counterpart.

Verified: lake build (all 14 jobs, no warnings in the changed files); the CI
axiom-check diff matches; check_ln_counterexample.py and
check_ln_monotone.py pass; the Lean-model FFI fuzz suite passes 10/10;
test/0.8.34/Ln.t.sol passes 8/8; forge fmt --check clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
Set the one-sided bias margin to exactly 0.500 ulp by changing only
the BIAS constant (same 21-byte literal width; gas and code size
unchanged). The output contract is unchanged: the result is still
floor(L) or floor(L) - 1, never an overestimate, and lnWad(10**18) == 0
still lands via the branchless correction (the uncorrected accumulator at
10**18 is exactly -1).

The certified upward error of the approximation pipeline is 0.329 ulp and
the analytic legs of the floor specification need the gap between that
error and the margin on one side, and between the margin plus downward
errors and one whole ulp on the other. The 0.500 setting splits the available room evenly, leaving ~0.08 ulp of
assembly slack per side.

Re-validated: check_ln_counterexample.py and check_ln_monotone.py pass;
the Lean model regenerates and the full proof library rebuilds, including
the seam/10**18 kernel decides and the whole-domain monotonicity
theorems; test/0.8.34/Ln.t.sol passes with 4 vector expectations updated
to the exact outputs (their floor references are unchanged); the
Lean-model FFI fuzz suite passes 10/10; a 30k-sample high-precision
floor-band check shows exact/minus-one split 15188/14812 with no
violations.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
LnProof/ExpSum.lean arithmetizes comparisons against the exponential
function using only Nat arithmetic, as groundwork for proving the floor
specification of `lnWad` (r = floor(L) or floor(L) - 1) without any
external dependency: an upper bound on e^(p/q) is a bound on every
partial sum S_N = sum_{j<=N} (p/q)^j / j! (the `capUB` relation), and a
lower bound is a single witnessed partial sum (`capLB`), the same way the
cbrt proof states its spec through r^3 <= x < (r+1)^3.

Contents, all kernel-checkable and sorry-free over `Init`:

- `expNum N p q = sum (N!/j!) p^j q^(N-j)`, the integer-scaled partial
  sum, with a finite-sum library (`tsum`), a sum transpose, and the
  box-into-triangle / triangle-into-box index inclusions;
- Pascal-recursive binomial coefficients, `cho_fact`, and the binomial
  theorem, giving the two convolution coefficient identities;
- `prod_le_sum` and `sum_le_prod`: S_N(a) S_M(b) <= S_{N+M}(a+b) and
  S_K(a+b) <= S_K(a) S_K(b), the surrogates for e^(a+b) = e^a e^b;
- monotonicity in N and in the argument, and a geometric tail bound
  turning one evaluated partial sum into a bound on all of them
  (`tail_bound`, via a decreasing potential);
- the cap algebra: `capUB`/`capLB` with product, power, quotient-mover,
  argument-transport, and partial-sum-plus-tail introduction rules.

Axioms: propext and Quot.sound only.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Three more floor-specification layers, all sorry-free on standard axioms:

- LnProof/ShiftCert.lean: an exact fuel-based Taylor shift
  (`polyShift`, kernel-computable structural recursion) with its
  evaluation lemma, and `checkCover`, which certifies `0 ≤ P(x)` for
  every integer `x` in a range by walking caller-supplied cell widths and
  recentering the polynomial at each cell (the certificate polynomials
  have ~1e-28 relative slack, which plain interval bisection cannot see
  past their huge monomials). Also `expPolyNum`, the polynomial-level
  partial-sum numerator with its evaluation lemma down to `expNum`, and
  crude box bounds (`polyHi`, `polyDiffHi`) for the bracket layer.

- LnProof/FloorConsts.lean: kernel-checked two-sided exponential caps for
  the scaled `ln 2` and bias constants over the common denominator
  `10^27 2^99` (`cap2U/cap2L/capBU/capBL`, via `capUB_of_partial` so one
  evaluated partial sum carries the geometric tail), a lower cap for one
  output ulp (`capEL`), and the exact signed value of the `ln2 * k` word
  for every clz (`ln2k_exact`, 256-case kernel decide).

- LnProof/FloorModel.lean: `model_floor_bracket` — for `x ≠ 10^18` the
  model output word `r` satisfies `r 2^72 ≤ V < (r+1) 2^72` with
  `V = X1 Kc + ln2k(clz x) + BIASc` the exact pre-shift accumulator,
  assembled from the existing transports, sandwiches, and mantissa facts.

Also `capUB_weaken`/`capLB_weaken` target-transport rules in ExpSum.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…rackets

The bracket layer connecting the EVM pipeline to the floor-specification
exponential caps, built directly over the Stages coefficient lists:

- FloorCertDefs.lean: the bracket rationals for X1/2^99 as integer
  polynomials in the mantissa, homogenized from PPc/QQc with denominators
  8B^2 (at the wlo interval ends) and B^2 (at w*), stage truncation slops
  entering exactly as SLOPPc (B^2)^4 and SLOPQc (8B^2)^5, and the
  certificate polynomials (degree ~251, capUB_of_partial-shaped with the
  per-point tail term, eps = 0.42e-27 both sides).

- FloorCert{GeUp,GeLo,LtUp,LtLo,Aux}.lean: kernel-checked `checkCover`
  certificates; the four main inequalities need 14/13/11/13 recentred
  cells, the ten auxiliary sign/positivity/tail-side-condition
  certificates one cell each. Cell widths come from an offline walk and
  are re-verified in-kernel; the Python generator cross-validates every
  bracket rational against the EVM pipeline mirror on 300 samples.

- ShiftCert.lean: `homPoly`/`homEvalI` with the evaluation lemma and the
  collapse identity `homEvalI cs (u d) d = d^deg P(u)`, which ties the
  certificate polynomials to the Stages sandwiches with no coefficient
  rescaling.

- FloorBracket.lean: exact division brackets for the z word on both
  branches (`z_bracket_ge`/`z_bracket_lt`), and the two ordering lemmas
  (`wlo_lt_un`, `un_le_dsq`) that place the pipeline u-hat between the
  certificate interval ends.

Axioms: propext and Quot.sound only.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Groundwork for the divided-difference monotonicity of the bracket
polynomials: rfl-checked quartic/quintic unfoldings of `homEvalI` at the
`PPc`/`QQc` coefficient lists, nested-Horner distribution
(`quartic_expand`), the difference-of-powers factorings (`df2/df3/df4`),
and two-sided square monotonicity (`sq_le_of_abs_le`).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The four main certificate checks are split into ~3-cell `checkCover`
groups with an interval-glue lemma per certificate, so the kernel frees
each Taylor-shift cascade between declarations. Series depth reduced to K = 22 (tail 2.6e-33, far
below the 0.08-ulp pads; cell counts unchanged at 14/13/11/13), with the
factorial constants and powers in the certificate constructions updated
to match. The GeUp kernel check runs within ~5.5 GB.

Also `sq_nonneg'` and `mul_bound` (interval bound for products) in
FloorBracket, feeding the divided-difference monotonicity lemmas.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
homEvalI_PPc_anti and homEvalI_QQc_mono: the homogenized p-polynomial is
decreasing and the q-polynomial increasing in the point argument over the
box |n| ≤ Uc D. The divided difference of each column is factored
(df2/df3/df4 and an asymmetric quintic factorization df5), every
higher-order cofactor is crudely bounded through the box radius
(squares via sq_le_of_abs_le, the quartic cofactor via an interval
product bound, the quintic via an asymmetric split with a nonnegative
y^4 remainder), and the linear coefficient dominates: the certificate
constants

  PPc1 + 2 PPc2 Uc + 4 PPc4 Uc^3 < 0
  QQc1 - 2|QQc2| Uc - 4|QQc4| Uc^3 - 4 Uc^4 > 0

are kernel-checked. These are the analytic facts that let the bracket
cross-theorems compare pipeline stage values at the integer u-hat against
the certificate polynomials' rational interval ends.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
At m = 2^104 - 1 the homogenized w* argument 2^96 A^2 slightly exceeds
Uc B^2 (the u-hat domain bound does not apply to the exact w* point), so
the monotonicity boxes now use the round bound 2.333e27, strictly above
the supremum on both branches; all derived certificate constants are
recomputed and the dominance decides still hold with their ~25x margins.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… rational

bracket_ge_up: on the m ≥ S branch the pipeline quotient satisfies
X1 · TD(m) ≤ TN(m) · 2^99 against the certificate bracket rationals. The
chain runs: exact z and u-hat division brackets; the Stages sandwiches at
u-hat (made opaque before every omega); the X1 sdiv bracket via
evmSdiv_neg_neg; the wlo/u-hat/w* orderings; homEvalI collapse plus the
divided-difference monotonicity lemmas comparing the stage values at
u-hat against the certificate interval ends; and a multiplied chain that
closes with exact equality at A · PHV · 2^486 · (B²)^5 before cancelling
the positive multiplier.

Also: certGeWS/certLtWS auxiliary certificates (2^96 A² ≤ UB B², one
recentred cell each, kernel-checked) feeding the w*-side box bound, the
certificate-side evaluation lemmas (evalTN_ge/evalTD_ge/...), and
pow_pos/pow_nonneg helpers.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ional

On the m >= S branch, TN2b(m) * 2^99 <= X1 * TD2b(m) for Sc+46 <= m < MHI.
Mirrors bracket_ge_up with the lower division brackets (Nat.div_add_mod
remainder side), the swapped divided-difference argument roles (P decreasing
evaluated at the w* point, Q increasing at the wlo point), and the exact
scale closure 2^442 * (8B^2)^5 = 2^457 * (B^2)^5.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Adds the lt-branch evaluation lemmas (A := S - m mirrors of the ge set)
and bracket_lt_up: (-X1) * TD(m) <= TN(m) * 2^99 for MLO <= m <= Sc - 46.
The branch uses z_bracket_lt (z = +q), the evmSdiv_pos_neg transport
(nonnegative numerator over negative denominator), and the same
divided-difference and collapse machinery as the ge side.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
TN2b(m) * 2^99 <= (-X1) * TD2b(m) for MLO <= m <= Sc - 46, completing the
four cross-theorems that sandwich the pipeline X1 between the certificate
rationals on both branches. All four check on propext + Classical.choice +
Quot.sound only.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
certGeH/certLtH now certify 2*TN <= 24*TD, matching the (K+2)*q bound
that capUB_of_partial requires at K = 22. The one-cell kernel checks
pass unchanged.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Converts kernel-checked certificate nonnegativity into exponential caps at
the pipeline value. Per branch: evaluation decompositions of the certificate
polynomials, Int-to-Nat bridges at K = 22 (capUB via the tail bound with
fact 23, capLB via the direct witness at fact 22), caps at the certificate
rationals, X1 sign lemmas from the lower brackets, and capUB/capLB at
(|X1| * 10^27) / (10^27 * 2^99) via the cross-theorems and capUB_arg/
capLB_arg. The main certificate facts enter as hypotheses so this file
builds independently of the heavy cert decides.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
capLB_cancel mirrors capUB_cancel for moving a factor across the
quotient on the lower side. The four budget families (upper/lower,
binade shift k >= 0 over [0,151] and k < 0 over [1,103]) are certified
by kernel evaluation over the whole range; they encode the weakening
step from the multiplied caps to the x/10^18 targets, closed by the
bias margin with ~7.7e-29 to spare.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
v_scale_pos/v_scale_neg split the pre-shift accumulator times 2^27 into
the three cap exponents over QS = 10^27 * 2^99, with the ln2 term on the
matching side of the equation for each clz regime.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
synthDivM/polyShiftAuxM/checkCoverM bind recursive results through a
match so the kernel computes each step once; checkCoverM_sound transfers
soundness. The four branch files verify the certificate constructions
against generated literal coefficient lists (FloorCertLit.lean,
gitignored; regenerate with python3 formal/python/ln/gen_cert_literals.py
-- the generator mirrors the Lean polynomial operations exactly,
including homPoly's trailing zeros) and run the cell walks on the
literals. The literal bridge keeps the checked certificate files tied to the
generated coefficient lists without trusting the generator.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
evalPoly_ext: two polynomials whose l1 norms are below 2^(B-1) agree
everywhere once they agree at the single point 2^B (uniqueness of the
balanced radix-2^B expansion, by induction with the bounded-multiple
helper). polyL1 bounds propagate symbolically through polyAdd, polyScale,
polyMulX, polyNeg, polyMul, polyPow, and expPolyNum (whose l1 recurrence
is exactly LnExp.expNum). This lets the certificate-vs-literal
equalities be proven by one closed integer comparison at 2^B plus a
cheap symbolic bound, instead of evaluating the construction through
the kernel coefficient by coefficient.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The cell checker (checkCoverK) computes each cell's shifted polynomial
inside the decide as untrusted packed arithmetic - divide-and-conquer
splitting with a precomputed square ladder for (x+a)^m, sign-split
digit strings at B = 38000 (sized from the measured worst-case 37768
bits across all 57 cells), and bit-op digit extraction (shiftRight/land,
which the kernel accelerates; generic big division dominates that path).
Correctness needs no lemmas about the packed code: each witness is
certified through evalPoly_ext by one evaluation identity at 2^B plus
polyL1 bounds, with polyL1_polyShift <= aeval proven by induction over
the synthetic-division remainder cascade.

Each cell lives in its own module so lake builds them in parallel:
one cell decides in ~7-10 s, and the full four-branch cold build is
~18 min wall / 44 min CPU on four cores, with branch eval_eq equalities
at ~1 s and the remaining per-branch cost in the sub-literal equality
decides.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
up_ge_pos multiplies the X1 cap, the binade-shift power of the ln2 cap,
and the bias cap; moves the exponent down to r * 2^99 through the floor
bracket and v_scale_pos; and weakens to the x/10^18 target through
budgetU_le and the mantissa window.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
budgetL_fold moves the worst-case lower budget from m = 2^103 to any
mantissa via the cross-multiplication (m+1)*2^103 <= m*(2^103+1);
lo_ge_pos multiplies the four lower caps (X1, ln2 power, bias, one
output ulp), raises the exponent to (r+2)*2^99 through the strict side
of the floor bracket, and weakens to the strict x target carrying one
part in 10^30 of slack.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
up_ge_neg moves the ln2 power across the quotient with capUB_cancel
(exponent split via truncated subtraction justified by V >= 0), brings
the exponent down to r * 2^99, and weakens through budgetUn_le using the
exact mantissa relation m = x * 2^(c-152).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
lo_ge_neg mirrors up_ge_neg with capLB_cancel; the V >= 0 side condition
for the exponent split comes from the lower floor bracket, which the
caller has from model_floor_bracket.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
up_lt_pos moves the X1 magnitude across the quotient with capUB_cancel
against the lower lt-branch cap, mirroring the structure of the ge-side
masters.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
duncancmt and others added 14 commits June 14, 2026 20:50
Refine both certificate granularities to push the margin to the hard
analytic floor (the rational's ±0.34-ulp worst-case error, below which the
implementation would overshoot):

- Certificate tolerance EUD: 10^29 -> 10^31, EUN=3401 (ε=0.3401 ulp), just
  above the 0.340-ulp cover floor where the cert touches zero.
- Cap denominator: 10^30 -> 10^31, so the margin resolves in 1e-4-ulp steps;
  X_BU=3402, X_BL=3404 give M=0.3403 ulp.

lnWadToRay now lands on exactly floor(L) for ~66% of inputs (the analytic
ceiling), up from ~64% at the prior M=0.362. Cell count stays flat (59).

Scale-shift across FloorBudget/FloorConsts/FloorAssembly/FloorCaps/
FloorWindow/FloorSpec/FloorCertDefs (10^29->10^31, 10^30->10^31, packed
exponents 10^77->10^80 and 10^137->10^142, capEL 10^30+999->10^31+9990,
strictness 10^30-1->10^31-10), bias in Stages/FloorModel/FloorSpec/
OctaveMono, Ln.sol _BIAS + NatSpec, and 59 regenerated cell modules.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The generator imports check_ln_counterexample as a package module, so it
prepends the repo root to sys.path; derive that from __file__ (parents[3])
like the other model generators rather than a fixed absolute path, so
`python3 formal/python/ln/gen_cert_literals.py` runs from any checkout (CI
was failing with ModuleNotFoundError: No module named 'formal').

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
- Bridge: drop the unused ipow358/ipow386 helpers (dead code, and the only
  literals with exponent >256 in the file).
- FloorBracket: raise exponentiation.threshold to 512 so the bracket
  polynomials' opaque 2^k factors (up to 2^486) are evaluated during
  elaboration rather than left symbolic with a warning; the proofs treat
  them as opaque integers, so evaluation does not change them.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Each `*_eval_eq` proof spent ~570s (the bulk of an ~11-min/branch build that
timed out CI) in kernel type-checking, despite all three `decide +kernel`
calls together taking under 6s. Profiling isolated two algorithmic costs:

1. Goal 1 (the `polyL1 cert * 2 < 2^kB` bound) opened with
   `unfold certX; rw [..._eq_lit]`. The `unfold` tactic forces the kernel to
   reduce the degree-276 construction as a coefficient list (~565s). Replace it
   with the ℓ1 homomorphism `have`-chain applied to the literal summands and
   close by `exact Nat.lt_of_le_of_lt (Nat.mul_le_mul_right 2
   (Nat.le_trans h1 (Nat.add_le_add hA hB))) hfin`, which discharges the goal
   through a lazy-congruence defeq bottoming out at `geTD ≡ geTDLit` (degree 12,
   milliseconds). No `unfold`, no `rw` of the construction.

2. Goal 3 (the Kronecker eval-identity) evaluated at `(2 : Int) ^ kB`, whose
   `Monoid.npow` is a non-accelerated linear chain recomputed at every use
   (~100s). The new `LnProof.int_two_pow` rewrites it to `((2 ^ kB : Nat) : Int)`
   so the base goes through the kernel's GMP-backed `Nat.pow` (~5s).

Per-branch build drops from ~11 min to under 25s; the axiom footprint is
unchanged ([propext, Classical.choice, Quot.sound]).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
Co-Authored-By: OpenAI Codex <codex@openai.com>
duncancmt and others added 10 commits June 15, 2026 12:14
The mixed fixed-point staircase carried surplus precision: six coefficients
sat one byte above the minimum needed to satisfy the floor specification. This
re-quantizes the same optimal (4,5) weighted-minimax rational at the byte-
minimal bases (each at the widest basis fitting its minimal PUSH width), riding
the edge of the never-overshoot guarantee.

  p: Q68,Q80,Q86,Q93,Q94 -> Q68,Q72,Q78,Q85,Q94
  q: Q96,Q87,Q85,Q93,Q94 -> Q96,Q71,Q77,Q85,Q94

Coefficient literals drop from 108 to 101 bytes (P3,P2,P1,Q3,Q2,Q1 shrink by a
byte each; Q3 by two). The rational, the centering constant s, the bias and its
margin, ln(2) and the ray rescale are unchanged. The weighted sup-norm error
rises from 0.326 to 0.334 ulp -- still under the 0.3403-ulp margin -- so the
result remains floor(L) or floor(L)-1, monotonic, with lnWadToRay(1e18)=0.

Verified by an exact integer fixed-point model of the Horner/SDIV evaluation
(sup error <= margin and strictly antitone quotient over the domain) and end-to-
end against floor(1e27*ln(x/1e18)) at the domain extremes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The centering constant s = round(√2·2^W) is a PUSHed literal whose width tracks
the mantissa basis W. W=103 made s a 104-bit (13-byte) literal; the never-
overshoot floor only requires W ≥ 92 (the mantissa truncation is a downward-only
error < 2^-W·10^27, and margin + minimax + truncations + m_trunc must stay below
1 ulp). W=95 is the widest basis whose s fits in 12 bytes, so s drops from
0xb504f333f9de6484597d89b375 to 0xb504f333f9de6484597d89b3 -- one byte.

The mantissa truncation rises to 2^-95·10^27 ≈ 0.026 ulp (still far under the
0.3403 margin), lowering the exact-⌊L⌋ hit rate ~2% but never violating the
floor spec. The extraction shift becomes 0xa0 (=256-96), k ∈ [-95, 159], and the
bias is recomputed for the new s. Coefficients, rational, ln(2) and the margin
are unchanged.

Within an octave z = sdiv((s-m)·2^100, m+s) now steps by 8 to 16 per unit of m
(|∂z/∂m| ∈ [8,16] vs <1 at Q103); monotonicity still holds because z stays
strictly decreasing in m and the quotient is antitone in z, telescoping across
the multi-unit step. Verified end-to-end against floor(1e27·ln(x/1e18)) and for
monotonicity at the domain extremes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The within-octave monotonicity proof bounds the per-z-step descent of the
quotient by Rₘᵢₙ - zₘₐₓ·2J, where the truncation jitter J grows with each
Horner stage's renormalizing shift. Only p₁ (Q93→Q85) and q₃ (Q87→Q79) can
drop a byte while keeping that step margin positive (0.296 quotient units);
the other coefficients stay at bases where shortening them would drive the
margin below zero. Monotonicity -- not the rounding budget -- is the binding
constraint on these two stages.

Verified by check_ln_monotone (step_margin 0.296, all 254 clz seams) and
check_ln_counterexample (witness floors correctly).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Update the Lean proof to match src/vendor/Ln.sol after the Q95 mantissa and
the p1@Q85 / q3@Q79 coefficient bases: the stage sandwich constants, cert
exponents (PPc p1 at 2^271, QQc q3 at 2^113) and truncation slops, the
binade-shift width (160 - clz), the centering constant Sc = round(√2·2^95),
the bias, and the certificate cell cover are all regenerated for the new
constants. The mantissa now steps z by 8-16 per unit of m; monotonicity
telescopes that through the per-z-step antitone quotient unchanged.

model_ln_wad_floor, model_ln_wad_to_wad_floor, model_ln_wad_mono and
model_ln_wad_to_wad_mono build against the regenerated model and depend only
on [propext, Classical.choice, Quot.sound].

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
evmSar_sandwich_89/_98/_105 and evmShr_eq_div_152 were the divisors for the
old p3/q3/q2 stage shifts and the Q103 mantissa extraction; the Q95 build uses
97/90/113 and 160 instead, leaving these four unreferenced.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The constant term c0 and the q coefficients Q4, Q1 shipped values that were a
few ulp off the weighted-minimax quantization their first-principles derivation
yields (c0 +14, Q4 +2, Q1 +14 in the low byte). Move them to the exact minimax
quantization and resync the floor proof: the Stages stage-bound literals and
suffix polynomials for those coefficients, the regenerated certificate, the
re-run cell cover, and FloorBracket's divided-difference combined coefficients.

Bases, shifts, slops, the exponent staircase, and the monotonicity argument are
unchanged. model_ln_wad_floor / _to_wad_floor / _mono / _to_wad_mono build and
depend only on [propext, Classical.choice, Quot.sound]; check_ln_counterexample
and check_ln_monotone pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The x=10^18 correction now detects the floored output
(`add(iszero(not(r)), r)`, i.e. `r + (r == -1)`) instead of the input
(`eq(x, 10^18)`); the two agree because the floored result is -1 exactly at
x = 10^18. Regenerate the model and update the proof to match:

- OctaveMono: `lnTail` folds in the self-correction `s + (s == -1)`; add
  `corr_eq`/`corr_toInt`/`corr_mono` (the map sends -1 to 0 and is
  order-preserving); `affine_tail_mono` bounds the bare `sar` and `tail_mono`
  composes it with `corr_mono`.
- TopMono: `lnTail_bound`/`model_bound` drop the `one` parameter.
- FloorModel: `model_ne_zero` (the corrected model is nonzero off 10^18) lets
  `model_floor_bracket` peel the correction (`r != -1` there).
- FloorSpec: generalize the model word in the case analysis, and mark the model
  defs `local irreducible` so the linter does not whnf the (doubled) term.
- Remove the now-unused `evmEq_zero`/`evmEq_le_one`/`evmEq_comm`.

Update the Python mirrors (`check_ln_counterexample`, `check_ln_monotone`) to the
same correction. Full `lake build` is clean (93 jobs); the four top theorems
depend only on [propext, Classical.choice, Quot.sound]; both Python checks pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@duncancmt duncancmt changed the title WIP: add formally verified natural logarithm Add formally verified natural logarithm Jun 16, 2026
@duncancmt duncancmt marked this pull request as ready for review June 16, 2026 10:49
@duncancmt duncancmt requested a review from dekz as a code owner June 16, 2026 10:49
@immunefi-magnus

Copy link
Copy Markdown

🛡️ Immunefi PR Reviews

We noticed that your project isn't set up for automatic code reviews. If you'd like this PR reviewed by the Immunefi team, you can request it manually using the link below:

🔗 Send this PR in for review

Once submitted, we'll take care of assigning a reviewer and follow up here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider also adding a diff test for the python implementation in check_ln_counterexample.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants