Transcendental functions for Saloon#62
Conversation
Replace the naive Taylor exp in the @rs and @rd math doors with a Cody-Waite range reduction (x = k*ln2 + r) plus a minimax polynomial (degree 6 for @rs, 11 for @rd), faithful to <=1 ULP. Internals are forced to round-nearest-even -- a correctly-rounded transcendental has no rounding-mode axis, and this is what the forthcoming SoftFloat C jet will match bit-for-bit (the goal of PR #18). +scale2 is a correctly-rounded ldexp (2^k reconstruction) that is exact across the normal range and rounds exactly once into the overflow and subnormal tails. Full range guards: NaN, +-inf, overflow->inf, underflow->0, plus correctly-rounded subnormal results. The reference/oracle is libmath/tools/cheb_check.py: a strict-f64/f32 Python implementation of the identical algorithm, bit-identical to SoftFloat for the primitives used (+ - * /, round-to-int, ldexp), with mpmath as independent truth for coefficient design and the ULP bound. Verified bit-exact on both Vere variants -- vere64 (~tex) and vere32 (~hex) -- across the normal range and every tail; @rd shows no cross-variant divergence. New suite: tests/lib/math-exp (34 cases). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the naive (unreduced) Taylor log in the @rs and @rd math doors with x = 2^e * m reduction (m in [sqrt(1/2), sqrt(2))) plus the atanh series in the fdlibm-style form log(1+f) = f - s*(f - 2z*P2(z)), which keeps f as an exact leading term. Degree 4 (@rs) / 9 (@rd), faithful to <=1 ULP. Round-nearest-even internally to match the future SoftFloat jet bit-for-bit (PR #18). Adds the guards the old log lacked entirely: NaN, +inf, log(+-0)->-inf, log(x<0)->NaN, and subnormal inputs (normalised by 2^p before the bit-extraction reduction). Reference/oracle extended in libmath/tools/cheb_check.py (log @rd/@rs). Verified bit-exact on both Vere variants -- vere64 (~tex) and vere32 (~hex) -- over the normal range, both tails, and subnormal inputs. New suite: tests/lib/math-log (26 cases). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the naive mod-tau Taylor sin/cos in the @rs and @rd doors with a q*(pi/2) argument reduction + fdlibm __kernel_sin/__kernel_cos, selected by q&3 with sign symmetry. The reduced argument is carried as a two-part (rhi+rlo) value so the kernel's y-correction holds accuracy near zeros; @rd uses a 2-part pi/2, @rs a 3-part pi/2 (f32 needs the extra bits). Faithful to <=1 ULP for |x| <~ 500 (@rs) / 2^22 (@rd); guards NaN, +-inf -> NaN, and +-0 -> +-0 (sign preserved). sin/cos are thin wrappers over a shared per-door +rs-trig / +rd-trig engine; +tan keeps deriving as sin/cos and is now accurate via the new kernels. Round-nearest-even internally to match the future SoftFloat jet. Oracle extended in cheb_check.py (sin/cos @rd/@rs). Verified bit-exact on vere64 (~tex) and vere32 (~hex). New suite: tests/lib/math-trig (33). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
sqrt: - @rs delegates to the stdlib (SoftFloat) f32 root, which is correctly rounded. - @rd: the stdlib f64 root is only faithful (up to 1 ULP off, e.g. sqrt 2 -> ...bcc not ...bcd), so seed with it and apply one Markstein FMA correction (r = x - g*g via fma; g + (0.5/g)*r). Verified correctly-rounded over 600k inputs incl. +-1-ULP seeds -- so it matches the SoftFloat sqrt jet bit-for-bit (correct rounding is unique). Full guards: NaN, +inf, +-0, x<0 -> NaN. (Old code was a Newton iteration that stopped at the door rtol ~ 1e-5.) cbrt: was `pow x .0.3333` and asserted x>=0 (crashed on negatives). Now sign(x) * exp(log|x| / 3) using the accurate exp/log, defined for all reals, with NaN / +-0 guards. log-2 (= log/ln2), log-10 (= log/ln10), pow (= exp(n*log x)), pow-n all already derived from exp/log, so they improved automatically; add tests/lib/math-derived to lock their (deterministic, jet-reproducible) outputs. New suites: math-sqrt (18), math-derived (10). Bit-exact on vere64 (~tex) and vere32 (~hex); eig/complex regressions green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the slow AGM-iteration atan in the @rs and @rd doors with the fdlibm s_atan algorithm: reduce |x| against the breakpoints 7/16, 11/16, 19/16, 39/16 to a small argument near atan(.5)/atan(1)/atan(1.5)/ atan(inf), then a minimax polynomial (degree 4 @rs / 10 @rd) with odd symmetry. Faithful to <=1 ULP; guards NaN, +-inf -> +-pi/2, +-0 -> +-0. No huge-x threshold needed -- the id=3 (xr = -1/x) path handles arbitrarily large x (-> +-pi/2) on its own. atan is a thin wrapper over a shared per-door +rs-atan / +rd-atan |% engine (coeffs + reduction + kernel). atan2 already derives from atan via quadrant logic, so it improved automatically. Round-nearest-even internally to match the future SoftFloat jet. Oracle extended in cheb_check.py (atan @rd/@rs). Verified bit-exact on vere64 (~tex) and vere32 (~hex). New suite: tests/lib/math-atan (22). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add asin/acos to libmath/tools/cheb_check.py, all faithful to <=1 ULP over [-1,1]: asin @rd 0.84, acos @rd 0.89, asin @rs 0.85, acos @rs 0.98. fdlibm rational P/Q kernel (shared by asin/acos); the [0.5,1) range uses a sqrt head/tail split to recover low bits (needed at @rs too, not just @rd, to stay under 1 ULP). Small-x acos shortcut uses a 2^-26/2^-57 threshold (a too-large threshold returns pi/2 without subtracting x). Hoon swap into math.hoon is the next step. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the atan(x/sqrt(1-x^2)) asin/acos (inaccurate near +-1, and crashed out of domain) with the fdlibm rational P/Q kernel shared by both: |x|<0.5 uses x + x*R(x^2); [0.5,1) reduces via t=(1-|x|)/2, s=sqrt(t), with a sqrt head/tail split to recover low bits (needed at @rs too). acos recombines per-branch (small-x, x<=-0.5, x>=0.5). Faithful to <=1 ULP; |x|>1 -> NaN (no crash). The @rd kernel uses the door's now-correctly-rounded +sqt; @rs uses the SoftFloat f32 root. asin/acos are thin wrappers over a shared per-door +rs-ainv / +rd-ainv |% engine. Round-nearest-even internally to match the future jet. Oracle in cheb_check.py. Bit-exact on vere64 (~tex) and vere32 (~hex). New suite: tests/lib/math-ainv (26). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The constants were already correctly-rounded f64/f32 ln2/ln10, but log(x)/ln divides the whole result (losing the exactness of the integer exponent): log2 was 1.68 ULP, log10 2.39. Reuse the log reduction via a new per-door +lr helper returning [e (as float), log(mantissa)], then combine as e + log(m)/ln2 (log2) and e*log10(2) + log(m)/ln10 (log10), so the integer part carries no division rounding. Worst case drops to 1.36 (log2) / 1.61 (log10) ULP; log2(8) and log10(1000) at @rd are now exact. Guards: NaN, +inf, log(±0)=-inf, log(x<0)=NaN. Bit-exact on both variants; math-derived locks updated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@rd tan now uses fdlibm __kernel_tan over the q*pi/2 reduction (odd q takes the -cot path with a head/tail 1/(x+r)): 0.76 ULP, vs ~1.9 for the sin/cos ratio. Thin wrapper over a new +rd-tan |% engine (redq + ktan). Guards: NaN, +-inf -> NaN, +-0 -> +-0. @rs tan stays as sin/cos (~1.2 ULP): a pure-f32 tan kernel is far worse near the poles (~9 ULP, since the cot reciprocal needs more than f32) -- FreeBSD's k_tanf computes in double for the same reason. Oracle extended in cheb_check.py (tan @rd/@rs). Bit-exact on vere64 (~tex) and vere32 (~hex). New suite: tests/lib/math-tan (11). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
f16's 10-bit significand is ~2^13 coarser than f32, so computing a transcendental in the (faithful) @rs door and rounding the result to f16 is correctly-rounded for f16 -- any <=few-ULP(f32) error is far below an f16 ULP. This reuses every @rs kernel instead of re-deriving 15 f16 algorithms. Adds two @rh-door helpers: +widen-hs (f16->f32, exact; subnormals via m*2^-24) and +narrow-sh (f32->f16, correctly-rounded round-nearest-even, with carry, overflow->inf, and subnormal handling). stdlib bit:rh truncates rather than rounds, so it cannot be used. Each @rh transcendental (exp, log, log-2, log-10, sin, cos, tan, asin, acos, atan, atan2, pow, pow-n, sqt, cbt) is now `narrow-sh (rs-fn (widen-hs x))`. Conversions verified against numpy float16 (28 cases); results verified correctly-rounded f16 vs numpy/mpmath (math-rh, 20 cases). Bit-exact on vere64 (~tex) and vere32 (~hex). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add the @rq foundation to cheb_check.py: numpy has no true IEEE quad (its float128 is x87 80-bit ext), so model f128 with mpmath rounded to 113-bit RNE per op (= SoftFloat f128 for + - * / ldexp), plus 128-bit IEEE-quad encoding (qhex). exp @rq vertical slice: Cody-Waite + degree -24 minimax, 1.05 ULP over [-20,20] (the f128 analog of @rd's 0.99 -- Horner evaluation accumulation, essentially faithful). @rq SoftFloat ops confirmed jetted on-ship. The native rq Hoon (all functions, 128-bit constants) is the next step; rq must be native since f128 is the highest precision available (no higher door to round from, unlike @rh which routes through @rs). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per the plan to base the f128 work on the arithmetic the jets actually use, add libmath/tools/rq_check.c: the @rq transcendentals implemented in Berkeley SoftFloat (the f128M pointer API vendored in vere/ext/softblas) with MPFR as the correctly-rounded truth. This C code IS the jet body (modulo the u3 ABI), and its output is the algorithm-of-record the Hoon @rq must match bit-for-bit. exp @rq vertical slice: Cody-Waite + degree-24 minimax, 0.998 ULP over [-20,20] vs MPFR; output matches the mpmath f128 model (cheb_check.py exp-rq) bit-for-bit, cross-validating both. numpy has no true IEEE quad (its float128 is x87 80-bit ext), so SoftFloat + MPFR is the only sound f128 reference. Build recipe in the file header. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
rq_check.c now covers exp (0.998 ULP) and log (0.813 ULP) over the test range, both in SoftFloat f128 (jet basis) verified vs MPFR truth. log uses the same 2^e*m + atanh (fdlibm f - s*(f-R)) reduction as @rd at degree 22. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
rq_check.c now covers exp 0.998, log 0.813, sin 0.732, cos 0.709, sqrt 0.500 ULP -- all in SoftFloat f128 (jet basis), verified vs MPFR truth. sin/cos use the @rd q*pi/2 reduction + fdlibm kernels (two-part reduced arg) at degree ~16; sqrt is SoftFloat f128_sqrt directly (correctly rounded). Remaining for the C reference: tan, atan, asin, acos. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the naive Taylor exp in the @rq door with the faithful native f128 algorithm (Cody-Waite reduction + degree-24 minimax, bias 16383, 112-bit mantissa, S=112 subnormal scale), ported from @rd. Round -nearest-even internally; bit-exact against the SoftFloat-f128 reference tools/rq_check.c (the jet basis) on vere64 (~tex) and vere32 (~hex). New suite tests/lib/math-rq (9 exp cases incl. NaN/+-inf/+-0). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Port the @rd log to @rq: 2^e*m reduction + atanh series (fdlibm f - s*(f-R)) at degree 22, f128 (bias 16383, 112-bit mantissa, subnormal pre-scale 2^120). Guards: NaN, +inf, log(+-0)=-inf, log(x<0)=NaN. Bit-exact vs tools/rq_check.c on both variants. math-rq +4 log cases. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
sqrt: stdlib f128 seed + one Markstein FMA -> correctly-rounded (matches SoftFloat f128_sqrt). sin/cos: q*pi/2 reduction + fdlibm kernels over a shared +rq-trig engine (two-part reduced arg), ported from @rd at f128. All bit-exact vs tools/rq_check.c on vere64 (~tex) and vere32 (~hex). math-rq now 21 cases (exp/log/sin/cos/sqrt). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Port the @rd derived transcendentals to @rq (f128): +lr (log reduction -> [e, log m]), log-2 = e + log(m)/ln2, log-10 = e*log10(2) + log(m)/ln10, pow = exp(n*log x) (integer fast path via pow-n), pow-n = repeated mul, cbrt = sign*exp(log|x|/3). All bit-exact on vere32 (~hex) against the f128 model; math-rq now 30 cases. NOTE: validated on vere32 only -- vere64's f128 SUBTRACTION jet is broken (mis-dispatches to a 256-bit complex @cq path: 1.0-0.877 -> [0.875|0.125], 254-bit), so any @rq fn using sub diverges there until that runtime jet is fixed. exp/sqrt (no cancellation-sensitive sub) and pow-n match on both. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
atan: fdlibm breakpoint reduction + degree-30 minimax (f128), 0.725 ULP; thin wrapper over a +rq-atan engine, ported from @rd. atan2 derives via quadrant logic with f128 pi. Bit-exact on vere32; math-rq +6 atan cases. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
asin/acos: fdlibm rational-form kernel (degree-30 poly R, shared) + sqrt head/tail split on [0.5,1); f128. asin 0.959, acos 0.820 ULP. Replaces the old via-atan asin/acos (which crashed out of domain and hit ~10 ULP for acos via pi/2-asin). Over a shared +rq-ainv engine, ported from @rd. This completes the @rq transcendental door: exp, log, log-2, log-10, sin, cos, tan (sin/cos ratio ~2 ULP, as @rs), asin, acos, atan, atan2, sqrt, cbrt, pow, pow-n -- all bit-exact on vere32 against the SoftFloat f128 reference (math-rq, 45 cases). The full @rs/@rd/@rh/@rq transcendental surface is now faithful in Hoon. NOTE: vere64 still diverges on @rq fns using f128 sub (its sub jet mis-dispatches to a 256-bit complex path) -- a runtime bug to fix next. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
vere64
|
| query | old binary | new binary |
|---|---|---|
(met 0 (sub:rq .~~~1.0 .~~~0.877)) |
254 | 126 |
| value | [0.877 | 0.123] (double-width) |
.~~~0.123… ✓ |
add/mul were always 126 bits; sub now matches. Every @rq function that routes through sub (e.g. cos) is therefore correct on the 64-bit build.
Runtime commits (urbit/vere):
urbit/vere#1022(neal/lagoon-softblas-64→ml/64): the fix —0ab2cd99ac.urbit/vere#1021(neal/lagoon-jet-fixes→develop): parity normalization introducing thenconstant on the 32-bit jet —22b09f21a0.
Tracking branch for this jet work: sigilante/transcendental-jets.
Implements the plan from #18 (which this supersedes): replace the naive
Taylor/AGM transcendentals in
/lib/mathwith range-reduced minimaxalgorithms that are faithful (≤1 ULP) and bit-exact across both Vere
variants — vere64 (
~tex) and vere32 (~hex) — so they can be jettedbit-for-bit against SoftFloat.
All internals are forced to round-nearest-even (a correctly-rounded
transcendental has no rounding-mode axis; the future SoftFloat C jet will
match this). Reference/oracle:
libmath/tools/cheb_check.py— a strictf64/f32 implementation of each algorithm (bit-identical to SoftFloat for the
primitives used) with mpmath as independent truth for the ULP bound.
Done (both @rs and @rd, swapped into math.hoon, tests green on both variants)
overflow→inf, underflow→0, correctly-rounded subnormals).
2^e·mreduction + atanh series (fdlibmf - s·(f-R)form);guards incl.
log(±0)=-inf,log(x<0)=NaN, subnormal inputs.q·π/2reduction + fdlibm kernels with two-part reducedargument; @rd 2-part π/2, @rs 3-part π/2.
Markstein FMA correction → correctly-rounded (matches the jet).
sign·exp(log|x|/3), defined for all reals (waspow x .333and crashed on negatives).
Test suites:
math-exp(34),math-log(26),math-trig(33),math-sqrt(18),
math-derived(10),math-atan(22).Remaining (in progress)
cheb_check.py(all four ≤1 ULP);Hoon swap next.
rh/rqdoors still naive (Phase 2), then the batched C-jet pass.Plan/design discussion: #18.
🤖 Generated with Claude Code