From 765ad2f9340334c16e02312d281ae4d1ea518799 Mon Sep 17 00:00:00 2001 From: Ariel Kellison Date: Sat, 30 May 2026 15:49:28 -0400 Subject: [PATCH 1/8] Add verif of cblas ddot --- C/cblas/README.md | 100 +++++++++ C/cblas/ddot.v | 436 +++++++++++++++++++++++++++++++++++++ C/cblas/ddot_model.v | 124 +++++++++++ C/cblas/include/cblas.h | 34 +++ C/cblas/spec_ddot.v | 69 ++++++ C/cblas/src/ddot.c | 19 ++ C/cblas/src/source_dot_r.h | 33 +++ C/cblas/verif_ddot.v | 93 ++++++++ Makefile | 1 + _CoqProject | 5 + 10 files changed, 914 insertions(+) create mode 100644 C/cblas/README.md create mode 100644 C/cblas/ddot.v create mode 100644 C/cblas/ddot_model.v create mode 100644 C/cblas/include/cblas.h create mode 100644 C/cblas/spec_ddot.v create mode 100644 C/cblas/src/ddot.c create mode 100644 C/cblas/src/source_dot_r.h create mode 100644 C/cblas/verif_ddot.v diff --git a/C/cblas/README.md b/C/cblas/README.md new file mode 100644 index 0000000..2c24c23 --- /dev/null +++ b/C/cblas/README.md @@ -0,0 +1,100 @@ +# LAProof × GSL CBLAS + +This directory connects **real-world BLAS code** to LAProof's verified floating-point +accuracy results. + +## Scope + +LAProof proves floating-point accuracy theorems (forward error bounds, etc.) about +*functional models* of basic linear-algebra programs (idealized list/matrix functions +that compute a result with a precisely specified sequence of IEEE-754 operations). +Those theorems live in [`accuracy_proofs/`](../../accuracy_proofs/), independent of any +particular C implementation. + +The goal of this directory is the other half of the story: take a **real-world BLAS +implementation** and *prove* that the compiled C code implements one of those +functional models. We use the CBLAS layer of the **GNU Scientific Library (GSL)** as the +implementation under verification. + +The pipeline for each routine is: + +1. Take the GSL CBLAS source (under [`src/`](src/)) essentially verbatim, with only the + minimal edits needed to run CompCert's `clightgen` on it. +2. Generate the Clight AST (a `*.v` file) with `clightgen`. +3. Define the **functional model** the C loop literally computes (in a `*_model.v` file), + and prove it equals (up to `feq`) the LAProof accuracy model over which the error + theorems are stated. +4. Write a VST **funspec** (in a `spec_*.v` file) whose postcondition relates the returned + value to the LAProof model. +5. **Verify** the C function against that funspec with VST (in a `verif_*.v` file). + +Once a routine is verified against a funspec phrased in terms of a LAProof model, the +corresponding accuracy theorem can be transferred to the value computed by the compiled C +code. The verification does *not* prove the accuracy theorem's preconditions on its own, +however. To apply the bound to the C result, one must separately establish that those +preconditions hold, and they fall into two kinds. + +- **Already guaranteed by the funspec precondition.** Some of the theorem's preconditions + are exactly things the funspec `PRE` already requires, so they hold for free (e.g. a + length precondition that the funspec already imposes on the input arrays). +- **Additional conditions one must discharge.** Other preconditions are not established by + the funspec and must be proved on the side. A typical example is a no-overflow assumption + that the accumulated result is finite: this does *not* follow from the funspec merely + requiring the inputs to be finite, since a combination of finite values can still + overflow to an infinity. Whoever applies the bound must supply this separately. + +So the honest statement is: for inputs satisfying both the funspec `PRE` and the accuracy +theorem's own hypotheses, the bound holds of the C return value. In many cases (such as the +dot products and other reduction loops verified here) the one hypothesis the funspec does +not already supply is a *no-overflow* condition: that the accumulated result is finite. + +For `cblas_ddot` that leftover precondition is exactly `Hfin`: that the accumulated result +`dotprodF X Y` is finite. It does not follow merely from the inputs being finite, since a +sum of finite products can still round to an infinity. + +The `feq` postcondition cooperates here: the funspec only guarantees that the C result is +`feq`-equal to the model value (equality up to `±0`/NaN), but once the result is finite, +`feq` upgrades to real-value equality, so the bound on the model value is literally a bound +on the compiled C output. + +When stating the accuracy corollary for the C function, you choose how to carry `Hfin`: + +- **Conditional corollary.** Keep `Hfin` as an explicit hypothesis. The bound then holds + whenever the result is in fact finite, and whoever applies the corollary discharges + `Hfin` for their particular inputs. (`Hfin` is a proof obligation about the result, + not a runtime check in the C code.) +- **Unconditional corollary.** Strengthen the funspec `PRE` with magnitude/length bounds + that imply finiteness. LAProof's [`finite_sum_from_bounded`](../../accuracy_proofs/sum_is_finite.v) + and the `fun_bnd_le` lemmas turn such bounds into a proof of `Hfin`, so the corollary + carries no extra hypothesis. + +### GSL CBLAS upstream + +- GSL project home: +- GSL source repository (Savannah git): +- CBLAS directory in the repo: +- Shared BLAS kernels (`source_*.h`) used by the CBLAS routines: + +GSL is licensed under the GNU GPL; the files under [`src/`](src/) retain their upstream +copyright and license headers. + +## What we've verified so far + +| Operation | Precision | Function | Files | +|-----------|-----------|----------|-------| +| Dot product | double | `cblas_ddot` | [`src/ddot.c`](src/ddot.c), [`src/source_dot_r.h`](src/source_dot_r.h), [`include/cblas.h`](include/cblas.h), [`ddot.v`](ddot.v), [`ddot_model.v`](ddot_model.v), [`spec_ddot.v`](spec_ddot.v), [`verif_ddot.v`](verif_ddot.v) | + +**Scope limits (`cblas_ddot`):** +- **Unit stride only** (`incX = incY = 1`). General strides are out of scope: LAProof has + no strided/gather separation-logic predicate, and the unit-stride case is the one that + maps directly onto the list-based dot-product model. +- The postcondition is stated up to `feq` (IEEE-equality that identifies `+0.0`/`-0.0` and + NaN payloads), because the C accumulation adds the accumulator first while the model + folds product-first, and IEEE addition is commutative only up to `feq`. + +## Conventions + +This directory mirrors the model/spec/verif split used elsewhere in +[`C/`](../) (e.g. the sparse routines' `sparse_model.v` / `spec_sparse.v` / +`verif_sparse.v`): pure model functions and their lemmas in `*_model.v`, VST funspecs in +`spec_*.v`, and `semax_body` proofs in `verif_*.v`. diff --git a/C/cblas/ddot.v b/C/cblas/ddot.v new file mode 100644 index 0000000..e47b5b3 --- /dev/null +++ b/C/cblas/ddot.v @@ -0,0 +1,436 @@ +From Coq Require Import String List ZArith. +From compcert Require Import Coqlib Integers Floats AST Ctypes Cop Clight Clightdefs. +Import Clightdefs.ClightNotations. +Local Open Scope Z_scope. +Local Open Scope string_scope. +Local Open Scope clight_scope. + +Module Info. + Definition version := "3.17". + Definition build_number := "". + Definition build_tag := "". + Definition build_branch := "". + Definition arch := "aarch64". + Definition model := "default". + Definition abi := "apple". + Definition bitsize := 64. + Definition big_endian := false. + Definition source_file := "C/cblas/src/ddot.c". + Definition normalized := true. +End Info. + +Definition _N : ident := $"N". +Definition _X : ident := $"X". +Definition _Y : ident := $"Y". +Definition ___builtin_annot : ident := $"__builtin_annot". +Definition ___builtin_annot_intval : ident := $"__builtin_annot_intval". +Definition ___builtin_bswap : ident := $"__builtin_bswap". +Definition ___builtin_bswap16 : ident := $"__builtin_bswap16". +Definition ___builtin_bswap32 : ident := $"__builtin_bswap32". +Definition ___builtin_bswap64 : ident := $"__builtin_bswap64". +Definition ___builtin_cls : ident := $"__builtin_cls". +Definition ___builtin_clsl : ident := $"__builtin_clsl". +Definition ___builtin_clsll : ident := $"__builtin_clsll". +Definition ___builtin_clz : ident := $"__builtin_clz". +Definition ___builtin_clzl : ident := $"__builtin_clzl". +Definition ___builtin_clzll : ident := $"__builtin_clzll". +Definition ___builtin_ctz : ident := $"__builtin_ctz". +Definition ___builtin_ctzl : ident := $"__builtin_ctzl". +Definition ___builtin_ctzll : ident := $"__builtin_ctzll". +Definition ___builtin_debug : ident := $"__builtin_debug". +Definition ___builtin_expect : ident := $"__builtin_expect". +Definition ___builtin_fabs : ident := $"__builtin_fabs". +Definition ___builtin_fabsf : ident := $"__builtin_fabsf". +Definition ___builtin_fmadd : ident := $"__builtin_fmadd". +Definition ___builtin_fmax : ident := $"__builtin_fmax". +Definition ___builtin_fmin : ident := $"__builtin_fmin". +Definition ___builtin_fmsub : ident := $"__builtin_fmsub". +Definition ___builtin_fnmadd : ident := $"__builtin_fnmadd". +Definition ___builtin_fnmsub : ident := $"__builtin_fnmsub". +Definition ___builtin_fsqrt : ident := $"__builtin_fsqrt". +Definition ___builtin_membar : ident := $"__builtin_membar". +Definition ___builtin_memcpy_aligned : ident := $"__builtin_memcpy_aligned". +Definition ___builtin_sel : ident := $"__builtin_sel". +Definition ___builtin_sqrt : ident := $"__builtin_sqrt". +Definition ___builtin_unreachable : ident := $"__builtin_unreachable". +Definition ___builtin_va_arg : ident := $"__builtin_va_arg". +Definition ___builtin_va_copy : ident := $"__builtin_va_copy". +Definition ___builtin_va_end : ident := $"__builtin_va_end". +Definition ___builtin_va_start : ident := $"__builtin_va_start". +Definition ___compcert_i64_dtos : ident := $"__compcert_i64_dtos". +Definition ___compcert_i64_dtou : ident := $"__compcert_i64_dtou". +Definition ___compcert_i64_sar : ident := $"__compcert_i64_sar". +Definition ___compcert_i64_sdiv : ident := $"__compcert_i64_sdiv". +Definition ___compcert_i64_shl : ident := $"__compcert_i64_shl". +Definition ___compcert_i64_shr : ident := $"__compcert_i64_shr". +Definition ___compcert_i64_smod : ident := $"__compcert_i64_smod". +Definition ___compcert_i64_smulh : ident := $"__compcert_i64_smulh". +Definition ___compcert_i64_stod : ident := $"__compcert_i64_stod". +Definition ___compcert_i64_stof : ident := $"__compcert_i64_stof". +Definition ___compcert_i64_udiv : ident := $"__compcert_i64_udiv". +Definition ___compcert_i64_umod : ident := $"__compcert_i64_umod". +Definition ___compcert_i64_umulh : ident := $"__compcert_i64_umulh". +Definition ___compcert_i64_utod : ident := $"__compcert_i64_utod". +Definition ___compcert_i64_utof : ident := $"__compcert_i64_utof". +Definition ___compcert_va_composite : ident := $"__compcert_va_composite". +Definition ___compcert_va_float64 : ident := $"__compcert_va_float64". +Definition ___compcert_va_int32 : ident := $"__compcert_va_int32". +Definition ___compcert_va_int64 : ident := $"__compcert_va_int64". +Definition _cblas_ddot : ident := $"cblas_ddot". +Definition _i : ident := $"i". +Definition _incX : ident := $"incX". +Definition _incY : ident := $"incY". +Definition _ix : ident := $"ix". +Definition _iy : ident := $"iy". +Definition _main : ident := $"main". +Definition _r : ident := $"r". +Definition _t'1 : ident := 128%positive. +Definition _t'2 : ident := 129%positive. +Definition _t'3 : ident := 130%positive. +Definition _t'4 : ident := 131%positive. + +Definition f_cblas_ddot := {| + fn_return := tdouble; + fn_callconv := cc_default; + fn_params := ((_N, tint) :: (_X, (tptr tdouble)) :: (_incX, tint) :: + (_Y, (tptr tdouble)) :: (_incY, tint) :: nil); + fn_vars := nil; + fn_temps := ((_r, tdouble) :: (_i, tint) :: (_ix, tint) :: (_iy, tint) :: + (_t'2, tint) :: (_t'1, tint) :: (_t'4, tdouble) :: + (_t'3, tdouble) :: nil); + fn_body := +(Ssequence + (Sset _r (Econst_float (Float.of_bits (Int64.repr 0)) tdouble)) + (Ssequence + (Ssequence + (Sifthenelse (Ebinop Ogt (Etempvar _incX tint) + (Econst_int (Int.repr 0) tint) tint) + (Sset _t'1 (Ecast (Econst_int (Int.repr 0) tint) tint)) + (Sset _t'1 + (Ecast + (Ebinop Omul + (Ebinop Osub (Etempvar _N tint) (Econst_int (Int.repr 1) tint) + tint) (Eunop Oneg (Etempvar _incX tint) tint) tint) tint))) + (Sset _ix (Etempvar _t'1 tint))) + (Ssequence + (Ssequence + (Sifthenelse (Ebinop Ogt (Etempvar _incY tint) + (Econst_int (Int.repr 0) tint) tint) + (Sset _t'2 (Ecast (Econst_int (Int.repr 0) tint) tint)) + (Sset _t'2 + (Ecast + (Ebinop Omul + (Ebinop Osub (Etempvar _N tint) + (Econst_int (Int.repr 1) tint) tint) + (Eunop Oneg (Etempvar _incY tint) tint) tint) tint))) + (Sset _iy (Etempvar _t'2 tint))) + (Ssequence + (Ssequence + (Sset _i (Econst_int (Int.repr 0) tint)) + (Sloop + (Ssequence + (Sifthenelse (Ebinop Olt (Etempvar _i tint) (Etempvar _N tint) + tint) + Sskip + Sbreak) + (Ssequence + (Ssequence + (Sset _t'3 + (Ederef + (Ebinop Oadd (Etempvar _X (tptr tdouble)) + (Etempvar _ix tint) (tptr tdouble)) tdouble)) + (Ssequence + (Sset _t'4 + (Ederef + (Ebinop Oadd (Etempvar _Y (tptr tdouble)) + (Etempvar _iy tint) (tptr tdouble)) tdouble)) + (Sset _r + (Ebinop Oadd (Etempvar _r tdouble) + (Ebinop Omul (Etempvar _t'3 tdouble) + (Etempvar _t'4 tdouble) tdouble) tdouble)))) + (Ssequence + (Sset _ix + (Ebinop Oadd (Etempvar _ix tint) (Etempvar _incX tint) + tint)) + (Sset _iy + (Ebinop Oadd (Etempvar _iy tint) (Etempvar _incY tint) + tint))))) + (Sset _i + (Ebinop Oadd (Etempvar _i tint) (Econst_int (Int.repr 1) tint) + tint)))) + (Sreturn (Some (Etempvar _r tdouble))))))) +|}. + +Definition composites : list composite_definition := +nil. + +Definition global_definitions : list (ident * globdef fundef type) := +((___compcert_va_int32, + Gfun(External (EF_runtime "__compcert_va_int32" + (mksignature (AST.Xptr :: nil) AST.Xint cc_default)) + ((tptr tvoid) :: nil) tuint cc_default)) :: + (___compcert_va_int64, + Gfun(External (EF_runtime "__compcert_va_int64" + (mksignature (AST.Xptr :: nil) AST.Xlong cc_default)) + ((tptr tvoid) :: nil) tulong cc_default)) :: + (___compcert_va_float64, + Gfun(External (EF_runtime "__compcert_va_float64" + (mksignature (AST.Xptr :: nil) AST.Xfloat cc_default)) + ((tptr tvoid) :: nil) tdouble cc_default)) :: + (___compcert_va_composite, + Gfun(External (EF_runtime "__compcert_va_composite" + (mksignature (AST.Xptr :: AST.Xlong :: nil) AST.Xptr + cc_default)) ((tptr tvoid) :: tulong :: nil) + (tptr tvoid) cc_default)) :: + (___compcert_i64_dtos, + Gfun(External (EF_runtime "__compcert_i64_dtos" + (mksignature (AST.Xfloat :: nil) AST.Xlong cc_default)) + (tdouble :: nil) tlong cc_default)) :: + (___compcert_i64_dtou, + Gfun(External (EF_runtime "__compcert_i64_dtou" + (mksignature (AST.Xfloat :: nil) AST.Xlong cc_default)) + (tdouble :: nil) tulong cc_default)) :: + (___compcert_i64_stod, + Gfun(External (EF_runtime "__compcert_i64_stod" + (mksignature (AST.Xlong :: nil) AST.Xfloat cc_default)) + (tlong :: nil) tdouble cc_default)) :: + (___compcert_i64_utod, + Gfun(External (EF_runtime "__compcert_i64_utod" + (mksignature (AST.Xlong :: nil) AST.Xfloat cc_default)) + (tulong :: nil) tdouble cc_default)) :: + (___compcert_i64_stof, + Gfun(External (EF_runtime "__compcert_i64_stof" + (mksignature (AST.Xlong :: nil) AST.Xsingle cc_default)) + (tlong :: nil) tfloat cc_default)) :: + (___compcert_i64_utof, + Gfun(External (EF_runtime "__compcert_i64_utof" + (mksignature (AST.Xlong :: nil) AST.Xsingle cc_default)) + (tulong :: nil) tfloat cc_default)) :: + (___compcert_i64_sdiv, + Gfun(External (EF_runtime "__compcert_i64_sdiv" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___compcert_i64_udiv, + Gfun(External (EF_runtime "__compcert_i64_udiv" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tulong :: tulong :: nil) tulong + cc_default)) :: + (___compcert_i64_smod, + Gfun(External (EF_runtime "__compcert_i64_smod" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___compcert_i64_umod, + Gfun(External (EF_runtime "__compcert_i64_umod" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tulong :: tulong :: nil) tulong + cc_default)) :: + (___compcert_i64_shl, + Gfun(External (EF_runtime "__compcert_i64_shl" + (mksignature (AST.Xlong :: AST.Xint :: nil) AST.Xlong + cc_default)) (tlong :: tint :: nil) tlong cc_default)) :: + (___compcert_i64_shr, + Gfun(External (EF_runtime "__compcert_i64_shr" + (mksignature (AST.Xlong :: AST.Xint :: nil) AST.Xlong + cc_default)) (tulong :: tint :: nil) tulong cc_default)) :: + (___compcert_i64_sar, + Gfun(External (EF_runtime "__compcert_i64_sar" + (mksignature (AST.Xlong :: AST.Xint :: nil) AST.Xlong + cc_default)) (tlong :: tint :: nil) tlong cc_default)) :: + (___compcert_i64_smulh, + Gfun(External (EF_runtime "__compcert_i64_smulh" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___compcert_i64_umulh, + Gfun(External (EF_runtime "__compcert_i64_umulh" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tulong :: tulong :: nil) tulong + cc_default)) :: + (___builtin_bswap64, + Gfun(External (EF_builtin "__builtin_bswap64" + (mksignature (AST.Xlong :: nil) AST.Xlong cc_default)) + (tulong :: nil) tulong cc_default)) :: + (___builtin_bswap, + Gfun(External (EF_builtin "__builtin_bswap" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tuint cc_default)) :: + (___builtin_bswap32, + Gfun(External (EF_builtin "__builtin_bswap32" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tuint cc_default)) :: + (___builtin_bswap16, + Gfun(External (EF_builtin "__builtin_bswap16" + (mksignature (AST.Xint16unsigned :: nil) + AST.Xint16unsigned cc_default)) (tushort :: nil) tushort + cc_default)) :: + (___builtin_clz, + Gfun(External (EF_builtin "__builtin_clz" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tint cc_default)) :: + (___builtin_clzl, + Gfun(External (EF_builtin "__builtin_clzl" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_clzll, + Gfun(External (EF_builtin "__builtin_clzll" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_ctz, + Gfun(External (EF_builtin "__builtin_ctz" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tint cc_default)) :: + (___builtin_ctzl, + Gfun(External (EF_builtin "__builtin_ctzl" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_ctzll, + Gfun(External (EF_builtin "__builtin_ctzll" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_fabs, + Gfun(External (EF_builtin "__builtin_fabs" + (mksignature (AST.Xfloat :: nil) AST.Xfloat cc_default)) + (tdouble :: nil) tdouble cc_default)) :: + (___builtin_fabsf, + Gfun(External (EF_builtin "__builtin_fabsf" + (mksignature (AST.Xsingle :: nil) AST.Xsingle cc_default)) + (tfloat :: nil) tfloat cc_default)) :: + (___builtin_fsqrt, + Gfun(External (EF_builtin "__builtin_fsqrt" + (mksignature (AST.Xfloat :: nil) AST.Xfloat cc_default)) + (tdouble :: nil) tdouble cc_default)) :: + (___builtin_sqrt, + Gfun(External (EF_builtin "__builtin_sqrt" + (mksignature (AST.Xfloat :: nil) AST.Xfloat cc_default)) + (tdouble :: nil) tdouble cc_default)) :: + (___builtin_memcpy_aligned, + Gfun(External (EF_builtin "__builtin_memcpy_aligned" + (mksignature + (AST.Xptr :: AST.Xptr :: AST.Xlong :: AST.Xlong :: nil) + AST.Xvoid cc_default)) + ((tptr tvoid) :: (tptr tvoid) :: tulong :: tulong :: nil) tvoid + cc_default)) :: + (___builtin_sel, + Gfun(External (EF_builtin "__builtin_sel" + (mksignature (AST.Xbool :: nil) AST.Xvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) + (tbool :: nil) tvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) :: + (___builtin_annot, + Gfun(External (EF_builtin "__builtin_annot" + (mksignature (AST.Xptr :: nil) AST.Xvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) + ((tptr tschar) :: nil) tvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) :: + (___builtin_annot_intval, + Gfun(External (EF_builtin "__builtin_annot_intval" + (mksignature (AST.Xptr :: AST.Xint :: nil) AST.Xint + cc_default)) ((tptr tschar) :: tint :: nil) tint + cc_default)) :: + (___builtin_membar, + Gfun(External (EF_builtin "__builtin_membar" + (mksignature nil AST.Xvoid cc_default)) nil tvoid + cc_default)) :: + (___builtin_va_start, + Gfun(External (EF_builtin "__builtin_va_start" + (mksignature (AST.Xptr :: nil) AST.Xvoid cc_default)) + ((tptr tvoid) :: nil) tvoid cc_default)) :: + (___builtin_va_arg, + Gfun(External (EF_builtin "__builtin_va_arg" + (mksignature (AST.Xptr :: AST.Xint :: nil) AST.Xvoid + cc_default)) ((tptr tvoid) :: tuint :: nil) tvoid + cc_default)) :: + (___builtin_va_copy, + Gfun(External (EF_builtin "__builtin_va_copy" + (mksignature (AST.Xptr :: AST.Xptr :: nil) AST.Xvoid + cc_default)) ((tptr tvoid) :: (tptr tvoid) :: nil) tvoid + cc_default)) :: + (___builtin_va_end, + Gfun(External (EF_builtin "__builtin_va_end" + (mksignature (AST.Xptr :: nil) AST.Xvoid cc_default)) + ((tptr tvoid) :: nil) tvoid cc_default)) :: + (___builtin_unreachable, + Gfun(External (EF_builtin "__builtin_unreachable" + (mksignature nil AST.Xvoid cc_default)) nil tvoid + cc_default)) :: + (___builtin_expect, + Gfun(External (EF_builtin "__builtin_expect" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___builtin_cls, + Gfun(External (EF_builtin "__builtin_cls" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tint :: nil) tint cc_default)) :: + (___builtin_clsl, + Gfun(External (EF_builtin "__builtin_clsl" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tlong :: nil) tint cc_default)) :: + (___builtin_clsll, + Gfun(External (EF_builtin "__builtin_clsll" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tlong :: nil) tint cc_default)) :: + (___builtin_fmadd, + Gfun(External (EF_builtin "__builtin_fmadd" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fmsub, + Gfun(External (EF_builtin "__builtin_fmsub" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fnmadd, + Gfun(External (EF_builtin "__builtin_fnmadd" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fnmsub, + Gfun(External (EF_builtin "__builtin_fnmsub" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fmax, + Gfun(External (EF_builtin "__builtin_fmax" + (mksignature (AST.Xfloat :: AST.Xfloat :: nil) AST.Xfloat + cc_default)) (tdouble :: tdouble :: nil) tdouble + cc_default)) :: + (___builtin_fmin, + Gfun(External (EF_builtin "__builtin_fmin" + (mksignature (AST.Xfloat :: AST.Xfloat :: nil) AST.Xfloat + cc_default)) (tdouble :: tdouble :: nil) tdouble + cc_default)) :: + (___builtin_debug, + Gfun(External (EF_external "__builtin_debug" + (mksignature (AST.Xint :: nil) AST.Xvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) + (tint :: nil) tvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) :: + (_cblas_ddot, Gfun(Internal f_cblas_ddot)) :: nil). + +Definition public_idents : list ident := +(_cblas_ddot :: ___builtin_debug :: ___builtin_fmin :: ___builtin_fmax :: + ___builtin_fnmsub :: ___builtin_fnmadd :: ___builtin_fmsub :: + ___builtin_fmadd :: ___builtin_clsll :: ___builtin_clsl :: ___builtin_cls :: + ___builtin_expect :: ___builtin_unreachable :: ___builtin_va_end :: + ___builtin_va_copy :: ___builtin_va_arg :: ___builtin_va_start :: + ___builtin_membar :: ___builtin_annot_intval :: ___builtin_annot :: + ___builtin_sel :: ___builtin_memcpy_aligned :: ___builtin_sqrt :: + ___builtin_fsqrt :: ___builtin_fabsf :: ___builtin_fabs :: + ___builtin_ctzll :: ___builtin_ctzl :: ___builtin_ctz :: ___builtin_clzll :: + ___builtin_clzl :: ___builtin_clz :: ___builtin_bswap16 :: + ___builtin_bswap32 :: ___builtin_bswap :: ___builtin_bswap64 :: + ___compcert_i64_umulh :: ___compcert_i64_smulh :: ___compcert_i64_sar :: + ___compcert_i64_shr :: ___compcert_i64_shl :: ___compcert_i64_umod :: + ___compcert_i64_smod :: ___compcert_i64_udiv :: ___compcert_i64_sdiv :: + ___compcert_i64_utof :: ___compcert_i64_stof :: ___compcert_i64_utod :: + ___compcert_i64_stod :: ___compcert_i64_dtou :: ___compcert_i64_dtos :: + ___compcert_va_composite :: ___compcert_va_float64 :: + ___compcert_va_int64 :: ___compcert_va_int32 :: nil). + +Definition prog : Clight.program := + mkprogram composites global_definitions public_idents _main Logic.I. + + diff --git a/C/cblas/ddot_model.v b/C/cblas/ddot_model.v new file mode 100644 index 0000000..4207787 --- /dev/null +++ b/C/cblas/ddot_model.v @@ -0,0 +1,124 @@ +(** * LAProof.C.cblas.ddot_model: functional model of GSL's [cblas_ddot]. *) +(** ** Corresponds to C program [C/cblas/src/ddot.c] (ported from GSL cblas). *) + +(** This file plays the same role for [cblas_ddot] that [LAProof.C.sparse_model] + plays for the sparse routines: it defines the partial-accumulation *model + function* the loop invariant tracks, together with the *start* / *step* / + *end* lemmas about it (cf. [partial_row]/[partial_row_start]/ + [partial_row_next]/[partial_row_end]). No VST [funspec]s or Clight ASTs + appear here -- those live in [spec_ddot] and [verif_ddot]. + + GSL's loop (after macro expansion of [source_dot_r.h]) is +<< + double r = 0.0; + for (i = 0; i < N; i++) { r += X[ix] * Y[iy]; ix += incX; iy += incY; } + return r; +>> + i.e. a forward, left-to-right accumulation using *separate* multiply-then-add + (not fused multiply-add), starting from +0.0. That is exactly the [dotprodF] + model of [LAProof.accuracy_proofs.dotprod_model] + ([dotprodF = dotprod BMULT BPLUS pos_zero]), over which the forward-error + theorem [dot_acc.dotprod_forward_error] is proved -- *not* the FMA-based + [floatlib.dotprod]. + + Note on operand order: the C statement [r += X*Y] computes [BPLUS acc prod] + (accumulator first), whereas [dotprodF]'s fold step is [BPLUS prod acc] + (product first). IEEE addition is commutative *up to [feq]*, hence the + end/bridge lemma below relates the two by [feq]. *) + +Require Import VST.floyd.proofauto. +Require Import vcfloat.VCFloat. +Require Import vcfloat.FPStdCompCert. +From vcfloat Require Import FPStdLib. +From LAProof.C Require Import floatlib. +(* [dotprod_model] only [Require Import]s (not [Export]s) the mathcomp preamble, + so this does not pollute the VST namespace with ssreflect notations. We do + NOT [Import common]: that would bring mathcomp's order/ring notations into + scope (making [<=]/[<] parse over [R] and breaking [ddot_model_step]). + Instead [common.BPLUS_comm] / [common.pos_zero] are used qualified below; + [BPLUS_mor] and the [feq] [Equivalence] come from vcfloat. *) +Require Import LAProof.accuracy_proofs.dotprod_model. + +Set Bullet Behavior "Strict Subproofs". + +(** [ddot_model X Y] is the value the C loop *literally* computes: a left fold + with the accumulator as the first [BPLUS] operand, separate multiply then + add, starting from [+0.0]. This mirrors the Clight AST in [ddot.v] exactly, + which makes it the natural value to carry in the loop invariant. *) +Definition ddot_loop (xy: list (ftype Tdouble * ftype Tdouble)) : ftype Tdouble := + fold_left (fun acc p => BPLUS acc (BMULT (fst p) (snd p))) xy (Zconst Tdouble 0). + +Definition ddot_model (X Y: list (ftype Tdouble)) : ftype Tdouble := + ddot_loop (combine X Y). + +(** ** Model lemmas (the [partial_row_*] triple, specialised to a dense ddot). *) + +Lemma combine_app_eqlen {A B} (l1 l1': list A) (l2 l2': list B): + length l1 = length l2 -> + combine (l1 ++ l1') (l2 ++ l2') = combine l1 l2 ++ combine l1' l2'. +Proof. + revert l2; induction l1; destruct l2; simpl; intros; try discriminate; auto. + f_equal; auto. +Qed. + +Lemma ddot_loop_snoc: forall xy p, + ddot_loop (xy ++ [p]) = BPLUS (ddot_loop xy) (BMULT (fst p) (snd p)). +Proof. intros. unfold ddot_loop. rewrite fold_left_app. reflexivity. Qed. + +(** *step*: extending both length-[k] prefixes by their [k]-th element adds one + [BMULT] term, with the accumulator as the first [BPLUS] operand -- exactly + the Clight statement [r = r + X[k]*Y[k]]. (cf. [partial_row_next]) *) +Lemma ddot_model_step: forall (X Y: list (ftype Tdouble)) k, + Zlength X = Zlength Y -> 0 <= k < Zlength X -> + ddot_model (sublist 0 (k+1) X) (sublist 0 (k+1) Y) + = BPLUS (ddot_model (sublist 0 k X) (sublist 0 k Y)) + (BMULT (Znth k X) (Znth k Y)). +Proof. + intros X Y k Hlen Hk. unfold ddot_model. + rewrite (sublist_last_1 0 k X) by lia. + rewrite (sublist_last_1 0 k Y) by lia. + rewrite combine_app_eqlen + by (apply Nat2Z.inj; rewrite <- !Zlength_correct, !Zlength_sublist by lia; lia). + cbn [combine]. (* combine [Znth k X] [Znth k Y] = [(Znth k X, Znth k Y)] *) + rewrite ddot_loop_snoc. reflexivity. +Qed. + +(** *end*/bridge: the finished accumulation equals LAProof's accuracy model + [dotprodF] up to [feq]. They differ only by the [BPLUS] operand order + ([BPLUS acc prod] here vs [BPLUS prod acc] in [dotprodF]'s fold step), so the + proof is a generalized-accumulator induction using [common.BPLUS_comm] + (feq (BPLUS x y) (BPLUS y x)) and the [BPLUS_mor] feq-congruence, plus the + fact that [List.combine = seq.zip] on equal-length lists and that the two + initial zeros ([Zconst Tdouble 0] vs [pos_zero]) are [feq]. This is the one + remaining model obligation; once discharged, [dot_acc.dotprod_forward_error] + transfers to the C result. *) +Lemma ddot_model_feq_dotprodF: + forall (X Y: list (ftype Tdouble)), + Zlength X = Zlength Y -> + Forall finite X -> Forall finite Y -> + feq (ddot_model X Y) (dotprodF X Y). +Proof. + intros X Y Hlen _ _. + assert (Hl: length X = length Y) + by (apply Nat2Z.inj; rewrite <- !Zlength_correct; exact Hlen). + clear Hlen. + unfold ddot_model, ddot_loop, dotprodF, dotprod. + (* The two folds differ only by the initial accumulator ([Zconst Tdouble 0] vs + [pos_zero], which are [feq]) and the [BPLUS] operand order at each step. + Generalize both accumulators, keeping them [feq]-linked, and induct. *) + assert (Hs: feq (Zconst Tdouble 0) (@common.pos_zero Tdouble)) by reflexivity. + revert Y Hl Hs. + generalize (Zconst Tdouble 0). + generalize (@common.pos_zero Tdouble). + induction X as [|x X IH]; intros g f Y Hl Hs. + - (* X = nil (so Y = nil): both folds are the (feq-linked) initial accs. *) + destruct Y as [|y Y']; [ | simpl in Hl; discriminate ]. + simpl. exact Hs. + - destruct Y as [|y Y']; simpl in Hl; [ discriminate | ]. + simpl. + apply IH. + + congruence. + + (* one accumulation step: [BPLUS f prod] vs [flip BPLUS g prod = BPLUS prod g] *) + simpl. unfold Basics.flip. + etransitivity; [ apply common.BPLUS_comm | apply BPLUS_mor; [ reflexivity | exact Hs ] ]. +Qed. diff --git a/C/cblas/include/cblas.h b/C/cblas/include/cblas.h new file mode 100644 index 0000000..b478671 --- /dev/null +++ b/C/cblas/include/cblas.h @@ -0,0 +1,34 @@ +#define INDEX int +#define OFFSET(N, incX) ((incX) > 0 ? 0 : ((N) - 1) * (-(incX))) +#define BLAS_ERROR(x) cblas_xerbla(0, __FILE__, x); + +#define CONJUGATE(x) ((x) == CblasConjTrans) +#define TRANSPOSE(x) ((x) == CblasTrans || (x) == CblasConjTrans) +#define UPPER(x) ((x) == CblasUpper) +#define LOWER(x) ((x) == CblasLower) + +/* Handling of packed complex types... */ + +#define REAL(a,i) (((BASE *) a)[2*(i)]) +#define IMAG(a,i) (((BASE *) a)[2*(i)+1]) + +#define REAL0(a) (((BASE *)a)[0]) +#define IMAG0(a) (((BASE *)a)[1]) + +#define CONST_REAL(a,i) (((const BASE *) a)[2*(i)]) +#define CONST_IMAG(a,i) (((const BASE *) a)[2*(i)+1]) + +#define CONST_REAL0(a) (((const BASE *)a)[0]) +#define CONST_IMAG0(a) (((const BASE *)a)[1]) + + +#define GB(KU,KL,lda,i,j) ((KU+1+(i-j))*lda + j) + +#define TRCOUNT(N,i) ((((i)+1)*(2*(N)-(i)))/2) + +/* #define TBUP(N,i,j) */ +/* #define TBLO(N,i,j) */ + +#define TPUP(N,i,j) (TRCOUNT(N,(i)-1)+(j)-(i)) +#define TPLO(N,i,j) (((i)*((i)+1))/2 + (j)) + diff --git a/C/cblas/spec_ddot.v b/C/cblas/spec_ddot.v new file mode 100644 index 0000000..bfa1e5e --- /dev/null +++ b/C/cblas/spec_ddot.v @@ -0,0 +1,69 @@ +(** * LAProof.C.cblas.spec_ddot: VST specification of GSL's [cblas_ddot]. *) +(** ** Corresponds to C program [C/cblas/src/ddot.c] (ported from GSL cblas). *) + +(** This file connects a real-world BLAS implementation -- GSL's double-precision + dot product [cblas_ddot] -- to LAProof's functional dot-product model. + + GSL's loop (after macro expansion of [source_dot_r.h]) is +<< + double r = 0.0; + for (i = 0; i < N; i++) { r += X[ix] * Y[iy]; ix += incX; iy += incY; } + return r; +>> + i.e. a forward, left-to-right accumulation using *separate* multiply-then-add + (not fused multiply-add), starting from +0.0. That is exactly the + [dotprodF] model of [LAProof.accuracy_proofs.dotprod_model] + ([dotprodF = dotprod BMULT BPLUS pos_zero]), over which the forward-error + theorem [dot_acc.dotprod_forward_error] is proved -- *not* the FMA-based + [floatlib.dotprod]. + + Note on operand order: the C statement [r += X*Y] computes [BPLUS acc prod] + (accumulator first), whereas [dotprodF]'s fold step is [BPLUS prod acc] + (product first). IEEE addition is commutative *up to [feq]*, so we state the + postcondition with [feq] -- the same convention used by + [LAProof.C.spec_sparse] for [csr_row_vector_multiply_spec]. *) + +Require Import VST.floyd.proofauto. +From vcfloat Require Import FPStdCompCert FPStdLib. +From LAProof.C Require Import floatlib. +From LAProof.C.cblas Require Import ddot ddot_model. +(* [dotprod_model] only [Require Import]s (not [Export]s) the mathcomp preamble, + so this does not pollute the VST namespace with ssreflect notations. We need + it here for [dotprodF], which the postcondition is stated against. *) +Require Import LAProof.accuracy_proofs.dotprod_model. + +Set Bullet Behavior "Strict Subproofs". + +#[export] Instance CompSpecs : compspecs. make_compspecs prog. Defined. +Definition Vprog : varspecs. mk_varspecs prog. Defined. + +(** The partial-accumulation model [ddot_model] and its start/step/end lemmas + live in [LAProof.C.cblas.ddot_model] (mirroring the [sparse_model] / + [spec_sparse] split). This file states only the VST [funspec]. *) + +(** ** Funspec for [cblas_ddot] (unit-stride case [incX = incY = 1]). *) +(** General strides are out of scope: LAProof has no strided/gather + separation-logic predicate, and the unit-stride case is the one that maps + directly onto the list-based [dotprodF] model. *) +Definition cblas_ddot_spec := + DECLARE _cblas_ddot + WITH shX: share, shY: share, n: Z, + X: list (ftype Tdouble), Y: list (ftype Tdouble), + px: val, py: val + PRE [ tint, tptr tdouble, tint, tptr tdouble, tint ] + PROP (readable_share shX; readable_share shY; + Zlength X = n; Zlength Y = n; 0 <= n <= Int.max_signed; + Forall finite X; Forall finite Y) + PARAMS (Vint (Int.repr n); px; Vint (Int.repr 1); py; Vint (Int.repr 1)) + SEP (data_at shX (tarray tdouble n) (map Vfloat X) px; + data_at shY (tarray tdouble n) (map Vfloat Y) py) + POST [ tdouble ] + EX s: ftype Tdouble, + PROP (feq s (dotprodF X Y)) + RETURN (Vfloat s) + SEP (data_at shX (tarray tdouble n) (map Vfloat X) px; + data_at shY (tarray tdouble n) (map Vfloat Y) py). + +Definition DdotASI : funspecs := [ cblas_ddot_spec ]. +Definition ddot_imports : funspecs := []. (* no external calls: the loop is self-contained *) +Definition Gprog : funspecs := ddot_imports ++ DdotASI. diff --git a/C/cblas/src/ddot.c b/C/cblas/src/ddot.c new file mode 100644 index 0000000..e1f5e10 --- /dev/null +++ b/C/cblas/src/ddot.c @@ -0,0 +1,19 @@ +/* The following GSL headers are commented out for clightgen: they are + unused by cblas_ddot's body (only OFFSET/INDEX from "cblas.h" are needed) + and are unresolvable here because GSL's gsl/ symlink dir is not generated. */ +/* #include */ +/* #include */ +#include "cblas.h" + +double +cblas_ddot (const int N, const double *X, const int incX, const double *Y, + const int incY) +{ +#define INIT_VAL 0.0 +#define ACC_TYPE double +#define BASE double +#include "source_dot_r.h" +#undef ACC_TYPE +#undef BASE +#undef INIT_VAL +} diff --git a/C/cblas/src/source_dot_r.h b/C/cblas/src/source_dot_r.h new file mode 100644 index 0000000..b83f85e --- /dev/null +++ b/C/cblas/src/source_dot_r.h @@ -0,0 +1,33 @@ +/* blas/source_dot_r.h + * + * Copyright (C) 1996, 1997, 1998, 1999, 2000 Gerard Jungman + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or (at + * your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + */ + +{ + ACC_TYPE r = INIT_VAL; + INDEX i; + INDEX ix = OFFSET(N, incX); + INDEX iy = OFFSET(N, incY); + + for (i = 0; i < N; i++) { + r += X[ix] * Y[iy]; + ix += incX; + iy += incY; + } + + return r; +} diff --git a/C/cblas/verif_ddot.v b/C/cblas/verif_ddot.v new file mode 100644 index 0000000..29db101 --- /dev/null +++ b/C/cblas/verif_ddot.v @@ -0,0 +1,93 @@ +(** * LAProof.C.cblas.verif_ddot: VST proof scaffold for GSL's [cblas_ddot]. *) +(** ** Corresponds to C program [C/cblas/src/ddot.c]. *) + +(** [body_cblas_ddot] proves that the compiled C [cblas_ddot] (unit stride) + refines the [dotprodF] accuracy model, [feq]-equal to its result. The proof + mirrors [LAProof.C.verif_densemat_mult.body_densematn_dotprod]: a + [forward_for_simple_bound] whose invariant carries the partial accumulation + [ddot_model (sublist 0 k X) (sublist 0 k Y)], with the per-step / start / + bridge reasoning factored into the lemmas of [LAProof.C.cblas.ddot_model]. *) + +Require Import VST.floyd.proofauto. +From vcfloat Require Import FPStdCompCert FPStdLib. +From LAProof.C Require Import floatlib. +From LAProof.C.cblas Require Import ddot ddot_model spec_ddot. +Require Import LAProof.accuracy_proofs.dotprod_model. + +Set Bullet Behavior "Strict Subproofs". + +Lemma body_cblas_ddot: semax_body Vprog Gprog f_cblas_ddot cblas_ddot_spec. +Proof. +start_function. +(* C: r = 0.0; ix = OFFSET(N,incX); iy = OFFSET(N,incY); with incX=incY=1 ⇒ ix=iy=0. + Each OFFSET is a (incX>0 ? 0 : ...) conditional; the else-branch is + unreachable here because incX=incY=1, so [rep_lia] closes it. *) +forward. (* _r = 0.0 *) +forward_if (PROP () + LOCAL (temp _t'1 (Vint (Int.repr 0)); + temp _r (Vfloat (Float.of_bits (Int64.repr 0))); + temp _N (Vint (Int.repr n)); temp _X px; + temp _incX (Vint (Int.repr 1)); temp _Y py; + temp _incY (Vint (Int.repr 1))) + SEP (data_at shX (tarray tdouble n) (map Vfloat X) px; + data_at shY (tarray tdouble n) (map Vfloat Y) py)). +{ forward. entailer!!. } (* incX = 1 > 0 : t'1 := 0 *) +{ rep_lia. } (* dead else-branch (incX = 1 is > 0) *) +forward. (* _ix = _t'1 (= 0) *) +forward_if (PROP () + LOCAL (temp _t'2 (Vint (Int.repr 0)); + temp _ix (Vint (Int.repr 0)); + temp _r (Vfloat (Float.of_bits (Int64.repr 0))); + temp _N (Vint (Int.repr n)); temp _X px; + temp _incX (Vint (Int.repr 1)); temp _Y py; + temp _incY (Vint (Int.repr 1))) + SEP (data_at shX (tarray tdouble n) (map Vfloat X) px; + data_at shY (tarray tdouble n) (map Vfloat Y) py)). +{ forward. entailer!!. } (* incY = 1 > 0 : t'2 := 0 *) +{ rep_lia. } (* dead else-branch (incY = 1 is > 0) *) +forward. (* _iy = _t'2 (= 0) *) +(* The accumulation loop. Invariant: _r holds the value the C loop has + accumulated over the first k element-pairs, namely [ddot_model] of the + length-k prefixes; the strided indices ix,iy advance by 1 each step (= k). *) +forward_for_simple_bound n + (EX k:Z, + PROP () + LOCAL (temp _r (Vfloat (ddot_model (sublist 0 k X) (sublist 0 k Y))); + temp _ix (Vint (Int.repr k)); + temp _iy (Vint (Int.repr k)); + temp _N (Vint (Int.repr n)); + temp _incX (Vint (Int.repr 1)); + temp _incY (Vint (Int.repr 1)); + temp _X px; temp _Y py) + SEP (data_at shX (tarray tdouble n) (map Vfloat X) px; + data_at shY (tarray tdouble n) (map Vfloat Y) py))%assert. +- (* Loop entry (k = 0). [sublist 0 0 = nil] and [ddot_model nil nil = + Zconst Tdouble 0] (= the C [+0.0]); [entailer!!] discharges it. *) + entailer!!. +- (* Loop body: read X[i], Y[i]; [r = r + X[i]*Y[i]]; ix++; iy++. Re-establish + the invariant for [i+1] via the step lemma [ddot_model_step]. *) + forward. (* t'3 = X[ix] *) + forward. (* t'4 = Y[iy] *) + forward. (* r = r + t'3 * t'4 *) + forward. (* ix = ix + incX *) + forward. (* iy = iy + incY *) + assert (Hi: 0 <= i < Zlength X) by list_solve. + assert (Hxy: Zlength X = Zlength Y) by list_solve. + entailer!. + rewrite (ddot_model_step X Y i Hxy Hi). + reflexivity. +- (* Loop exit (k = n). Return [r = ddot_model (sublist 0 n X) (sublist 0 n Y)]; + [sublist_same] gives [= ddot_model X Y], and the bridge lemma + [ddot_model_feq_dotprodF] discharges [feq _ (dotprodF X Y)]. *) + forward. (* return r *) + Exists (ddot_model (sublist 0 (Zlength X) X) (sublist 0 (Zlength X) Y)). + entailer!. + rewrite (sublist_same 0 (Zlength X) X) by lia. + rewrite (sublist_same 0 (Zlength X) Y) by lia. + apply ddot_model_feq_dotprodF; [ lia | auto | auto ]. +Qed. + +(** Payoff (stub): because [cblas_ddot_spec]'s result is [feq]-equal to + [dotprodF X Y], the forward-error bound + [LAProof.accuracy_proofs.dot_acc.dotprod_forward_error] applies directly to + the value computed by the compiled C [cblas_ddot]. *) diff --git a/Makefile b/Makefile index 574bac7..87f7aa3 100644 --- a/Makefile +++ b/Makefile @@ -20,6 +20,7 @@ clean: if [ -e Makefile.coq ]; then $(MAKE) -f Makefile.coq cleanall; fi $(RM) Makefile.coq Makefile.coq.conf $(RM) C/sparse.v C/cholesky.v C/alloc.v C/densemat.v C/bandmat.v + $(RM) C/cblas/ddot.v INSTALLFILES1 ?= $(shell awk '/accuracy_proofs/{print $$NF"o"}' _CoqProject) diff --git a/_CoqProject b/_CoqProject index a404827..86abee6 100644 --- a/_CoqProject +++ b/_CoqProject @@ -47,3 +47,8 @@ C/build_csr.v C/verif_build_csr.v C/distinct.v C/partial_csr.v + +C/cblas/ddot.v +C/cblas/ddot_model.v +C/cblas/spec_ddot.v +C/cblas/verif_ddot.v From 2e07ae4e11d8e7d2b9d08511949b6e91c939a2bb Mon Sep 17 00:00:00 2001 From: Ariel Kellison Date: Sat, 30 May 2026 19:33:21 -0400 Subject: [PATCH 2/8] Add verif of cblas dasum --- C/cblas/README.md | 12 +- C/cblas/asum_model.v | 83 ++++++++ C/cblas/dasum.v | 414 ++++++++++++++++++++++++++++++++++++ C/cblas/spec_dasum.v | 39 ++++ C/cblas/src/dasum.c | 16 ++ C/cblas/src/source_asum_r.h | 34 +++ C/cblas/verif_dasum.v | 60 ++++++ Makefile | 2 +- _CoqProject | 5 + 9 files changed, 660 insertions(+), 5 deletions(-) create mode 100644 C/cblas/asum_model.v create mode 100644 C/cblas/dasum.v create mode 100644 C/cblas/spec_dasum.v create mode 100644 C/cblas/src/dasum.c create mode 100644 C/cblas/src/source_asum_r.h create mode 100644 C/cblas/verif_dasum.v diff --git a/C/cblas/README.md b/C/cblas/README.md index 2c24c23..6935b72 100644 --- a/C/cblas/README.md +++ b/C/cblas/README.md @@ -83,14 +83,18 @@ copyright and license headers. | Operation | Precision | Function | Files | |-----------|-----------|----------|-------| | Dot product | double | `cblas_ddot` | [`src/ddot.c`](src/ddot.c), [`src/source_dot_r.h`](src/source_dot_r.h), [`include/cblas.h`](include/cblas.h), [`ddot.v`](ddot.v), [`ddot_model.v`](ddot_model.v), [`spec_ddot.v`](spec_ddot.v), [`verif_ddot.v`](verif_ddot.v) | +| Sum of absolute values | double | `cblas_dasum` | [`src/dasum.c`](src/dasum.c), [`src/source_asum_r.h`](src/source_asum_r.h), [`include/cblas.h`](include/cblas.h), [`dasum.v`](dasum.v), [`asum_model.v`](asum_model.v), [`spec_dasum.v`](spec_dasum.v), [`verif_dasum.v`](verif_dasum.v) | -**Scope limits (`cblas_ddot`):** +**Scope limits (`cblas_ddot`, `cblas_dasum`):** - **Unit stride only** (`incX = incY = 1`). General strides are out of scope: LAProof has no strided/gather separation-logic predicate, and the unit-stride case is the one that - maps directly onto the list-based dot-product model. + maps directly onto the list-based models. - The postcondition is stated up to `feq` (IEEE-equality that identifies `+0.0`/`-0.0` and - NaN payloads), because the C accumulation adds the accumulator first while the model - folds product-first, and IEEE addition is commutative only up to `feq`. + NaN payloads), because the C accumulation adds the accumulator first while the models fold + the new term first, and IEEE addition is commutative only up to `feq`. +- `cblas_dasum` is verified against `sumF (map BABS X)`; its loop body calls `fabs`, + discharged with VSTlib's `fabs_spec` (whose result is `BABS`). The matching accuracy + theorem is `sum_acc.fSUM`. ## Conventions diff --git a/C/cblas/asum_model.v b/C/cblas/asum_model.v new file mode 100644 index 0000000..fefb051 --- /dev/null +++ b/C/cblas/asum_model.v @@ -0,0 +1,83 @@ +(** * LAProof.C.cblas.asum_model: functional model of GSL's [cblas_dasum]. *) +(** ** Corresponds to C program [C/cblas/src/dasum.c] (ported from GSL cblas). *) + +(** Sibling of [LAProof.C.cblas.ddot_model]: the partial-accumulation *model + function* the [cblas_dasum] loop invariant tracks, with the *start* / *step* + / *end* lemmas about it. GSL's kernel ([source_asum_r.h]) is +<< + double r = 0.0; + for (i = 0; i < N; i++) { r += fabs(X[ix]); ix += incX; } + return r; +>> + a forward accumulation of [|X[i]|] with separate abs-then-add, accumulator + first ([BPLUS acc |x|]), starting from [+0.0]. The C [fabs] call computes + [BABS] (the result of VSTlib's [fabs_spec], whose [ff_func] is [BABS]), so the + model uses [BABS]. + + The matching LAProof accuracy model is [sum_model.sumF] applied to the list + [map BABS X] (the absolute values); over it [sum_acc.fSUM] proves the + forward-error bound. As with [ddot], the C's operand order ([BPLUS acc x] + vs [sumF]'s [BPLUS x acc]) and initial accumulator ([+0.0] = [Zconst Tdouble 0] + vs [sumF]'s [neg_zero]) differ, so the end/bridge lemma relates the two by + [feq] (IEEE-equality identifying [±0]/NaN). *) + +Require Import VST.floyd.proofauto. +Require Import vcfloat.VCFloat. +Require Import vcfloat.FPStdCompCert. +From vcfloat Require Import FPStdLib. +From LAProof.C Require Import floatlib. +(* [sum_model] only [Require Import]s (not [Export]s) the mathcomp preamble, so + this does not pollute the VST namespace with ssreflect notations. We do NOT + [Import common] (it would re-scope [<=]/[<] over [R]); [common.BPLUS_comm] / + [common.neg_zero] are used qualified. [BPLUS_mor], [BABS], the [feq] + [Equivalence] come from vcfloat. *) +Require Import LAProof.accuracy_proofs.sum_model. + +Set Bullet Behavior "Strict Subproofs". + +(** [asum_model X] is the value the C loop *literally* computes: a left fold with + the accumulator as the first [BPLUS] operand, separate abs-then-add, starting + from [+0.0]. *) +Definition asum_loop (l: list (ftype Tdouble)) : ftype Tdouble := + fold_left (fun acc x => BPLUS acc (BABS x)) l (Zconst Tdouble 0). + +Definition asum_model (X: list (ftype Tdouble)) : ftype Tdouble := asum_loop X. + +(** ** Model lemmas (the [partial_row_*] triple, specialised to a dense asum). *) + +Lemma asum_loop_snoc: forall l x, + asum_loop (l ++ [x]) = BPLUS (asum_loop l) (BABS x). +Proof. intros. unfold asum_loop. rewrite fold_left_app. reflexivity. Qed. + +(** *step*: extending the length-[k] prefix by its [k]-th element adds one + [BABS] term, with the accumulator as the first [BPLUS] operand -- exactly the + Clight statement [r = r + fabs(X[k])]. *) +Lemma asum_model_step: forall (X: list (ftype Tdouble)) k, + 0 <= k < Zlength X -> + asum_model (sublist 0 (k+1) X) + = BPLUS (asum_model (sublist 0 k X)) (BABS (Znth k X)). +Proof. + intros X k Hk. unfold asum_model. + rewrite (sublist_last_1 0 k X) by lia. + rewrite asum_loop_snoc. reflexivity. +Qed. + +(** *end*/bridge: the finished accumulation equals LAProof's summation model + [sumF (map BABS X)] up to [feq]. Generalized-accumulator induction using + [common.BPLUS_comm] and [BPLUS_mor], with the [feq (+0.0) (-0.0)] base case + ([sumF] starts at [neg_zero]). *) +Lemma asum_model_feq_sumF: forall (X: list (ftype Tdouble)), + feq (asum_model X) (sumF (map BABS X)). +Proof. + intros X. unfold asum_model, asum_loop, sumF. + assert (Hs: feq (Zconst Tdouble 0) (@common.neg_zero Tdouble)) by reflexivity. + revert Hs. + generalize (Zconst Tdouble 0). + generalize (@common.neg_zero Tdouble). + induction X as [|x X IH]; intros g f Hs. + - (* X = nil: both folds are the (feq-linked) initial accumulators. *) + simpl. exact Hs. + - simpl. apply IH. + unfold Basics.flip. + etransitivity; [ apply common.BPLUS_comm | apply BPLUS_mor; [ reflexivity | exact Hs ] ]. +Qed. diff --git a/C/cblas/dasum.v b/C/cblas/dasum.v new file mode 100644 index 0000000..ff4839a --- /dev/null +++ b/C/cblas/dasum.v @@ -0,0 +1,414 @@ +From Coq Require Import String List ZArith. +From compcert Require Import Coqlib Integers Floats AST Ctypes Cop Clight Clightdefs. +Import Clightdefs.ClightNotations. +Local Open Scope Z_scope. +Local Open Scope string_scope. +Local Open Scope clight_scope. + +Module Info. + Definition version := "3.17". + Definition build_number := "". + Definition build_tag := "". + Definition build_branch := "". + Definition arch := "aarch64". + Definition model := "default". + Definition abi := "apple". + Definition bitsize := 64. + Definition big_endian := false. + Definition source_file := "C/cblas/src/dasum.c". + Definition normalized := true. +End Info. + +Definition _N : ident := $"N". +Definition _X : ident := $"X". +Definition ___builtin_annot : ident := $"__builtin_annot". +Definition ___builtin_annot_intval : ident := $"__builtin_annot_intval". +Definition ___builtin_bswap : ident := $"__builtin_bswap". +Definition ___builtin_bswap16 : ident := $"__builtin_bswap16". +Definition ___builtin_bswap32 : ident := $"__builtin_bswap32". +Definition ___builtin_bswap64 : ident := $"__builtin_bswap64". +Definition ___builtin_cls : ident := $"__builtin_cls". +Definition ___builtin_clsl : ident := $"__builtin_clsl". +Definition ___builtin_clsll : ident := $"__builtin_clsll". +Definition ___builtin_clz : ident := $"__builtin_clz". +Definition ___builtin_clzl : ident := $"__builtin_clzl". +Definition ___builtin_clzll : ident := $"__builtin_clzll". +Definition ___builtin_ctz : ident := $"__builtin_ctz". +Definition ___builtin_ctzl : ident := $"__builtin_ctzl". +Definition ___builtin_ctzll : ident := $"__builtin_ctzll". +Definition ___builtin_debug : ident := $"__builtin_debug". +Definition ___builtin_expect : ident := $"__builtin_expect". +Definition ___builtin_fabs : ident := $"__builtin_fabs". +Definition ___builtin_fabsf : ident := $"__builtin_fabsf". +Definition ___builtin_fmadd : ident := $"__builtin_fmadd". +Definition ___builtin_fmax : ident := $"__builtin_fmax". +Definition ___builtin_fmin : ident := $"__builtin_fmin". +Definition ___builtin_fmsub : ident := $"__builtin_fmsub". +Definition ___builtin_fnmadd : ident := $"__builtin_fnmadd". +Definition ___builtin_fnmsub : ident := $"__builtin_fnmsub". +Definition ___builtin_fsqrt : ident := $"__builtin_fsqrt". +Definition ___builtin_membar : ident := $"__builtin_membar". +Definition ___builtin_memcpy_aligned : ident := $"__builtin_memcpy_aligned". +Definition ___builtin_sel : ident := $"__builtin_sel". +Definition ___builtin_sqrt : ident := $"__builtin_sqrt". +Definition ___builtin_unreachable : ident := $"__builtin_unreachable". +Definition ___builtin_va_arg : ident := $"__builtin_va_arg". +Definition ___builtin_va_copy : ident := $"__builtin_va_copy". +Definition ___builtin_va_end : ident := $"__builtin_va_end". +Definition ___builtin_va_start : ident := $"__builtin_va_start". +Definition ___compcert_i64_dtos : ident := $"__compcert_i64_dtos". +Definition ___compcert_i64_dtou : ident := $"__compcert_i64_dtou". +Definition ___compcert_i64_sar : ident := $"__compcert_i64_sar". +Definition ___compcert_i64_sdiv : ident := $"__compcert_i64_sdiv". +Definition ___compcert_i64_shl : ident := $"__compcert_i64_shl". +Definition ___compcert_i64_shr : ident := $"__compcert_i64_shr". +Definition ___compcert_i64_smod : ident := $"__compcert_i64_smod". +Definition ___compcert_i64_smulh : ident := $"__compcert_i64_smulh". +Definition ___compcert_i64_stod : ident := $"__compcert_i64_stod". +Definition ___compcert_i64_stof : ident := $"__compcert_i64_stof". +Definition ___compcert_i64_udiv : ident := $"__compcert_i64_udiv". +Definition ___compcert_i64_umod : ident := $"__compcert_i64_umod". +Definition ___compcert_i64_umulh : ident := $"__compcert_i64_umulh". +Definition ___compcert_i64_utod : ident := $"__compcert_i64_utod". +Definition ___compcert_i64_utof : ident := $"__compcert_i64_utof". +Definition ___compcert_va_composite : ident := $"__compcert_va_composite". +Definition ___compcert_va_float64 : ident := $"__compcert_va_float64". +Definition ___compcert_va_int32 : ident := $"__compcert_va_int32". +Definition ___compcert_va_int64 : ident := $"__compcert_va_int64". +Definition _cblas_dasum : ident := $"cblas_dasum". +Definition _fabs : ident := $"fabs". +Definition _i : ident := $"i". +Definition _incX : ident := $"incX". +Definition _ix : ident := $"ix". +Definition _main : ident := $"main". +Definition _r : ident := $"r". +Definition _t'1 : ident := 128%positive. +Definition _t'2 : ident := 129%positive. + +Definition f_cblas_dasum := {| + fn_return := tdouble; + fn_callconv := cc_default; + fn_params := ((_N, tint) :: (_X, (tptr tdouble)) :: (_incX, tint) :: nil); + fn_vars := nil; + fn_temps := ((_r, tdouble) :: (_i, tint) :: (_ix, tint) :: + (_t'1, tdouble) :: (_t'2, tdouble) :: nil); + fn_body := +(Ssequence + (Sset _r (Econst_float (Float.of_bits (Int64.repr 0)) tdouble)) + (Ssequence + (Sset _ix (Econst_int (Int.repr 0) tint)) + (Ssequence + (Sifthenelse (Ebinop Ole (Etempvar _incX tint) + (Econst_int (Int.repr 0) tint) tint) + (Sreturn (Some (Econst_int (Int.repr 0) tint))) + Sskip) + (Ssequence + (Ssequence + (Sset _i (Econst_int (Int.repr 0) tint)) + (Sloop + (Ssequence + (Sifthenelse (Ebinop Olt (Etempvar _i tint) (Etempvar _N tint) + tint) + Sskip + Sbreak) + (Ssequence + (Ssequence + (Ssequence + (Sset _t'2 + (Ederef + (Ebinop Oadd (Etempvar _X (tptr tdouble)) + (Etempvar _ix tint) (tptr tdouble)) tdouble)) + (Scall (Some _t'1) + (Evar _fabs (Tfunction (tdouble :: nil) tdouble + cc_default)) + ((Etempvar _t'2 tdouble) :: nil))) + (Sset _r + (Ebinop Oadd (Etempvar _r tdouble) + (Etempvar _t'1 tdouble) tdouble))) + (Sset _ix + (Ebinop Oadd (Etempvar _ix tint) (Etempvar _incX tint) + tint)))) + (Sset _i + (Ebinop Oadd (Etempvar _i tint) (Econst_int (Int.repr 1) tint) + tint)))) + (Sreturn (Some (Etempvar _r tdouble))))))) +|}. + +Definition composites : list composite_definition := +nil. + +Definition global_definitions : list (ident * globdef fundef type) := +((___compcert_va_int32, + Gfun(External (EF_runtime "__compcert_va_int32" + (mksignature (AST.Xptr :: nil) AST.Xint cc_default)) + ((tptr tvoid) :: nil) tuint cc_default)) :: + (___compcert_va_int64, + Gfun(External (EF_runtime "__compcert_va_int64" + (mksignature (AST.Xptr :: nil) AST.Xlong cc_default)) + ((tptr tvoid) :: nil) tulong cc_default)) :: + (___compcert_va_float64, + Gfun(External (EF_runtime "__compcert_va_float64" + (mksignature (AST.Xptr :: nil) AST.Xfloat cc_default)) + ((tptr tvoid) :: nil) tdouble cc_default)) :: + (___compcert_va_composite, + Gfun(External (EF_runtime "__compcert_va_composite" + (mksignature (AST.Xptr :: AST.Xlong :: nil) AST.Xptr + cc_default)) ((tptr tvoid) :: tulong :: nil) + (tptr tvoid) cc_default)) :: + (___compcert_i64_dtos, + Gfun(External (EF_runtime "__compcert_i64_dtos" + (mksignature (AST.Xfloat :: nil) AST.Xlong cc_default)) + (tdouble :: nil) tlong cc_default)) :: + (___compcert_i64_dtou, + Gfun(External (EF_runtime "__compcert_i64_dtou" + (mksignature (AST.Xfloat :: nil) AST.Xlong cc_default)) + (tdouble :: nil) tulong cc_default)) :: + (___compcert_i64_stod, + Gfun(External (EF_runtime "__compcert_i64_stod" + (mksignature (AST.Xlong :: nil) AST.Xfloat cc_default)) + (tlong :: nil) tdouble cc_default)) :: + (___compcert_i64_utod, + Gfun(External (EF_runtime "__compcert_i64_utod" + (mksignature (AST.Xlong :: nil) AST.Xfloat cc_default)) + (tulong :: nil) tdouble cc_default)) :: + (___compcert_i64_stof, + Gfun(External (EF_runtime "__compcert_i64_stof" + (mksignature (AST.Xlong :: nil) AST.Xsingle cc_default)) + (tlong :: nil) tfloat cc_default)) :: + (___compcert_i64_utof, + Gfun(External (EF_runtime "__compcert_i64_utof" + (mksignature (AST.Xlong :: nil) AST.Xsingle cc_default)) + (tulong :: nil) tfloat cc_default)) :: + (___compcert_i64_sdiv, + Gfun(External (EF_runtime "__compcert_i64_sdiv" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___compcert_i64_udiv, + Gfun(External (EF_runtime "__compcert_i64_udiv" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tulong :: tulong :: nil) tulong + cc_default)) :: + (___compcert_i64_smod, + Gfun(External (EF_runtime "__compcert_i64_smod" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___compcert_i64_umod, + Gfun(External (EF_runtime "__compcert_i64_umod" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tulong :: tulong :: nil) tulong + cc_default)) :: + (___compcert_i64_shl, + Gfun(External (EF_runtime "__compcert_i64_shl" + (mksignature (AST.Xlong :: AST.Xint :: nil) AST.Xlong + cc_default)) (tlong :: tint :: nil) tlong cc_default)) :: + (___compcert_i64_shr, + Gfun(External (EF_runtime "__compcert_i64_shr" + (mksignature (AST.Xlong :: AST.Xint :: nil) AST.Xlong + cc_default)) (tulong :: tint :: nil) tulong cc_default)) :: + (___compcert_i64_sar, + Gfun(External (EF_runtime "__compcert_i64_sar" + (mksignature (AST.Xlong :: AST.Xint :: nil) AST.Xlong + cc_default)) (tlong :: tint :: nil) tlong cc_default)) :: + (___compcert_i64_smulh, + Gfun(External (EF_runtime "__compcert_i64_smulh" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___compcert_i64_umulh, + Gfun(External (EF_runtime "__compcert_i64_umulh" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tulong :: tulong :: nil) tulong + cc_default)) :: + (___builtin_bswap64, + Gfun(External (EF_builtin "__builtin_bswap64" + (mksignature (AST.Xlong :: nil) AST.Xlong cc_default)) + (tulong :: nil) tulong cc_default)) :: + (___builtin_bswap, + Gfun(External (EF_builtin "__builtin_bswap" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tuint cc_default)) :: + (___builtin_bswap32, + Gfun(External (EF_builtin "__builtin_bswap32" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tuint cc_default)) :: + (___builtin_bswap16, + Gfun(External (EF_builtin "__builtin_bswap16" + (mksignature (AST.Xint16unsigned :: nil) + AST.Xint16unsigned cc_default)) (tushort :: nil) tushort + cc_default)) :: + (___builtin_clz, + Gfun(External (EF_builtin "__builtin_clz" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tint cc_default)) :: + (___builtin_clzl, + Gfun(External (EF_builtin "__builtin_clzl" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_clzll, + Gfun(External (EF_builtin "__builtin_clzll" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_ctz, + Gfun(External (EF_builtin "__builtin_ctz" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tint cc_default)) :: + (___builtin_ctzl, + Gfun(External (EF_builtin "__builtin_ctzl" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_ctzll, + Gfun(External (EF_builtin "__builtin_ctzll" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_fabs, + Gfun(External (EF_builtin "__builtin_fabs" + (mksignature (AST.Xfloat :: nil) AST.Xfloat cc_default)) + (tdouble :: nil) tdouble cc_default)) :: + (___builtin_fabsf, + Gfun(External (EF_builtin "__builtin_fabsf" + (mksignature (AST.Xsingle :: nil) AST.Xsingle cc_default)) + (tfloat :: nil) tfloat cc_default)) :: + (___builtin_fsqrt, + Gfun(External (EF_builtin "__builtin_fsqrt" + (mksignature (AST.Xfloat :: nil) AST.Xfloat cc_default)) + (tdouble :: nil) tdouble cc_default)) :: + (___builtin_sqrt, + Gfun(External (EF_builtin "__builtin_sqrt" + (mksignature (AST.Xfloat :: nil) AST.Xfloat cc_default)) + (tdouble :: nil) tdouble cc_default)) :: + (___builtin_memcpy_aligned, + Gfun(External (EF_builtin "__builtin_memcpy_aligned" + (mksignature + (AST.Xptr :: AST.Xptr :: AST.Xlong :: AST.Xlong :: nil) + AST.Xvoid cc_default)) + ((tptr tvoid) :: (tptr tvoid) :: tulong :: tulong :: nil) tvoid + cc_default)) :: + (___builtin_sel, + Gfun(External (EF_builtin "__builtin_sel" + (mksignature (AST.Xbool :: nil) AST.Xvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) + (tbool :: nil) tvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) :: + (___builtin_annot, + Gfun(External (EF_builtin "__builtin_annot" + (mksignature (AST.Xptr :: nil) AST.Xvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) + ((tptr tschar) :: nil) tvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) :: + (___builtin_annot_intval, + Gfun(External (EF_builtin "__builtin_annot_intval" + (mksignature (AST.Xptr :: AST.Xint :: nil) AST.Xint + cc_default)) ((tptr tschar) :: tint :: nil) tint + cc_default)) :: + (___builtin_membar, + Gfun(External (EF_builtin "__builtin_membar" + (mksignature nil AST.Xvoid cc_default)) nil tvoid + cc_default)) :: + (___builtin_va_start, + Gfun(External (EF_builtin "__builtin_va_start" + (mksignature (AST.Xptr :: nil) AST.Xvoid cc_default)) + ((tptr tvoid) :: nil) tvoid cc_default)) :: + (___builtin_va_arg, + Gfun(External (EF_builtin "__builtin_va_arg" + (mksignature (AST.Xptr :: AST.Xint :: nil) AST.Xvoid + cc_default)) ((tptr tvoid) :: tuint :: nil) tvoid + cc_default)) :: + (___builtin_va_copy, + Gfun(External (EF_builtin "__builtin_va_copy" + (mksignature (AST.Xptr :: AST.Xptr :: nil) AST.Xvoid + cc_default)) ((tptr tvoid) :: (tptr tvoid) :: nil) tvoid + cc_default)) :: + (___builtin_va_end, + Gfun(External (EF_builtin "__builtin_va_end" + (mksignature (AST.Xptr :: nil) AST.Xvoid cc_default)) + ((tptr tvoid) :: nil) tvoid cc_default)) :: + (___builtin_unreachable, + Gfun(External (EF_builtin "__builtin_unreachable" + (mksignature nil AST.Xvoid cc_default)) nil tvoid + cc_default)) :: + (___builtin_expect, + Gfun(External (EF_builtin "__builtin_expect" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___builtin_cls, + Gfun(External (EF_builtin "__builtin_cls" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tint :: nil) tint cc_default)) :: + (___builtin_clsl, + Gfun(External (EF_builtin "__builtin_clsl" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tlong :: nil) tint cc_default)) :: + (___builtin_clsll, + Gfun(External (EF_builtin "__builtin_clsll" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tlong :: nil) tint cc_default)) :: + (___builtin_fmadd, + Gfun(External (EF_builtin "__builtin_fmadd" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fmsub, + Gfun(External (EF_builtin "__builtin_fmsub" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fnmadd, + Gfun(External (EF_builtin "__builtin_fnmadd" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fnmsub, + Gfun(External (EF_builtin "__builtin_fnmsub" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fmax, + Gfun(External (EF_builtin "__builtin_fmax" + (mksignature (AST.Xfloat :: AST.Xfloat :: nil) AST.Xfloat + cc_default)) (tdouble :: tdouble :: nil) tdouble + cc_default)) :: + (___builtin_fmin, + Gfun(External (EF_builtin "__builtin_fmin" + (mksignature (AST.Xfloat :: AST.Xfloat :: nil) AST.Xfloat + cc_default)) (tdouble :: tdouble :: nil) tdouble + cc_default)) :: + (___builtin_debug, + Gfun(External (EF_external "__builtin_debug" + (mksignature (AST.Xint :: nil) AST.Xvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) + (tint :: nil) tvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) :: + (_fabs, + Gfun(External (EF_external "fabs" + (mksignature (AST.Xfloat :: nil) AST.Xfloat cc_default)) + (tdouble :: nil) tdouble cc_default)) :: + (_cblas_dasum, Gfun(Internal f_cblas_dasum)) :: nil). + +Definition public_idents : list ident := +(_cblas_dasum :: _fabs :: ___builtin_debug :: ___builtin_fmin :: + ___builtin_fmax :: ___builtin_fnmsub :: ___builtin_fnmadd :: + ___builtin_fmsub :: ___builtin_fmadd :: ___builtin_clsll :: + ___builtin_clsl :: ___builtin_cls :: ___builtin_expect :: + ___builtin_unreachable :: ___builtin_va_end :: ___builtin_va_copy :: + ___builtin_va_arg :: ___builtin_va_start :: ___builtin_membar :: + ___builtin_annot_intval :: ___builtin_annot :: ___builtin_sel :: + ___builtin_memcpy_aligned :: ___builtin_sqrt :: ___builtin_fsqrt :: + ___builtin_fabsf :: ___builtin_fabs :: ___builtin_ctzll :: + ___builtin_ctzl :: ___builtin_ctz :: ___builtin_clzll :: ___builtin_clzl :: + ___builtin_clz :: ___builtin_bswap16 :: ___builtin_bswap32 :: + ___builtin_bswap :: ___builtin_bswap64 :: ___compcert_i64_umulh :: + ___compcert_i64_smulh :: ___compcert_i64_sar :: ___compcert_i64_shr :: + ___compcert_i64_shl :: ___compcert_i64_umod :: ___compcert_i64_smod :: + ___compcert_i64_udiv :: ___compcert_i64_sdiv :: ___compcert_i64_utof :: + ___compcert_i64_stof :: ___compcert_i64_utod :: ___compcert_i64_stod :: + ___compcert_i64_dtou :: ___compcert_i64_dtos :: ___compcert_va_composite :: + ___compcert_va_float64 :: ___compcert_va_int64 :: ___compcert_va_int32 :: + nil). + +Definition prog : Clight.program := + mkprogram composites global_definitions public_idents _main Logic.I. + + diff --git a/C/cblas/spec_dasum.v b/C/cblas/spec_dasum.v new file mode 100644 index 0000000..8c2d6f7 --- /dev/null +++ b/C/cblas/spec_dasum.v @@ -0,0 +1,39 @@ +(** * LAProof.C.cblas.spec_dasum: VST specification of GSL's [cblas_dasum]. *) +(** ** Corresponds to C program [C/cblas/src/dasum.c] (ported from GSL cblas). *) + +(** Sibling of [LAProof.C.cblas.spec_ddot]. [cblas_dasum] computes the sum of + absolute values of a vector; the postcondition relates its result (up to + [feq]) to LAProof's summation model [sumF] applied to [map BABS X], over + which [sum_acc.fSUM] proves the forward-error bound. *) + +Require Import VST.floyd.proofauto. +From vcfloat Require Import FPStdCompCert FPStdLib. +From VSTlib Require Import spec_math. +From LAProof.C Require Import floatlib. +From LAProof.C.cblas Require Import dasum asum_model. +(* for [sumF]; [Require Import] (not [Export]) keeps ssreflect out of scope. *) +Require Import LAProof.accuracy_proofs.sum_model. + +Set Bullet Behavior "Strict Subproofs". + +#[export] Instance CompSpecs : compspecs. make_compspecs prog. Defined. +Definition Vprog : varspecs. mk_varspecs prog. Defined. + +(** ** Funspec for [cblas_dasum] (unit-stride case [incX = 1]). *) +Definition cblas_dasum_spec := + DECLARE _cblas_dasum + WITH shX: share, n: Z, X: list (ftype Tdouble), px: val + PRE [ tint, tptr tdouble, tint ] + PROP (readable_share shX; Zlength X = n; 0 <= n <= Int.max_signed; + Forall finite X) + PARAMS (Vint (Int.repr n); px; Vint (Int.repr 1)) + SEP (data_at shX (tarray tdouble n) (map Vfloat X) px) + POST [ tdouble ] + EX s: ftype Tdouble, + PROP (feq s (sumF (map BABS X))) + RETURN (Vfloat s) + SEP (data_at shX (tarray tdouble n) (map Vfloat X) px). + +Definition DasumASI : funspecs := [ cblas_dasum_spec ]. +Definition dasum_imports : funspecs := [ fabs_spec ]. (* fabs is the one external call *) +Definition Gprog : funspecs := dasum_imports ++ DasumASI. diff --git a/C/cblas/src/dasum.c b/C/cblas/src/dasum.c new file mode 100644 index 0000000..8610871 --- /dev/null +++ b/C/cblas/src/dasum.c @@ -0,0 +1,16 @@ +/* The following GSL headers are commented out for clightgen: they are + unresolvable here (GSL's gsl/ symlink dir is not generated) and contribute + nothing to cblas_dasum's body beyond the libc fabs, which we obtain directly + from (as the other LAProof C sources do, e.g. densemat.c). */ +/* #include */ +/* #include */ +#include +#include "cblas.h" + +double +cblas_dasum (const int N, const double *X, const int incX) +{ +#define BASE double +#include "source_asum_r.h" +#undef BASE +} diff --git a/C/cblas/src/source_asum_r.h b/C/cblas/src/source_asum_r.h new file mode 100644 index 0000000..863619e --- /dev/null +++ b/C/cblas/src/source_asum_r.h @@ -0,0 +1,34 @@ +/* blas/source_asum_r.h + * + * Copyright (C) 1996, 1997, 1998, 1999, 2000 Gerard Jungman + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or (at + * your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + */ + +{ + BASE r = 0.0; + INDEX i; + INDEX ix = 0; + + if (incX <= 0) { + return 0; + } + + for (i = 0; i < N; i++) { + r += fabs(X[ix]); + ix += incX; + } + return r; +} diff --git a/C/cblas/verif_dasum.v b/C/cblas/verif_dasum.v new file mode 100644 index 0000000..4471d55 --- /dev/null +++ b/C/cblas/verif_dasum.v @@ -0,0 +1,60 @@ +(** * LAProof.C.cblas.verif_dasum: VST proof for GSL's [cblas_dasum]. *) +(** ** Corresponds to C program [C/cblas/src/dasum.c]. *) + +(** [body_cblas_dasum] proves that the compiled C [cblas_dasum] (unit stride) + refines LAProof's summation model: its result is [feq]-equal to + [sumF (map BABS X)]. Sibling of [verif_ddot]; the loop invariant carries the + partial accumulation [asum_model (sublist 0 k X)], with the per-step / start / + bridge reasoning in [LAProof.C.cblas.asum_model]. The only structural + additions over [ddot] are an [fabs] [forward_call] in the loop body and the + dead [if (incX<=0) return 0;] guard (unreachable since [incX = 1]). *) + +Require Import VST.floyd.proofauto. +From vcfloat Require Import FPStdCompCert FPStdLib. +From VSTlib Require Import spec_math. +From LAProof.C Require Import floatlib. +From LAProof.C.cblas Require Import dasum asum_model spec_dasum. +Require Import LAProof.accuracy_proofs.sum_model. + +Set Bullet Behavior "Strict Subproofs". + +Lemma body_cblas_dasum: semax_body Vprog Gprog f_cblas_dasum cblas_dasum_spec. +Proof. +start_function. +forward. (* r = 0.0 *) +forward. (* ix = 0 *) +(* if (incX <= 0) return 0; -- dead, incX = 1 *) +forward_if. +{ rep_lia. } (* then-branch: incX <= 0 contradicts incX = 1 *) +(* accumulation loop *) +forward_for_simple_bound (Zlength X) + (EX k:Z, + PROP () + LOCAL (temp _r (Vfloat (asum_model (sublist 0 k X))); + temp _ix (Vint (Int.repr k)); + temp _N (Vint (Int.repr (Zlength X))); + temp _incX (Vint (Int.repr 1)); + temp _X px) + SEP (data_at shX (tarray tdouble (Zlength X)) (map Vfloat X) px))%assert. +- (* entry (k = 0): asum_model nil = +0.0 *) + entailer!!. +- (* body: t = X[ix]; r += fabs(t); ix += incX *) + forward. (* t'2 = X[ix] *) + forward_call (Znth i X). (* t'1 = fabs(X[i]) = BABS (Znth i X) *) + forward. (* r = r + t'1 *) + forward. (* ix = ix + incX *) + entailer!!. + rewrite (asum_model_step X i) by list_solve. + reflexivity. +- (* exit (k = Zlength X): return r = asum_model X *) + forward. (* return r *) + Exists (asum_model (sublist 0 (Zlength X) X)). + entailer!!. + rewrite sublist_same by lia. + apply asum_model_feq_sumF. +Qed. + +(** Payoff (stub): because the result is [feq]-equal to [sumF (map BABS X)], the + forward-error bound [LAProof.accuracy_proofs.sum_acc.fSUM] (applied to the + list [map BABS X]) bounds the C result's error whenever that result is finite + (the no-overflow hypothesis [Hfin]; see [C/cblas/README.md]). *) diff --git a/Makefile b/Makefile index 87f7aa3..07a1c4c 100644 --- a/Makefile +++ b/Makefile @@ -20,7 +20,7 @@ clean: if [ -e Makefile.coq ]; then $(MAKE) -f Makefile.coq cleanall; fi $(RM) Makefile.coq Makefile.coq.conf $(RM) C/sparse.v C/cholesky.v C/alloc.v C/densemat.v C/bandmat.v - $(RM) C/cblas/ddot.v + $(RM) C/cblas/ddot.v C/cblas/dasum.v INSTALLFILES1 ?= $(shell awk '/accuracy_proofs/{print $$NF"o"}' _CoqProject) diff --git a/_CoqProject b/_CoqProject index 86abee6..87393e6 100644 --- a/_CoqProject +++ b/_CoqProject @@ -52,3 +52,8 @@ C/cblas/ddot.v C/cblas/ddot_model.v C/cblas/spec_ddot.v C/cblas/verif_ddot.v + +C/cblas/dasum.v +C/cblas/asum_model.v +C/cblas/spec_dasum.v +C/cblas/verif_dasum.v From b38730706bf7063576e00c4d873b18eb5d2d994e Mon Sep 17 00:00:00 2001 From: Ariel Kellison Date: Sat, 30 May 2026 21:33:10 -0400 Subject: [PATCH 3/8] Add verif of dscal --- C/cblas/README.md | 20 +- C/cblas/dscal.v | 400 ++++++++++++++++++++++++++++++++++++ C/cblas/scal_model.v | 35 ++++ C/cblas/spec_ddot.v | 4 +- C/cblas/spec_dscal.v | 35 ++++ C/cblas/src/dscal.c | 15 ++ C/cblas/src/source_scal_r.h | 32 +++ C/cblas/verif_dscal.v | 84 ++++++++ Makefile | 2 +- _CoqProject | 5 + 10 files changed, 620 insertions(+), 12 deletions(-) create mode 100644 C/cblas/dscal.v create mode 100644 C/cblas/scal_model.v create mode 100644 C/cblas/spec_dscal.v create mode 100644 C/cblas/src/dscal.c create mode 100644 C/cblas/src/source_scal_r.h create mode 100644 C/cblas/verif_dscal.v diff --git a/C/cblas/README.md b/C/cblas/README.md index 6935b72..6e2fb3e 100644 --- a/C/cblas/README.md +++ b/C/cblas/README.md @@ -84,17 +84,21 @@ copyright and license headers. |-----------|-----------|----------|-------| | Dot product | double | `cblas_ddot` | [`src/ddot.c`](src/ddot.c), [`src/source_dot_r.h`](src/source_dot_r.h), [`include/cblas.h`](include/cblas.h), [`ddot.v`](ddot.v), [`ddot_model.v`](ddot_model.v), [`spec_ddot.v`](spec_ddot.v), [`verif_ddot.v`](verif_ddot.v) | | Sum of absolute values | double | `cblas_dasum` | [`src/dasum.c`](src/dasum.c), [`src/source_asum_r.h`](src/source_asum_r.h), [`include/cblas.h`](include/cblas.h), [`dasum.v`](dasum.v), [`asum_model.v`](asum_model.v), [`spec_dasum.v`](spec_dasum.v), [`verif_dasum.v`](verif_dasum.v) | - -**Scope limits (`cblas_ddot`, `cblas_dasum`):** -- **Unit stride only** (`incX = incY = 1`). General strides are out of scope: LAProof has - no strided/gather separation-logic predicate, and the unit-stride case is the one that - maps directly onto the list-based models. -- The postcondition is stated up to `feq` (IEEE-equality that identifies `+0.0`/`-0.0` and - NaN payloads), because the C accumulation adds the accumulator first while the models fold - the new term first, and IEEE addition is commutative only up to `feq`. +| Scalar multiply (in place) | double | `cblas_dscal` | [`src/dscal.c`](src/dscal.c), [`src/source_scal_r.h`](src/source_scal_r.h), [`include/cblas.h`](include/cblas.h), [`dscal.v`](dscal.v), [`scal_model.v`](scal_model.v), [`spec_dscal.v`](spec_dscal.v), [`verif_dscal.v`](verif_dscal.v) | + +**Scope limits (`cblas_ddot`, `cblas_dasum`, `cblas_dscal`):** +- **Unit stride only** (`incX = incY = 1`), as a deliberate first milestone. +- For the **reductions** (`ddot`, `dasum`) the postcondition is stated up to `feq` + (IEEE-equality that identifies `+0.0`/`-0.0` and NaN payloads), because the C accumulation + adds the accumulator first while the models fold the new term first, and IEEE addition is + commutative only up to `feq`. - `cblas_dasum` is verified against `sumF (map BABS X)`; its loop body calls `fabs`, discharged with VSTlib's `fabs_spec` (whose result is `BABS`). The matching accuracy theorem is `sum_acc.fSUM`. +- `cblas_dscal` is the first **in-place** routine: it overwrites the array (writable share, + `void` return) and is verified against an *exact* model `scal_model alpha X = + map (fun x => BMULT x alpha) X` (no `feq` — scaling is deterministic elementwise). The + per-element accuracy is a direct unit-roundoff bound on `BMULT`. ## Conventions diff --git a/C/cblas/dscal.v b/C/cblas/dscal.v new file mode 100644 index 0000000..a24cb57 --- /dev/null +++ b/C/cblas/dscal.v @@ -0,0 +1,400 @@ +From Coq Require Import String List ZArith. +From compcert Require Import Coqlib Integers Floats AST Ctypes Cop Clight Clightdefs. +Import Clightdefs.ClightNotations. +Local Open Scope Z_scope. +Local Open Scope string_scope. +Local Open Scope clight_scope. + +Module Info. + Definition version := "3.17". + Definition build_number := "". + Definition build_tag := "". + Definition build_branch := "". + Definition arch := "aarch64". + Definition model := "default". + Definition abi := "apple". + Definition bitsize := 64. + Definition big_endian := false. + Definition source_file := "C/cblas/src/dscal.c". + Definition normalized := true. +End Info. + +Definition _N : ident := $"N". +Definition _X : ident := $"X". +Definition ___builtin_annot : ident := $"__builtin_annot". +Definition ___builtin_annot_intval : ident := $"__builtin_annot_intval". +Definition ___builtin_bswap : ident := $"__builtin_bswap". +Definition ___builtin_bswap16 : ident := $"__builtin_bswap16". +Definition ___builtin_bswap32 : ident := $"__builtin_bswap32". +Definition ___builtin_bswap64 : ident := $"__builtin_bswap64". +Definition ___builtin_cls : ident := $"__builtin_cls". +Definition ___builtin_clsl : ident := $"__builtin_clsl". +Definition ___builtin_clsll : ident := $"__builtin_clsll". +Definition ___builtin_clz : ident := $"__builtin_clz". +Definition ___builtin_clzl : ident := $"__builtin_clzl". +Definition ___builtin_clzll : ident := $"__builtin_clzll". +Definition ___builtin_ctz : ident := $"__builtin_ctz". +Definition ___builtin_ctzl : ident := $"__builtin_ctzl". +Definition ___builtin_ctzll : ident := $"__builtin_ctzll". +Definition ___builtin_debug : ident := $"__builtin_debug". +Definition ___builtin_expect : ident := $"__builtin_expect". +Definition ___builtin_fabs : ident := $"__builtin_fabs". +Definition ___builtin_fabsf : ident := $"__builtin_fabsf". +Definition ___builtin_fmadd : ident := $"__builtin_fmadd". +Definition ___builtin_fmax : ident := $"__builtin_fmax". +Definition ___builtin_fmin : ident := $"__builtin_fmin". +Definition ___builtin_fmsub : ident := $"__builtin_fmsub". +Definition ___builtin_fnmadd : ident := $"__builtin_fnmadd". +Definition ___builtin_fnmsub : ident := $"__builtin_fnmsub". +Definition ___builtin_fsqrt : ident := $"__builtin_fsqrt". +Definition ___builtin_membar : ident := $"__builtin_membar". +Definition ___builtin_memcpy_aligned : ident := $"__builtin_memcpy_aligned". +Definition ___builtin_sel : ident := $"__builtin_sel". +Definition ___builtin_sqrt : ident := $"__builtin_sqrt". +Definition ___builtin_unreachable : ident := $"__builtin_unreachable". +Definition ___builtin_va_arg : ident := $"__builtin_va_arg". +Definition ___builtin_va_copy : ident := $"__builtin_va_copy". +Definition ___builtin_va_end : ident := $"__builtin_va_end". +Definition ___builtin_va_start : ident := $"__builtin_va_start". +Definition ___compcert_i64_dtos : ident := $"__compcert_i64_dtos". +Definition ___compcert_i64_dtou : ident := $"__compcert_i64_dtou". +Definition ___compcert_i64_sar : ident := $"__compcert_i64_sar". +Definition ___compcert_i64_sdiv : ident := $"__compcert_i64_sdiv". +Definition ___compcert_i64_shl : ident := $"__compcert_i64_shl". +Definition ___compcert_i64_shr : ident := $"__compcert_i64_shr". +Definition ___compcert_i64_smod : ident := $"__compcert_i64_smod". +Definition ___compcert_i64_smulh : ident := $"__compcert_i64_smulh". +Definition ___compcert_i64_stod : ident := $"__compcert_i64_stod". +Definition ___compcert_i64_stof : ident := $"__compcert_i64_stof". +Definition ___compcert_i64_udiv : ident := $"__compcert_i64_udiv". +Definition ___compcert_i64_umod : ident := $"__compcert_i64_umod". +Definition ___compcert_i64_umulh : ident := $"__compcert_i64_umulh". +Definition ___compcert_i64_utod : ident := $"__compcert_i64_utod". +Definition ___compcert_i64_utof : ident := $"__compcert_i64_utof". +Definition ___compcert_va_composite : ident := $"__compcert_va_composite". +Definition ___compcert_va_float64 : ident := $"__compcert_va_float64". +Definition ___compcert_va_int32 : ident := $"__compcert_va_int32". +Definition ___compcert_va_int64 : ident := $"__compcert_va_int64". +Definition _alpha : ident := $"alpha". +Definition _cblas_dscal : ident := $"cblas_dscal". +Definition _i : ident := $"i". +Definition _incX : ident := $"incX". +Definition _ix : ident := $"ix". +Definition _main : ident := $"main". +Definition _t'1 : ident := 128%positive. + +Definition f_cblas_dscal := {| + fn_return := tvoid; + fn_callconv := cc_default; + fn_params := ((_N, tint) :: (_alpha, tdouble) :: (_X, (tptr tdouble)) :: + (_incX, tint) :: nil); + fn_vars := nil; + fn_temps := ((_i, tint) :: (_ix, tint) :: (_t'1, tdouble) :: nil); + fn_body := +(Ssequence + (Sset _ix (Econst_int (Int.repr 0) tint)) + (Ssequence + (Sifthenelse (Ebinop Ole (Etempvar _incX tint) + (Econst_int (Int.repr 0) tint) tint) + (Sreturn None) + Sskip) + (Ssequence + (Sset _i (Econst_int (Int.repr 0) tint)) + (Sloop + (Ssequence + (Sifthenelse (Ebinop Olt (Etempvar _i tint) (Etempvar _N tint) + tint) + Sskip + Sbreak) + (Ssequence + (Ssequence + (Sset _t'1 + (Ederef + (Ebinop Oadd (Etempvar _X (tptr tdouble)) + (Etempvar _ix tint) (tptr tdouble)) tdouble)) + (Sassign + (Ederef + (Ebinop Oadd (Etempvar _X (tptr tdouble)) + (Etempvar _ix tint) (tptr tdouble)) tdouble) + (Ebinop Omul (Etempvar _t'1 tdouble) + (Etempvar _alpha tdouble) tdouble))) + (Sset _ix + (Ebinop Oadd (Etempvar _ix tint) (Etempvar _incX tint) tint)))) + (Sset _i + (Ebinop Oadd (Etempvar _i tint) (Econst_int (Int.repr 1) tint) + tint)))))) +|}. + +Definition composites : list composite_definition := +nil. + +Definition global_definitions : list (ident * globdef fundef type) := +((___compcert_va_int32, + Gfun(External (EF_runtime "__compcert_va_int32" + (mksignature (AST.Xptr :: nil) AST.Xint cc_default)) + ((tptr tvoid) :: nil) tuint cc_default)) :: + (___compcert_va_int64, + Gfun(External (EF_runtime "__compcert_va_int64" + (mksignature (AST.Xptr :: nil) AST.Xlong cc_default)) + ((tptr tvoid) :: nil) tulong cc_default)) :: + (___compcert_va_float64, + Gfun(External (EF_runtime "__compcert_va_float64" + (mksignature (AST.Xptr :: nil) AST.Xfloat cc_default)) + ((tptr tvoid) :: nil) tdouble cc_default)) :: + (___compcert_va_composite, + Gfun(External (EF_runtime "__compcert_va_composite" + (mksignature (AST.Xptr :: AST.Xlong :: nil) AST.Xptr + cc_default)) ((tptr tvoid) :: tulong :: nil) + (tptr tvoid) cc_default)) :: + (___compcert_i64_dtos, + Gfun(External (EF_runtime "__compcert_i64_dtos" + (mksignature (AST.Xfloat :: nil) AST.Xlong cc_default)) + (tdouble :: nil) tlong cc_default)) :: + (___compcert_i64_dtou, + Gfun(External (EF_runtime "__compcert_i64_dtou" + (mksignature (AST.Xfloat :: nil) AST.Xlong cc_default)) + (tdouble :: nil) tulong cc_default)) :: + (___compcert_i64_stod, + Gfun(External (EF_runtime "__compcert_i64_stod" + (mksignature (AST.Xlong :: nil) AST.Xfloat cc_default)) + (tlong :: nil) tdouble cc_default)) :: + (___compcert_i64_utod, + Gfun(External (EF_runtime "__compcert_i64_utod" + (mksignature (AST.Xlong :: nil) AST.Xfloat cc_default)) + (tulong :: nil) tdouble cc_default)) :: + (___compcert_i64_stof, + Gfun(External (EF_runtime "__compcert_i64_stof" + (mksignature (AST.Xlong :: nil) AST.Xsingle cc_default)) + (tlong :: nil) tfloat cc_default)) :: + (___compcert_i64_utof, + Gfun(External (EF_runtime "__compcert_i64_utof" + (mksignature (AST.Xlong :: nil) AST.Xsingle cc_default)) + (tulong :: nil) tfloat cc_default)) :: + (___compcert_i64_sdiv, + Gfun(External (EF_runtime "__compcert_i64_sdiv" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___compcert_i64_udiv, + Gfun(External (EF_runtime "__compcert_i64_udiv" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tulong :: tulong :: nil) tulong + cc_default)) :: + (___compcert_i64_smod, + Gfun(External (EF_runtime "__compcert_i64_smod" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___compcert_i64_umod, + Gfun(External (EF_runtime "__compcert_i64_umod" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tulong :: tulong :: nil) tulong + cc_default)) :: + (___compcert_i64_shl, + Gfun(External (EF_runtime "__compcert_i64_shl" + (mksignature (AST.Xlong :: AST.Xint :: nil) AST.Xlong + cc_default)) (tlong :: tint :: nil) tlong cc_default)) :: + (___compcert_i64_shr, + Gfun(External (EF_runtime "__compcert_i64_shr" + (mksignature (AST.Xlong :: AST.Xint :: nil) AST.Xlong + cc_default)) (tulong :: tint :: nil) tulong cc_default)) :: + (___compcert_i64_sar, + Gfun(External (EF_runtime "__compcert_i64_sar" + (mksignature (AST.Xlong :: AST.Xint :: nil) AST.Xlong + cc_default)) (tlong :: tint :: nil) tlong cc_default)) :: + (___compcert_i64_smulh, + Gfun(External (EF_runtime "__compcert_i64_smulh" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___compcert_i64_umulh, + Gfun(External (EF_runtime "__compcert_i64_umulh" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tulong :: tulong :: nil) tulong + cc_default)) :: + (___builtin_bswap64, + Gfun(External (EF_builtin "__builtin_bswap64" + (mksignature (AST.Xlong :: nil) AST.Xlong cc_default)) + (tulong :: nil) tulong cc_default)) :: + (___builtin_bswap, + Gfun(External (EF_builtin "__builtin_bswap" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tuint cc_default)) :: + (___builtin_bswap32, + Gfun(External (EF_builtin "__builtin_bswap32" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tuint cc_default)) :: + (___builtin_bswap16, + Gfun(External (EF_builtin "__builtin_bswap16" + (mksignature (AST.Xint16unsigned :: nil) + AST.Xint16unsigned cc_default)) (tushort :: nil) tushort + cc_default)) :: + (___builtin_clz, + Gfun(External (EF_builtin "__builtin_clz" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tint cc_default)) :: + (___builtin_clzl, + Gfun(External (EF_builtin "__builtin_clzl" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_clzll, + Gfun(External (EF_builtin "__builtin_clzll" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_ctz, + Gfun(External (EF_builtin "__builtin_ctz" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tuint :: nil) tint cc_default)) :: + (___builtin_ctzl, + Gfun(External (EF_builtin "__builtin_ctzl" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_ctzll, + Gfun(External (EF_builtin "__builtin_ctzll" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tulong :: nil) tint cc_default)) :: + (___builtin_fabs, + Gfun(External (EF_builtin "__builtin_fabs" + (mksignature (AST.Xfloat :: nil) AST.Xfloat cc_default)) + (tdouble :: nil) tdouble cc_default)) :: + (___builtin_fabsf, + Gfun(External (EF_builtin "__builtin_fabsf" + (mksignature (AST.Xsingle :: nil) AST.Xsingle cc_default)) + (tfloat :: nil) tfloat cc_default)) :: + (___builtin_fsqrt, + Gfun(External (EF_builtin "__builtin_fsqrt" + (mksignature (AST.Xfloat :: nil) AST.Xfloat cc_default)) + (tdouble :: nil) tdouble cc_default)) :: + (___builtin_sqrt, + Gfun(External (EF_builtin "__builtin_sqrt" + (mksignature (AST.Xfloat :: nil) AST.Xfloat cc_default)) + (tdouble :: nil) tdouble cc_default)) :: + (___builtin_memcpy_aligned, + Gfun(External (EF_builtin "__builtin_memcpy_aligned" + (mksignature + (AST.Xptr :: AST.Xptr :: AST.Xlong :: AST.Xlong :: nil) + AST.Xvoid cc_default)) + ((tptr tvoid) :: (tptr tvoid) :: tulong :: tulong :: nil) tvoid + cc_default)) :: + (___builtin_sel, + Gfun(External (EF_builtin "__builtin_sel" + (mksignature (AST.Xbool :: nil) AST.Xvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) + (tbool :: nil) tvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) :: + (___builtin_annot, + Gfun(External (EF_builtin "__builtin_annot" + (mksignature (AST.Xptr :: nil) AST.Xvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) + ((tptr tschar) :: nil) tvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) :: + (___builtin_annot_intval, + Gfun(External (EF_builtin "__builtin_annot_intval" + (mksignature (AST.Xptr :: AST.Xint :: nil) AST.Xint + cc_default)) ((tptr tschar) :: tint :: nil) tint + cc_default)) :: + (___builtin_membar, + Gfun(External (EF_builtin "__builtin_membar" + (mksignature nil AST.Xvoid cc_default)) nil tvoid + cc_default)) :: + (___builtin_va_start, + Gfun(External (EF_builtin "__builtin_va_start" + (mksignature (AST.Xptr :: nil) AST.Xvoid cc_default)) + ((tptr tvoid) :: nil) tvoid cc_default)) :: + (___builtin_va_arg, + Gfun(External (EF_builtin "__builtin_va_arg" + (mksignature (AST.Xptr :: AST.Xint :: nil) AST.Xvoid + cc_default)) ((tptr tvoid) :: tuint :: nil) tvoid + cc_default)) :: + (___builtin_va_copy, + Gfun(External (EF_builtin "__builtin_va_copy" + (mksignature (AST.Xptr :: AST.Xptr :: nil) AST.Xvoid + cc_default)) ((tptr tvoid) :: (tptr tvoid) :: nil) tvoid + cc_default)) :: + (___builtin_va_end, + Gfun(External (EF_builtin "__builtin_va_end" + (mksignature (AST.Xptr :: nil) AST.Xvoid cc_default)) + ((tptr tvoid) :: nil) tvoid cc_default)) :: + (___builtin_unreachable, + Gfun(External (EF_builtin "__builtin_unreachable" + (mksignature nil AST.Xvoid cc_default)) nil tvoid + cc_default)) :: + (___builtin_expect, + Gfun(External (EF_builtin "__builtin_expect" + (mksignature (AST.Xlong :: AST.Xlong :: nil) AST.Xlong + cc_default)) (tlong :: tlong :: nil) tlong cc_default)) :: + (___builtin_cls, + Gfun(External (EF_builtin "__builtin_cls" + (mksignature (AST.Xint :: nil) AST.Xint cc_default)) + (tint :: nil) tint cc_default)) :: + (___builtin_clsl, + Gfun(External (EF_builtin "__builtin_clsl" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tlong :: nil) tint cc_default)) :: + (___builtin_clsll, + Gfun(External (EF_builtin "__builtin_clsll" + (mksignature (AST.Xlong :: nil) AST.Xint cc_default)) + (tlong :: nil) tint cc_default)) :: + (___builtin_fmadd, + Gfun(External (EF_builtin "__builtin_fmadd" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fmsub, + Gfun(External (EF_builtin "__builtin_fmsub" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fnmadd, + Gfun(External (EF_builtin "__builtin_fnmadd" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fnmsub, + Gfun(External (EF_builtin "__builtin_fnmsub" + (mksignature + (AST.Xfloat :: AST.Xfloat :: AST.Xfloat :: nil) + AST.Xfloat cc_default)) + (tdouble :: tdouble :: tdouble :: nil) tdouble cc_default)) :: + (___builtin_fmax, + Gfun(External (EF_builtin "__builtin_fmax" + (mksignature (AST.Xfloat :: AST.Xfloat :: nil) AST.Xfloat + cc_default)) (tdouble :: tdouble :: nil) tdouble + cc_default)) :: + (___builtin_fmin, + Gfun(External (EF_builtin "__builtin_fmin" + (mksignature (AST.Xfloat :: AST.Xfloat :: nil) AST.Xfloat + cc_default)) (tdouble :: tdouble :: nil) tdouble + cc_default)) :: + (___builtin_debug, + Gfun(External (EF_external "__builtin_debug" + (mksignature (AST.Xint :: nil) AST.Xvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) + (tint :: nil) tvoid + {|cc_vararg:=(Some 1); cc_unproto:=false; cc_structret:=false|})) :: + (_cblas_dscal, Gfun(Internal f_cblas_dscal)) :: nil). + +Definition public_idents : list ident := +(_cblas_dscal :: ___builtin_debug :: ___builtin_fmin :: ___builtin_fmax :: + ___builtin_fnmsub :: ___builtin_fnmadd :: ___builtin_fmsub :: + ___builtin_fmadd :: ___builtin_clsll :: ___builtin_clsl :: ___builtin_cls :: + ___builtin_expect :: ___builtin_unreachable :: ___builtin_va_end :: + ___builtin_va_copy :: ___builtin_va_arg :: ___builtin_va_start :: + ___builtin_membar :: ___builtin_annot_intval :: ___builtin_annot :: + ___builtin_sel :: ___builtin_memcpy_aligned :: ___builtin_sqrt :: + ___builtin_fsqrt :: ___builtin_fabsf :: ___builtin_fabs :: + ___builtin_ctzll :: ___builtin_ctzl :: ___builtin_ctz :: ___builtin_clzll :: + ___builtin_clzl :: ___builtin_clz :: ___builtin_bswap16 :: + ___builtin_bswap32 :: ___builtin_bswap :: ___builtin_bswap64 :: + ___compcert_i64_umulh :: ___compcert_i64_smulh :: ___compcert_i64_sar :: + ___compcert_i64_shr :: ___compcert_i64_shl :: ___compcert_i64_umod :: + ___compcert_i64_smod :: ___compcert_i64_udiv :: ___compcert_i64_sdiv :: + ___compcert_i64_utof :: ___compcert_i64_stof :: ___compcert_i64_utod :: + ___compcert_i64_stod :: ___compcert_i64_dtou :: ___compcert_i64_dtos :: + ___compcert_va_composite :: ___compcert_va_float64 :: + ___compcert_va_int64 :: ___compcert_va_int32 :: nil). + +Definition prog : Clight.program := + mkprogram composites global_definitions public_idents _main Logic.I. + + diff --git a/C/cblas/scal_model.v b/C/cblas/scal_model.v new file mode 100644 index 0000000..9d7df4b --- /dev/null +++ b/C/cblas/scal_model.v @@ -0,0 +1,35 @@ +(** * LAProof.C.cblas.scal_model: functional model of GSL's [cblas_dscal]. *) +(** ** Corresponds to C program [C/cblas/src/dscal.c] (ported from GSL cblas). *) + +(** Unlike the reduction routines ([ddot]/[dasum]), scaling is a deterministic + *elementwise* update with no accumulation, so the C computes the model + exactly (no [feq] needed). GSL's kernel ([source_scal_r.h]) is +<< + for (i = 0; i < N; i++) { X[ix] *= alpha; ix += incX; } +>> + and [X[ix] *= alpha] is [BMULT (X[ix]) alpha] (element-first), so the model + is the C-faithful list map below. + + LAProof's existing scale model [mv_mathcomp.scalemx] is alpha-first + ([map_mx (BMULT a)]) and matrix-based; relating to it would need a [BMULT] + commutativity lemma (not in LAProof) plus a vector/matrix reshape. That is + unnecessary here: the per-element accuracy (relative error <= unit roundoff) + is exactly vcfloat's correctly-rounded-[BMULT] bound. *) + +Require Import VST.floyd.proofauto. +From vcfloat Require Import FPStdCompCert FPStdLib. +From LAProof.C Require Import floatlib. + +Set Bullet Behavior "Strict Subproofs". + +Definition scal_model (alpha: ftype Tdouble) (X: list (ftype Tdouble)) : list (ftype Tdouble) := + map (fun x => BMULT x alpha) X. + +Lemma Zlength_scal_model: forall alpha X, + Zlength (scal_model alpha X) = Zlength X. +Proof. intros. unfold scal_model. rewrite Zlength_map. reflexivity. Qed. + +Lemma Znth_scal_model: forall alpha X k, + 0 <= k < Zlength X -> + Znth k (scal_model alpha X) = BMULT (Znth k X) alpha. +Proof. intros. unfold scal_model. rewrite Znth_map by lia. reflexivity. Qed. diff --git a/C/cblas/spec_ddot.v b/C/cblas/spec_ddot.v index bfa1e5e..c3ace94 100644 --- a/C/cblas/spec_ddot.v +++ b/C/cblas/spec_ddot.v @@ -42,9 +42,7 @@ Definition Vprog : varspecs. mk_varspecs prog. Defined. [spec_sparse] split). This file states only the VST [funspec]. *) (** ** Funspec for [cblas_ddot] (unit-stride case [incX = incY = 1]). *) -(** General strides are out of scope: LAProof has no strided/gather - separation-logic predicate, and the unit-stride case is the one that maps - directly onto the list-based [dotprodF] model. *) +(** Unit stride is a deliberate first milestone, not a fundamental limit. *) Definition cblas_ddot_spec := DECLARE _cblas_ddot WITH shX: share, shY: share, n: Z, diff --git a/C/cblas/spec_dscal.v b/C/cblas/spec_dscal.v new file mode 100644 index 0000000..c2db868 --- /dev/null +++ b/C/cblas/spec_dscal.v @@ -0,0 +1,35 @@ +(** * LAProof.C.cblas.spec_dscal: VST specification of GSL's [cblas_dscal]. *) +(** ** Corresponds to C program [C/cblas/src/dscal.c] (ported from GSL cblas). *) + +(** [cblas_dscal] scales a vector in place by [alpha]. Because scaling is + deterministic elementwise, the postcondition is stated as an *exact* equality + (the array is left holding [scal_model alpha X]); no [feq]/[EX] is needed, + unlike the reduction routines [ddot]/[dasum]. *) + +Require Import VST.floyd.proofauto. +From vcfloat Require Import FPStdCompCert FPStdLib. +From LAProof.C Require Import floatlib. +From LAProof.C.cblas Require Import dscal scal_model. + +Set Bullet Behavior "Strict Subproofs". + +#[export] Instance CompSpecs : compspecs. make_compspecs prog. Defined. +Definition Vprog : varspecs. mk_varspecs prog. Defined. + +(** ** Funspec for [cblas_dscal] (unit-stride case [incX = 1]). *) +(** Unit stride is a first milestone, not a fundamental limit.*) +Definition cblas_dscal_spec := + DECLARE _cblas_dscal + WITH sh: share, n: Z, alpha: ftype Tdouble, X: list (ftype Tdouble), px: val + PRE [ tint, tdouble, tptr tdouble, tint ] + PROP (writable_share sh; Zlength X = n; 0 <= n <= Int.max_signed) + PARAMS (Vint (Int.repr n); Vfloat alpha; px; Vint (Int.repr 1)) + SEP (data_at sh (tarray tdouble n) (map Vfloat X) px) + POST [ tvoid ] + PROP () + RETURN () + SEP (data_at sh (tarray tdouble n) (map Vfloat (scal_model alpha X)) px). + +Definition DscalASI : funspecs := [ cblas_dscal_spec ]. +Definition dscal_imports : funspecs := []. (* no external calls *) +Definition Gprog : funspecs := dscal_imports ++ DscalASI. diff --git a/C/cblas/src/dscal.c b/C/cblas/src/dscal.c new file mode 100644 index 0000000..242ad30 --- /dev/null +++ b/C/cblas/src/dscal.c @@ -0,0 +1,15 @@ +/* The following GSL headers are commented out for clightgen: they are + unresolvable here (GSL's gsl/ symlink dir is not generated) and contribute + nothing to cblas_dscal's body (only INDEX from "cblas.h" is needed; the loop + uses no libc math). */ +/* #include */ +/* #include */ +#include "cblas.h" + +void +cblas_dscal (const int N, const double alpha, double *X, const int incX) +{ +#define BASE double +#include "source_scal_r.h" +#undef BASE +} diff --git a/C/cblas/src/source_scal_r.h b/C/cblas/src/source_scal_r.h new file mode 100644 index 0000000..81d2300 --- /dev/null +++ b/C/cblas/src/source_scal_r.h @@ -0,0 +1,32 @@ +/* blas/source_scal_r.h + * + * Copyright (C) 1996, 1997, 1998, 1999, 2000 Gerard Jungman + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or (at + * your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + */ + +{ + INDEX i; + INDEX ix = 0; + + if (incX <= 0) { + return; + } + + for (i = 0; i < N; i++) { + X[ix] *= alpha; + ix += incX; + } +} diff --git a/C/cblas/verif_dscal.v b/C/cblas/verif_dscal.v new file mode 100644 index 0000000..262f313 --- /dev/null +++ b/C/cblas/verif_dscal.v @@ -0,0 +1,84 @@ +(** * LAProof.C.cblas.verif_dscal: VST proof for GSL's [cblas_dscal]. *) +(** ** Corresponds to C program [C/cblas/src/dscal.c]. *) + +(** [body_cblas_dscal] proves that after the compiled C [cblas_dscal] (unit + stride) the array holds exactly [scal_model alpha X] (= [map (fun x => BMULT + x alpha) X]). First in-place routine in this directory: a writable array + with a store in the loop, whose invariant is "scaled prefix ++ original + suffix"; the per-step [upd_Znth]/[sublist] identity is discharged by + [list_solve]. *) + +Require Import VST.floyd.proofauto. +From vcfloat Require Import FPStdCompCert FPStdLib. +From LAProof.C Require Import floatlib. +From LAProof.C.cblas Require Import dscal scal_model spec_dscal. + +Set Bullet Behavior "Strict Subproofs". + +Lemma body_cblas_dscal: semax_body Vprog Gprog f_cblas_dscal cblas_dscal_spec. +Proof. +start_function. +forward. (* ix = 0 *) +forward_if. +{ rep_lia. } (* dead: incX <= 0 contradicts incX = 1 *) +(* Invariant: the array holds a list [L] of length [Zlength X] whose first [k] + entries are already scaled ([Znth j X * alpha]) and whose remaining entries + are still the original [X]. Keeping the contents as [map Vfloat L] for a + *plain* list [L] lets the load and store simplify; the pointwise [Znth] + constraints make each step a simple [upd_Znth_same]/[upd_Znth_diff] case. *) +forward_for_simple_bound (Zlength X) + (EX k:Z, EX L:list (ftype Tdouble), + PROP (Zlength L = Zlength X; + forall j, 0 <= j < k -> Znth j L = (Znth j X * alpha)%F64; + forall j, k <= j < Zlength X -> Znth j L = Znth j X) + LOCAL (temp _alpha (Vfloat alpha); + temp _ix (Vint (Int.repr k)); + temp _N (Vint (Int.repr (Zlength X))); + temp _incX (Vint (Int.repr 1)); + temp _X px) + SEP (data_at sh (tarray tdouble (Zlength X)) (map Vfloat L) px))%assert. +- (* entry (k = 0): L := X *) + Exists X. entailer!!. +- (* body: t = X[ix]; X[ix] = t * alpha; ix += incX *) + Intros. + (* Re-assert the invariant facts under explicit names so they survive the + [forward]s (which clear the auto-named PROP hypotheses). *) + assert (Hlen: Zlength L = Zlength X) by assumption. + assert (Hpre: forall j, 0 <= j < i -> Znth j L = (Znth j X * alpha)%F64) by assumption. + assert (Hsuf: forall j, i <= j < Zlength X -> Znth j L = Znth j X) by assumption. + assert (Hi: 0 <= i < Zlength L) by lia. + assert (HLX: Znth i L = Znth i X) by (apply Hsuf; lia). + forward. (* t'1 = X[ix] *) + forward. (* X[ix] = t'1 * alpha *) + forward. (* ix += incX *) + Exists (upd_Znth i L (BMULT (Znth i X) alpha)). + entailer!!. + 2: { (* SEP: the stored value [X[i]*alpha] is the model element [BMULT .. alpha] *) + apply derives_refl'. f_equal. rewrite <- upd_Znth_map. f_equal. + rewrite ?Znth_map by lia. + replace (@Znth float Inhabitant_float i L) + with (@Znth float Inhabitant_float i X) by (symmetry; apply HLX). + reflexivity. } + (* the three pointwise list invariants for the updated list *) + split3. + { list_solve. } + { intros j Hj. destruct (Z.eq_dec j i) as [->|Hne]. + { rewrite upd_Znth_same by lia. reflexivity. } + { rewrite upd_Znth_diff by lia. apply Hpre; lia. } } + { intros j Hj. rewrite upd_Znth_diff by lia. apply Hsuf; lia. } +- (* exit (k = Zlength X): the whole array now holds [scal_model alpha X] *) + Intros L. + assert (Hpre: forall j, 0 <= j < Zlength X -> Znth j L = (Znth j X * alpha)%F64) + by assumption. + assert (Hlen: Zlength L = Zlength X) by assumption. + assert (L = scal_model alpha X). { + apply Znth_eq_ext; [ rewrite Zlength_scal_model; lia | ]. + intros j Hj. rewrite Hlen in Hj. + rewrite Znth_scal_model by lia. rewrite Hpre by lia. reflexivity. } + subst L. entailer!!. +Qed. + +(** Payoff: each element [BMULT (Znth k X) alpha] is the correctly-rounded + product, so its relative error vs the exact product is bounded by the unit + roundoff (vcfloat's [BMULT] accuracy lemma); there is no accumulation, hence + no global/no-overflow hypothesis as in the reduction routines. *) diff --git a/Makefile b/Makefile index 07a1c4c..99355c4 100644 --- a/Makefile +++ b/Makefile @@ -20,7 +20,7 @@ clean: if [ -e Makefile.coq ]; then $(MAKE) -f Makefile.coq cleanall; fi $(RM) Makefile.coq Makefile.coq.conf $(RM) C/sparse.v C/cholesky.v C/alloc.v C/densemat.v C/bandmat.v - $(RM) C/cblas/ddot.v C/cblas/dasum.v + $(RM) C/cblas/ddot.v C/cblas/dasum.v C/cblas/dscal.v INSTALLFILES1 ?= $(shell awk '/accuracy_proofs/{print $$NF"o"}' _CoqProject) diff --git a/_CoqProject b/_CoqProject index 87393e6..f5e4fb2 100644 --- a/_CoqProject +++ b/_CoqProject @@ -57,3 +57,8 @@ C/cblas/dasum.v C/cblas/asum_model.v C/cblas/spec_dasum.v C/cblas/verif_dasum.v + +C/cblas/dscal.v +C/cblas/scal_model.v +C/cblas/spec_dscal.v +C/cblas/verif_dscal.v From 9d6267d2f88feb5328563c206d8a4b9859ff4c25 Mon Sep 17 00:00:00 2001 From: Ariel Kellison Date: Sun, 31 May 2026 08:56:35 -0400 Subject: [PATCH 4/8] Nits: model docstrings --- C/cblas/asum_model.v | 26 +++++++++++++------------- C/cblas/ddot_model.v | 44 +++++++++++++++++++++----------------------- C/cblas/scal_model.v | 12 +++++++----- 3 files changed, 41 insertions(+), 41 deletions(-) diff --git a/C/cblas/asum_model.v b/C/cblas/asum_model.v index fefb051..163f505 100644 --- a/C/cblas/asum_model.v +++ b/C/cblas/asum_model.v @@ -1,22 +1,22 @@ (** * LAProof.C.cblas.asum_model: functional model of GSL's [cblas_dasum]. *) (** ** Corresponds to C program [C/cblas/src/dasum.c] (ported from GSL cblas). *) -(** Sibling of [LAProof.C.cblas.ddot_model]: the partial-accumulation *model - function* the [cblas_dasum] loop invariant tracks, with the *start* / *step* - / *end* lemmas about it. GSL's kernel ([source_asum_r.h]) is +(** This file provides [asum_model], the partial accumulation that the + [cblas_dasum] loop invariant tracks, with the *step* lemma (how one loop + iteration extends the accumulation) and the *end* lemma (the result of the + accumulation once the loop finishes). GSL's kernel ([source_asum_r.h]) is << double r = 0.0; for (i = 0; i < N; i++) { r += fabs(X[ix]); ix += incX; } return r; >> - a forward accumulation of [|X[i]|] with separate abs-then-add, accumulator - first ([BPLUS acc |x|]), starting from [+0.0]. The C [fabs] call computes - [BABS] (the result of VSTlib's [fabs_spec], whose [ff_func] is [BABS]), so the - model uses [BABS]. + i.e., it adds [|X[i]|] left to right into the accumulator ([BPLUS acc |x|]), + starting from [+0.0]. The C [fabs] computes [BABS] (the result of VSTlib's + [fabs_spec], whose [ff_func] is [BABS]), so the model uses [BABS]. - The matching LAProof accuracy model is [sum_model.sumF] applied to the list + The matching LAProof accuracy model is [sum_model.sumF] applied to [map BABS X] (the absolute values); over it [sum_acc.fSUM] proves the - forward-error bound. As with [ddot], the C's operand order ([BPLUS acc x] + forward-error bound. The C's operand order ([BPLUS acc x] vs [sumF]'s [BPLUS x acc]) and initial accumulator ([+0.0] = [Zconst Tdouble 0] vs [sumF]'s [neg_zero]) differ, so the end/bridge lemma relates the two by [feq] (IEEE-equality identifying [±0]/NaN). *) @@ -37,7 +37,7 @@ Set Bullet Behavior "Strict Subproofs". (** [asum_model X] is the value the C loop *literally* computes: a left fold with the accumulator as the first [BPLUS] operand, separate abs-then-add, starting - from [+0.0]. *) + from [+0.0]. This mirrors the Clight AST in [dasum.v] exactly. *) Definition asum_loop (l: list (ftype Tdouble)) : ftype Tdouble := fold_left (fun acc x => BPLUS acc (BABS x)) l (Zconst Tdouble 0). @@ -49,9 +49,9 @@ Lemma asum_loop_snoc: forall l x, asum_loop (l ++ [x]) = BPLUS (asum_loop l) (BABS x). Proof. intros. unfold asum_loop. rewrite fold_left_app. reflexivity. Qed. -(** *step*: extending the length-[k] prefix by its [k]-th element adds one - [BABS] term, with the accumulator as the first [BPLUS] operand -- exactly the - Clight statement [r = r + fabs(X[k])]. *) +(** *step*: extending the length-[k] prefix [X[0..k-1]] with the element [X[k]] + adds one [BABS] term, with the accumulator as the first [BPLUS] operand -- + exactly the Clight statement [r = r + fabs(X[k])]. *) Lemma asum_model_step: forall (X: list (ftype Tdouble)) k, 0 <= k < Zlength X -> asum_model (sublist 0 (k+1) X) diff --git a/C/cblas/ddot_model.v b/C/cblas/ddot_model.v index 4207787..ad41b49 100644 --- a/C/cblas/ddot_model.v +++ b/C/cblas/ddot_model.v @@ -1,30 +1,29 @@ (** * LAProof.C.cblas.ddot_model: functional model of GSL's [cblas_ddot]. *) (** ** Corresponds to C program [C/cblas/src/ddot.c] (ported from GSL cblas). *) -(** This file plays the same role for [cblas_ddot] that [LAProof.C.sparse_model] - plays for the sparse routines: it defines the partial-accumulation *model - function* the loop invariant tracks, together with the *start* / *step* / - *end* lemmas about it (cf. [partial_row]/[partial_row_start]/ - [partial_row_next]/[partial_row_end]). No VST [funspec]s or Clight ASTs - appear here -- those live in [spec_ddot] and [verif_ddot]. - - GSL's loop (after macro expansion of [source_dot_r.h]) is +(** This file provides [ddot_model], the partial accumulation that the + [cblas_ddot] loop invariant tracks, with the *step* lemma (how one loop + iteration extends the accumulation) and the *end* lemma (the result of the + accumulation once the loop finishes). No VST [funspec]s or Clight ASTs + appear here -- those live in [spec_ddot] and [verif_ddot]. GSL's loop + (after macro expansion of [source_dot_r.h]) is << double r = 0.0; for (i = 0; i < N; i++) { r += X[ix] * Y[iy]; ix += incX; iy += incY; } return r; >> - i.e. a forward, left-to-right accumulation using *separate* multiply-then-add - (not fused multiply-add), starting from +0.0. That is exactly the [dotprodF] - model of [LAProof.accuracy_proofs.dotprod_model] - ([dotprodF = dotprod BMULT BPLUS pos_zero]), over which the forward-error - theorem [dot_acc.dotprod_forward_error] is proved -- *not* the FMA-based - [floatlib.dotprod]. + i.e., it adds [X[i]*Y[i]] left to right into the accumulator + ([BPLUS acc prod]), using separate multiply-then-add (not fused + multiply-add), starting from [+0.0]. - Note on operand order: the C statement [r += X*Y] computes [BPLUS acc prod] - (accumulator first), whereas [dotprodF]'s fold step is [BPLUS prod acc] - (product first). IEEE addition is commutative *up to [feq]*, hence the - end/bridge lemma below relates the two by [feq]. *) + The matching LAProof accuracy model is [dotprodF] + ([dotprodF = dotprod BMULT BPLUS pos_zero]) of + [LAProof.accuracy_proofs.dotprod_model]; over it + [dot_acc.dotprod_forward_error] proves the forward-error bound -- *not* the + FMA-based [floatlib.dotprod]. The C's operand order ([BPLUS acc prod] vs + [dotprodF]'s [BPLUS prod acc]) and initial accumulator ([+0.0] = + [Zconst Tdouble 0] vs [dotprodF]'s [pos_zero]) differ, so the end/bridge + lemma relates the two by [feq] (IEEE addition is commutative up to [feq]). *) Require Import VST.floyd.proofauto. Require Import vcfloat.VCFloat. @@ -43,8 +42,7 @@ Set Bullet Behavior "Strict Subproofs". (** [ddot_model X Y] is the value the C loop *literally* computes: a left fold with the accumulator as the first [BPLUS] operand, separate multiply then - add, starting from [+0.0]. This mirrors the Clight AST in [ddot.v] exactly, - which makes it the natural value to carry in the loop invariant. *) + add, starting from [+0.0]. This mirrors the Clight AST in [ddot.v] exactly. *) Definition ddot_loop (xy: list (ftype Tdouble * ftype Tdouble)) : ftype Tdouble := fold_left (fun acc p => BPLUS acc (BMULT (fst p) (snd p))) xy (Zconst Tdouble 0). @@ -65,9 +63,9 @@ Lemma ddot_loop_snoc: forall xy p, ddot_loop (xy ++ [p]) = BPLUS (ddot_loop xy) (BMULT (fst p) (snd p)). Proof. intros. unfold ddot_loop. rewrite fold_left_app. reflexivity. Qed. -(** *step*: extending both length-[k] prefixes by their [k]-th element adds one - [BMULT] term, with the accumulator as the first [BPLUS] operand -- exactly - the Clight statement [r = r + X[k]*Y[k]]. (cf. [partial_row_next]) *) +(** *step*: extending both length-[k] prefixes [X[0..k-1]]/[Y[0..k-1]] with the + elements [X[k]]/[Y[k]] adds one [BMULT] term, with the accumulator as the + first [BPLUS] operand -- exactly the Clight statement [r = r + X[k]*Y[k]]. *) Lemma ddot_model_step: forall (X Y: list (ftype Tdouble)) k, Zlength X = Zlength Y -> 0 <= k < Zlength X -> ddot_model (sublist 0 (k+1) X) (sublist 0 (k+1) Y) diff --git a/C/cblas/scal_model.v b/C/cblas/scal_model.v index 9d7df4b..c8ea503 100644 --- a/C/cblas/scal_model.v +++ b/C/cblas/scal_model.v @@ -1,14 +1,16 @@ (** * LAProof.C.cblas.scal_model: functional model of GSL's [cblas_dscal]. *) (** ** Corresponds to C program [C/cblas/src/dscal.c] (ported from GSL cblas). *) -(** Unlike the reduction routines ([ddot]/[dasum]), scaling is a deterministic - *elementwise* update with no accumulation, so the C computes the model - exactly (no [feq] needed). GSL's kernel ([source_scal_r.h]) is +(** This file provides [scal_model], the elementwise scaling that [cblas_dscal] + computes. Scaling is a deterministic *elementwise* update with no + accumulation, so there is no loop invariant tracking a running accumulator + and no [feq] bridge -- the C program computes the model exactly. GSL's + kernel ([source_scal_r.h]) is << for (i = 0; i < N; i++) { X[ix] *= alpha; ix += incX; } >> - and [X[ix] *= alpha] is [BMULT (X[ix]) alpha] (element-first), so the model - is the C-faithful list map below. + i.e., [X[ix] *= alpha] is [BMULT (X[ix]) alpha] (element-first), so the model + is the list map below, matching what the C program computes. LAProof's existing scale model [mv_mathcomp.scalemx] is alpha-first ([map_mx (BMULT a)]) and matrix-based; relating to it would need a [BMULT] From dc8a8ff22f4b3f0d2270f6e03ef5dd287bdf1346 Mon Sep 17 00:00:00 2001 From: Ariel Kellison Date: Sun, 31 May 2026 20:33:13 -0400 Subject: [PATCH 5/8] Nits: verif docstrings --- C/cblas/verif_dasum.v | 10 +++++----- C/cblas/verif_ddot.v | 13 +++++++------ C/cblas/verif_dscal.v | 15 ++++++++------- 3 files changed, 20 insertions(+), 18 deletions(-) diff --git a/C/cblas/verif_dasum.v b/C/cblas/verif_dasum.v index 4471d55..0938c66 100644 --- a/C/cblas/verif_dasum.v +++ b/C/cblas/verif_dasum.v @@ -3,11 +3,11 @@ (** [body_cblas_dasum] proves that the compiled C [cblas_dasum] (unit stride) refines LAProof's summation model: its result is [feq]-equal to - [sumF (map BABS X)]. Sibling of [verif_ddot]; the loop invariant carries the - partial accumulation [asum_model (sublist 0 k X)], with the per-step / start / - bridge reasoning in [LAProof.C.cblas.asum_model]. The only structural - additions over [ddot] are an [fabs] [forward_call] in the loop body and the - dead [if (incX<=0) return 0;] guard (unreachable since [incX = 1]). *) + [sumF (map BABS X)]. The loop invariant carries the partial accumulation + [asum_model (sublist 0 k X)]; the step and bridge reasoning that discharges + it lives in [LAProof.C.cblas.asum_model]. The loop body makes an [fabs] + [forward_call], and a dead [if (incX<=0) return 0;] guard precedes the loop + (unreachable since [incX = 1]). *) Require Import VST.floyd.proofauto. From vcfloat Require Import FPStdCompCert FPStdLib. diff --git a/C/cblas/verif_ddot.v b/C/cblas/verif_ddot.v index 29db101..1ac765a 100644 --- a/C/cblas/verif_ddot.v +++ b/C/cblas/verif_ddot.v @@ -1,12 +1,13 @@ -(** * LAProof.C.cblas.verif_ddot: VST proof scaffold for GSL's [cblas_ddot]. *) +(** * LAProof.C.cblas.verif_ddot: VST proof for GSL's [cblas_ddot]. *) (** ** Corresponds to C program [C/cblas/src/ddot.c]. *) (** [body_cblas_ddot] proves that the compiled C [cblas_ddot] (unit stride) - refines the [dotprodF] accuracy model, [feq]-equal to its result. The proof - mirrors [LAProof.C.verif_densemat_mult.body_densematn_dotprod]: a - [forward_for_simple_bound] whose invariant carries the partial accumulation - [ddot_model (sublist 0 k X) (sublist 0 k Y)], with the per-step / start / - bridge reasoning factored into the lemmas of [LAProof.C.cblas.ddot_model]. *) + refines the [dotprodF] accuracy model: its result is [feq]-equal to + [dotprodF X Y]. The loop invariant carries the partial accumulation + [ddot_model (sublist 0 k X) (sublist 0 k Y)]; the step and bridge reasoning + that discharges it lives in [LAProof.C.cblas.ddot_model]. Two dead [OFFSET] + guards (the [incX>0]/[incY>0] conditionals) precede the loop, unreachable + since [incX = incY = 1]. *) Require Import VST.floyd.proofauto. From vcfloat Require Import FPStdCompCert FPStdLib. diff --git a/C/cblas/verif_dscal.v b/C/cblas/verif_dscal.v index 262f313..2651116 100644 --- a/C/cblas/verif_dscal.v +++ b/C/cblas/verif_dscal.v @@ -3,10 +3,9 @@ (** [body_cblas_dscal] proves that after the compiled C [cblas_dscal] (unit stride) the array holds exactly [scal_model alpha X] (= [map (fun x => BMULT - x alpha) X]). First in-place routine in this directory: a writable array - with a store in the loop, whose invariant is "scaled prefix ++ original - suffix"; the per-step [upd_Znth]/[sublist] identity is discharged by - [list_solve]. *) + x alpha) X]). This is an in-place routine: a writable array with a store in + the loop, whose invariant is "scaled prefix ++ original suffix"; the per-step + [upd_Znth]/[sublist] identity is discharged by [list_solve]. *) Require Import VST.floyd.proofauto. From vcfloat Require Import FPStdCompCert FPStdLib. @@ -79,6 +78,8 @@ forward_for_simple_bound (Zlength X) Qed. (** Payoff: each element [BMULT (Znth k X) alpha] is the correctly-rounded - product, so its relative error vs the exact product is bounded by the unit - roundoff (vcfloat's [BMULT] accuracy lemma); there is no accumulation, hence - no global/no-overflow hypothesis as in the reduction routines. *) + product, so when that product is finite its relative error vs the exact + product is bounded by the unit roundoff (vcfloat's [BMULT] accuracy lemma). + A product can still overflow, but the finiteness (no-overflow) requirement is + per element -- each [X[k]*alpha] independently -- rather than a single global + hypothesis over an accumulation. *) From ee9e8f385c7598d5a9b94d604f9286c86d85108e Mon Sep 17 00:00:00 2001 From: Ariel Kellison Date: Mon, 1 Jun 2026 07:34:42 -0400 Subject: [PATCH 6/8] doc: update C proof documentation --- C/cblas/src/dasum.c | 8 ++++++++ C/cblas/src/ddot.c | 8 ++++++++ C/cblas/src/dscal.c | 8 ++++++++ C/cblas/src/source_asum_r.h | 9 +++++++++ C/cblas/src/source_dot_r.h | 8 ++++++++ C/cblas/src/source_scal_r.h | 8 ++++++++ index.html | 16 ++++++++++++++++ 7 files changed, 65 insertions(+) diff --git a/C/cblas/src/dasum.c b/C/cblas/src/dasum.c index 8610871..ec02c99 100644 --- a/C/cblas/src/dasum.c +++ b/C/cblas/src/dasum.c @@ -1,3 +1,11 @@ +//ldoc on +/** + * # `dasum.c`: GSL CBLAS real double-precision sum of absolute values + * + * Thin wrapper that instantiates the generic kernel `source_asum_r.h` at type + * `double`. Verified in `LAProof.C.cblas.verif_dasum` against the functional + * model `LAProof.C.cblas.asum_model`. + */ /* The following GSL headers are commented out for clightgen: they are unresolvable here (GSL's gsl/ symlink dir is not generated) and contribute nothing to cblas_dasum's body beyond the libc fabs, which we obtain directly diff --git a/C/cblas/src/ddot.c b/C/cblas/src/ddot.c index e1f5e10..45e8c19 100644 --- a/C/cblas/src/ddot.c +++ b/C/cblas/src/ddot.c @@ -1,3 +1,11 @@ +//ldoc on +/** + * # `ddot.c`: GSL CBLAS real double-precision dot product + * + * Thin wrapper that instantiates the generic kernel `source_dot_r.h` at type + * `double`. Verified in `LAProof.C.cblas.verif_ddot` against the functional + * model `LAProof.C.cblas.ddot_model`. + */ /* The following GSL headers are commented out for clightgen: they are unused by cblas_ddot's body (only OFFSET/INDEX from "cblas.h" are needed) and are unresolvable here because GSL's gsl/ symlink dir is not generated. */ diff --git a/C/cblas/src/dscal.c b/C/cblas/src/dscal.c index 242ad30..4b2e0e6 100644 --- a/C/cblas/src/dscal.c +++ b/C/cblas/src/dscal.c @@ -1,3 +1,11 @@ +//ldoc on +/** + * # `dscal.c`: GSL CBLAS real double-precision scaling (in place) + * + * Thin wrapper that instantiates the generic kernel `source_scal_r.h` at type + * `double`. Verified in `LAProof.C.cblas.verif_dscal` against the functional + * model `LAProof.C.cblas.scal_model`. + */ /* The following GSL headers are commented out for clightgen: they are unresolvable here (GSL's gsl/ symlink dir is not generated) and contribute nothing to cblas_dscal's body (only INDEX from "cblas.h" is needed; the loop diff --git a/C/cblas/src/source_asum_r.h b/C/cblas/src/source_asum_r.h index 863619e..65a3515 100644 --- a/C/cblas/src/source_asum_r.h +++ b/C/cblas/src/source_asum_r.h @@ -17,6 +17,15 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */ +//ldoc on +/** + * ## Accumulation kernel (`source_asum_r.h`) + * + * A forward, left-to-right accumulation of `fabs(X[ix])` into `r`, starting + * from `+0.0`. The `incX <= 0` guard returns `0` for nonpositive stride; with + * unit stride (`incX = 1`) it is dead and `ix` advances one element per + * iteration. + */ { BASE r = 0.0; INDEX i; diff --git a/C/cblas/src/source_dot_r.h b/C/cblas/src/source_dot_r.h index b83f85e..0775c77 100644 --- a/C/cblas/src/source_dot_r.h +++ b/C/cblas/src/source_dot_r.h @@ -17,6 +17,14 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */ +//ldoc on +/** + * ## Accumulation kernel (`source_dot_r.h`) + * + * A forward, left-to-right accumulation with separate multiply-then-add (not + * fused). With unit stride (`incX = incY = 1`) the `OFFSET` conditionals are + * `0` and the indices `ix`, `iy` advance one element per iteration. + */ { ACC_TYPE r = INIT_VAL; INDEX i; diff --git a/C/cblas/src/source_scal_r.h b/C/cblas/src/source_scal_r.h index 81d2300..4bc0fe9 100644 --- a/C/cblas/src/source_scal_r.h +++ b/C/cblas/src/source_scal_r.h @@ -17,6 +17,14 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */ +//ldoc on +/** + * ## Scaling kernel (`source_scal_r.h`) + * + * An in-place elementwise update `X[ix] *= alpha` with no accumulation. The + * `incX <= 0` guard returns early for nonpositive stride; with unit stride + * (`incX = 1`) it is dead and `ix` advances one element per iteration. + */ { INDEX i; INDEX ix = 0; diff --git a/index.html b/index.html index 6cfba60..db2e33a 100644 --- a/index.html +++ b/index.html @@ -62,6 +62,7 @@

C programs

  • bandmat.[c,h]:   Symmetric band matrix operations
  • sparse.[c,h]:   CSR (Compressed Sparse Row) matrix operations
  • build_csr.c:   CSR (Compressed Sparse Row) matrix construction +
  • ddot.c, dasum.c, dscal.c:   GSL CBLAS Level-1 vector routines

    Algorithm Accuracy Proofs

    @@ -163,6 +164,21 @@

    C Program Correctness Proofs

  • Verification of COO->CSR conversion
  • Verified Software Unit +
    BLAS Level-1 (GSL CBLAS): ddot.c, dasum.c, dscal.c +