Explore kernel durability hardening (WAL record headers, fsync-batching mode, expanded DST fault profiles) #9

Description

@cevheri

Summary

Exploratory issue to scope kernel durability hardening — improvements to the
write-ahead log, the durability contract, and the deterministic-simulation
testing (DST) harness — that go a step beyond what the distribution-channels work
(#6) deliberately left untouched.

Context: after #6 the kernel (src/core.ts) is small, runtime-agnostic (zero
node: imports), 100% line/function/statement covered, and its crash/recovery is
already exercised by the DST harness in src/sim/. The WAL is the database
(no separate data file; see ARCHITECTURE.md), with each committed transaction
written as a length-framed, CRC-32'd record and fsync'd before the commit is
exposed. The ideas below are about making that foundation more robust and more
evolvable without betraying the manifesto: core.ts stays small because it is
genuinely minimal, never because complexity was swept elsewhere.

This is exploratory and not a commitment to build all (or any) of it. Each
candidate must earn its place against the comprehension budget.

Candidate directions

1. Versioned / magic record (or file) header for the WAL

Today a record is [u32 payloadLength][u32 crc32(payload)][payload] with no magic
number and no format version. A small magic + version header would let recovery:

  • Detect a wrong/foreign/corrupt file early with a clear error, instead of
    misparsing arbitrary bytes as a length-framed record.
  • Evolve the on-disk format forward-compatibly (new record kinds/fields,
    alternative codecs) by branching on a version, rather than being frozen.

Open questions: per-file header vs per-record version; how recovery reacts to an
unknown/newer version (refuse vs best-effort); byte overhead vs the "WAL is the
database" minimalism; migration story for files written by today's headerless
format.
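
To make the trade-off concrete, here is a minimal sketch of a per-file header check placed in front of today's record framing. The magic value, the version number, the little-endian byte order, and the names `encodeFileHeader`, `checkFileHeader`, and `encodeRecord` are all assumptions made for illustration; none of them are taken from core.ts or ARCHITECTURE.md.

```typescript
// Sketch only: MAGIC, FORMAT_VERSION, and the little-endian layout are
// assumptions, not the format used by core.ts.
const MAGIC = 0x4c444257; // hypothetical placeholder value
const FORMAT_VERSION = 1; // hypothetical
const HEADER_SIZE = 8; // [u32 magic][u32 version]

// Standard CRC-32 (reflected polynomial 0xEDB88320), bitwise variant.
function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i];
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc >>> 1) ^ (0xedb88320 & -(crc & 1));
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// [u32 payloadLength][u32 crc32(payload)][payload], the framing described above.
function encodeRecord(payload: Uint8Array): Uint8Array {
  const out = new Uint8Array(8 + payload.length);
  const view = new DataView(out.buffer);
  view.setUint32(0, payload.length, true);
  view.setUint32(4, crc32(payload), true);
  out.set(payload, 8);
  return out;
}

function encodeFileHeader(version: number = FORMAT_VERSION): Uint8Array {
  const out = new Uint8Array(HEADER_SIZE);
  const view = new DataView(out.buffer);
  view.setUint32(0, MAGIC, true);
  view.setUint32(4, version, true);
  return out;
}

type HeaderCheck =
  | { ok: true; version: number }
  | { ok: false; reason: "too-short" | "bad-magic" | "unsupported-version" };

// Recovery would call this before parsing any record. This sketch refuses
// unknown versions; best-effort parsing is the open alternative.
function checkFileHeader(file: Uint8Array): HeaderCheck {
  if (file.length < HEADER_SIZE) return { ok: false, reason: "too-short" };
  const view = new DataView(file.buffer, file.byteOffset, file.byteLength);
  if (view.getUint32(0, true) !== MAGIC) return { ok: false, reason: "bad-magic" };
  const version = view.getUint32(4, true);
  if (version !== FORMAT_VERSION) return { ok: false, reason: "unsupported-version" };
  return { ok: true, version };
}
```

In this sketch a file written by today's headerless format starts with a payload length, so it is reported as `bad-magic` rather than misparsed. Whether such a file should instead be accepted or migrated is exactly the open question above.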

2. A documented fsync-batching (group-commit) durability mode

The kernel fsyncs on every commit: correct and simple, but per-commit fsync
caps write throughput. A standard database answer is group commit: let several
commits share one fsync, trading a bounded, explicitly documented window of
durability for throughput (cf. Postgres synchronous_commit = off, SQLite WAL
mode with synchronous = NORMAL).

The point of this issue is as much documentation as implementation: if such a
mode exists it must be opt-in, with the exact durability guarantee spelled out
(what a crash can lose, and when), and it must not quietly weaken the default. It
also has to stay honest with the synchronous, single-process model.

Open questions: API surface (an open() option? a fence/flush() call?); the
precise crash semantics; interaction with #1 (does reordering/batching change what
recovery must tolerate?); whether the default stays fsync-per-commit (it should).
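
One possible shape for the opt-in mode, sketched under stated assumptions: the `DurableFile` interface, the `Wal` class, the `maxUnsynced` option, and the `flush()` fence are hypothetical names, not the API of core.ts. The sketch stays synchronous and single-process, keeps fsync-per-commit as the default, and makes the loss window explicit as a count of unsynced commits.

```typescript
// Sketch only: DurableFile, Wal, and the option names are hypothetical and do
// not describe the API of core.ts.
interface DurableFile {
  append(bytes: Uint8Array): void;
  fsync(): void;
}

type Durability =
  | { mode: "per-commit" } // default: fsync before the commit is exposed
  | { mode: "batched"; maxUnsynced: number }; // opt-in, bounded loss window

class Wal {
  private readonly file: DurableFile;
  private readonly durability: Durability;
  private unsynced = 0;

  constructor(file: DurableFile, durability: Durability = { mode: "per-commit" }) {
    this.file = file;
    this.durability = durability;
  }

  // Returns true if the commit is durable when this call returns.
  commit(record: Uint8Array): boolean {
    this.file.append(record);
    this.unsynced++;
    const d = this.durability;
    if (d.mode === "per-commit" || this.unsynced >= d.maxUnsynced) {
      this.flush();
      return true;
    }
    return false;
  }

  // Explicit fence: everything committed so far is durable afterwards.
  flush(): void {
    if (this.unsynced === 0) return;
    this.file.fsync();
    this.unsynced = 0;
  }

  // Upper bound on the commits a crash could lose right now.
  get atRisk(): number {
    return this.unsynced;
  }
}

// In-memory stand-in that only counts calls, for illustration.
class CountingFile implements DurableFile {
  appends = 0;
  fsyncs = 0;
  append(_bytes: Uint8Array): void {
    this.appends++;
  }
  fsync(): void {
    this.fsyncs++;
  }
}
```

With `maxUnsynced: 4`, a crash in this sketch can lose at most three acknowledged commits, and `flush()` reduces that to zero. A real design would still have to say whether a count, a byte budget, or only an explicit fence bounds the window.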

3. Expanded DST fault profiles

The DST harness already tortures recovery under a simulated crashing filesystem.
Extend it with more realistic fault profiles so we prove (not assume) what the
format and recovery survive:

  • Partial writes: a write lands fewer bytes than asked, torn at an arbitrary
    offset, not just a clean tail.
  • Operation reordering: writes/fsyncs reach durable storage out of order
    (especially relevant if #2 introduces batching).
  • Possibly: bit-flip / corruption (CRC-32 should catch; quantify), delayed or
    dropped fsync, and faults injected mid-recovery.

Open questions: which faults are realistic for the real backends we target
(Node/Bun fs, OPFS in a Worker, future adapters); which the current format
already survives vs which motivate #1/#2; keeping the harness deterministic and
fast.
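
As a rough illustration of two of these profiles, the sketch below injects a torn write and a single bit flip into an in-memory log and checks what a recovery scan accepts. Everything here is hypothetical (`makeRng`, `tornAppend`, `flipBit`, `recover`, little-endian fields); the real harness lives in src/sim/ and may be structured quite differently. The seeded generator is what keeps the fault choice deterministic and replayable.

```typescript
// Sketch only: all names are hypothetical; this is not the src/sim/ harness.

// Standard CRC-32 (reflected polynomial 0xEDB88320), bitwise variant.
function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i];
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc >>> 1) ^ (0xedb88320 & -(crc & 1));
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// [u32 payloadLength][u32 crc32(payload)][payload]; little-endian is assumed.
function encodeRecord(payload: Uint8Array): Uint8Array {
  const out = new Uint8Array(8 + payload.length);
  const view = new DataView(out.buffer);
  view.setUint32(0, payload.length, true);
  view.setUint32(4, crc32(payload), true);
  out.set(payload, 8);
  return out;
}

// Deterministic PRNG (linear congruential generator, Numerical Recipes
// constants) so a failing seed replays exactly.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) >>> 0;
    return state / 0x100000000;
  };
}

function concat(a: Uint8Array, b: Uint8Array): Uint8Array {
  const out = new Uint8Array(a.length + b.length);
  out.set(a, 0);
  out.set(b, a.length);
  return out;
}

// Fault: only a strict prefix of the write reaches storage.
function tornAppend(log: Uint8Array, record: Uint8Array, rng: () => number): Uint8Array {
  const landed = Math.floor(rng() * record.length); // 0 .. length - 1 bytes
  return concat(log, record.subarray(0, landed));
}

// Fault: flip one bit at a given byte offset, leaving the input untouched.
function flipBit(log: Uint8Array, offset: number, bit: number): Uint8Array {
  const out = new Uint8Array(log);
  out[offset] ^= 1 << bit;
  return out;
}

// Recovery scan: accept records until the first incomplete or CRC-failing one.
function recover(log: Uint8Array): Uint8Array[] {
  const view = new DataView(log.buffer, log.byteOffset, log.byteLength);
  const payloads: Uint8Array[] = [];
  let pos = 0;
  while (pos + 8 <= log.length) {
    const length = view.getUint32(pos, true);
    const expected = view.getUint32(pos + 4, true);
    const end = pos + 8 + length;
    if (end > log.length) break; // torn tail
    const payload = log.subarray(pos + 8, end);
    if (crc32(payload) !== expected) break; // corruption
    payloads.push(payload);
    pos = end;
  }
  return payloads;
}
```

A property such a profile could assert is that a torn final record never hides the records committed before it, for every seed tried. The bit-flip case shows the flip side: this scan stops at the first bad record, so corruption in the middle of the log also discards every later record.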

Non-goals

  • Growing core.ts for its own sake. Each change must reduce risk and keep the
    kernel comprehensible — the metric is comprehension time, not line count.
  • Cryptographic tamper-resistance. CRC-32 is error detection, not an integrity
    guarantee against a writer who already has file access (out of the threat model).
  • Anything that compromises the zero-runtime-dependency or
    embedded/single-process posture, or that weakens the default durability.

Constraints (carried from the project)

  • 100% line/function/statement coverage; deterministic tests (the gate stays green).
  • Conventional commits; English-only; no emoji.
  • Changesets for any user-facing change (the published package / public API / types).
  • Decisions that touch core.ts are guarded-core changes: heavy review, and they
    must not contradict docs/DESIGN.md without explicitly reopening the decision.

Required first step: detailed research before any implementation

Before starting any development, a thorough investigation is mandatory. Given that
LibreDB is an embedded, FoundationDB-style architecture (one small ordered
key-value core with thin model lenses on top), the research must establish, for
each candidate above, which durability mechanisms genuinely belong in a database
like this and which would add complexity the design refuses. The research should
map each candidate to how well it fits the embedded, single-process,
zero-dependency design — and what it costs in comprehension, the kernel's real
budget — before we commit to building anything. Study the prior art closely
(SQLite WAL, FoundationDB, Postgres group commit, libSQL/Turso's DST practice).
The deliverable is a design note (in the spirit of the #6 research doc under
docs/), reviewed and agreed, before any code is written.

Suggested order

  1. Research note covering all three directions (feasibility, fit, cost, prior art).
  2. #1 (record/file header): smallest, enables the rest and improves diagnostics.
  3. #3 (DST fault profiles): so any durability change is provable.
  4. #2 (fsync-batching mode): highest risk; only after #1 + #3 make it safe to reason about.

Related: builds on the WAL/recovery described in ARCHITECTURE.md and the locked
decisions in docs/DESIGN.md; sibling to the distribution-channels work in #6.
