Reify nfs when fs synced #1208

Open
AloeareV wants to merge 43 commits into dev from reify_NFS_when_FS_synced

@AloeareV

Builds atop #1150

zancas and others added 30 commits May 26, 2026 13:36
…snapshot

  Pins the target invariant for the lazy-NonFinalizedState collapse (#1096):
  best_chaintip must read the chain tip from the non-finalized snapshot in
  every availability state, never via a validator passthrough.

  The test constructs the cold-start ChainIndexSnapshot::StillSyncingFinalizedState
  variant and asserts best_chaintip reports the real chain tip. It fails today:
  that variant has no snapshot tip, so the StillSyncingFinalizedState arm round-
  trips to the validator and returns the finalized floor (validator_finalized_height)
  instead of the tip — the same fallible path that can surface database_hole.

  After #1096 there is no still-syncing variant; the always-present snapshot
  carries best_tip, read directly with no validator call, and this test will be
  rewritten against a real snapshot_nonfinalized_state() result.

  This is a failing-on-purpose driver, not a surviving characterization test —
  it pins cold-start shape precisely to drive that variant's elimination, so the
  churn it incurs at the fix is intended. The module doc flags it as the one
  exception to this file's "don't pin the still-syncing variant" rule.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  Groundwork for the non-finalized-state rework: introduce the block
  representation the NFS will hold once it is built eagerly from the
  validator, ahead of wiring it into the snapshot.

  A non-finalized block cannot carry absolute cumulative chainwork until the
  finalized DB catches up to the seam — the validator does not expose
  chainwork (both backends drop the field) and zaino accumulates it in the
  FS. So:

  - ProvisionalBlock: the NFS's block. Mirrors IndexedBlock's payload but has
    no absolute chainwork field; it carries cumulative work measured relative
    to the seam (header-derived) and becomes an IndexedBlock via
    into_indexed() once the seam's absolute base is known. Its parent_hash is
    documented untrusted while provisional — the linkage is unvalidated until
    the seam connects.

  - ProvisionalCumulativeWork(ChainWork): a type distinct from absolute
    ChainWork, so passing relative work where absolute is required — or into
    an IndexedBlock's chainwork field — is a compile error rather than a
    naming convention. This is the misattribution guard.

  - block_to_provisional_block / provisional_block_from_parts build it by
    reusing the BlockWithMetadata extractors, now pub(crate): block_work,
    extract_block_data, extract_transactions, create_commitment_tree_data. The
    header-work step is factored into block_work() and shared with the
    absolute path. IndexedBlock's shape is unchanged.
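
The misattribution guard described above can be illustrated with a small newtype sketch. This is not the crate's code: the `u128` payload, the `seam`/`extend` helpers, and `requires_absolute` are assumptions made for illustration, and only the type names and the idea of resolving against a seam base come from the commit messages on this page.

```rust
/// Stand-in for absolute cumulative chainwork (u128 is an illustrative payload).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct ChainWork(pub u128);

/// Work accumulated relative to the seam. A distinct type, so it cannot be
/// passed where absolute `ChainWork` is expected.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct ProvisionalCumulativeWork(ChainWork);

impl ProvisionalCumulativeWork {
    /// The seam anchor has accumulated no work relative to itself.
    pub fn seam() -> Self {
        Self(ChainWork(0))
    }

    /// Add one block's header-derived work (hypothetical helper).
    pub fn extend(self, block_work: ChainWork) -> Self {
        let Self(ChainWork(relative)) = self;
        Self(ChainWork(relative + block_work.0))
    }

    /// Convert to absolute work once the seam's absolute base is known.
    pub fn resolve(self, base: ChainWork) -> ChainWork {
        let Self(ChainWork(relative)) = self;
        ChainWork(base.0 + relative)
    }
}

/// Hypothetical consumer that accepts absolute work only.
pub fn requires_absolute(work: ChainWork) -> u128 {
    work.0
}
```

With these definitions, `requires_absolute(ProvisionalCumulativeWork::seam())` is rejected by the compiler with a type mismatch; the relative value has to pass through `resolve` first.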

  Not yet wired into NonfinalizedBlockCacheSnapshot (which still stores
  IndexedBlock), so these items are currently unused; the storage swap and the
  sync/query rewrite land in the following commit.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  Wire the validator-sourced, in-memory-only ProvisionalBlock through the
  non-finalized state, replacing IndexedBlock as the NFS's block type. The NFS
  now tracks work *relative to the seam* and never depends on absolute
  chainwork, which is unavailable until the finalized DB catches up.

  NFS internals (non_finalised_state.rs):
  - NonfinalizedBlockCacheSnapshot.blocks: HashMap<_, ProvisionalBlock>.
  - from_initial_block seeds the seam anchor via ProvisionalBlock::from_indexed_seam
    (relative work = 0); add_block/add_block_new_chaintip take ProvisionalBlock.
  - The Block reorg-builder trait gains to_provisional_block (replacing
    to_indexed_block); impl moves from IndexedBlock to ProvisionalBlock.
  - sync, handle_reorg, add_nonbest_block build provisional blocks; update's
    best-tip max_by_key keys on provisional_cumulative_work.
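
A minimal sketch of best-tip selection keyed on seam-relative work, as the last bullet describes. The struct fields, the `u128` work type, and the free function `best_tip` are assumptions for illustration; only `max_by_key` and the `provisional_cumulative_work` key are taken from the commit message.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the NFS block; field types are illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProvisionalBlock {
    pub hash: [u8; 32],
    pub height: u32,
    /// Cumulative work measured relative to the seam.
    pub provisional_cumulative_work: u128,
}

/// Pick the block with the most seam-relative work.
/// Returns `None` for an empty map. With equal work the winner depends on
/// map iteration order, so a real implementation may need a tie-break.
pub fn best_tip(blocks: &HashMap<[u8; 32], ProvisionalBlock>) -> Option<&ProvisionalBlock> {
    blocks.values().max_by_key(|b| b.provisional_cumulative_work)
}
```

Because every block in the window measures work from the same seam, comparing relative values orders the tips the same way absolute chainwork would.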

  Query surface (chain_index.rs):
  - NonFinalizedSnapshot::get_chainblock_by_* return &ProvisionalBlock; the trait
    is now pub(crate) (in-crate only, not re-exported).
  - ProvisionalBlock::to_compact_block and IndexedBlock::to_compact_block share
    one chainwork-free assembly, compact_block_from_parts (no duplication).
  - get_indexed_block_by_*/blocks_containing_transaction unify the finalized
    (IndexedBlock) and non-finalized (ProvisionalBlock) layers behind a ChainBlock
    view exposing hash/height/data.

  Interim, resolved by the follow-up Availability step (#1096): ChainBlock and the
  get_indexed_block_by_* returns become IndexedBlock once the FS is fully synced
  (Resolved); ProvisionalBlock::into_indexed / ProvisionalCumulativeWork::resolve
  are the resolution bridge and are currently unused. The red-driver test stays
  intentionally failing until best_chaintip reads best_tip.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  After the NFS switched to ProvisionalBlock storage, three items have no
  remaining callers:

  - block_to_chainblock: the old IndexedBlock sync-builder, superseded by
    block_to_provisional_block.
  - ProvisionalBlock::into_indexed / ProvisionalCumulativeWork::resolve: the
    forward Provisional -> absolute-chainwork conversion. It has no consumer
    until the NFS query surface resolves to IndexedBlock once the finalized
    state is fully synced (the Availability step, #1096); re-added there. The
    inverse, from_indexed_seam, stays (it builds the seam anchor and is live).
  - the now-unused BlockContext import (only into_indexed referenced it).

  No behavior change.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e step

  will key on. It rides inside NonfinalizedBlockCacheSnapshot so it flips
  atomically with the CAS (compare-and-swap) that publishes the snapshot —
  resolution state and block contents become visible in one indivisible step.

  - SnapshotAvailability::Provisional { validator_finalized_height }: the
    finalized DB has not reached the seam; the window's prev-hash linkage is
    unvalidated and its blocks have no absolute chainwork, so finalized-range
    reads fall back to the validator up to validator_finalized_height.
  - SnapshotAvailability::Resolved { cumulative_chainwork_base }: the finalized
    DB has reached the seam; the seam block's absolute cumulative work is the
    base from which any window block's absolute chainwork derives
    (absolute = base + relative).

  Lifecycle: from_initial_block seeds Provisional; update() flips to Resolved
  exactly when the trimmed window still contains a block at the finalized-DB tip
  (seam overlap), stamping that FS block's absolute chainwork as the base —
  otherwise stays Provisional. The flip is computed before the CAS
  (compare-and-swap), so it publishes with the snapshot.
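
The two variants and the flip rule can be sketched as follows. This is an illustration under assumptions: heights are `u32`, chainwork is `u128`, seam overlap is checked by height rather than by hash, and `next_availability` and `absolute_chainwork` are hypothetical helper names.

```rust
/// Availability carried inside the snapshot (payload types are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SnapshotAvailability {
    Provisional { validator_finalized_height: u32 },
    Resolved { cumulative_chainwork_base: u128 },
}

impl SnapshotAvailability {
    /// absolute = base + relative, available only when Resolved.
    pub fn absolute_chainwork(&self, relative: u128) -> Option<u128> {
        match self {
            Self::Resolved { cumulative_chainwork_base } => {
                Some(*cumulative_chainwork_base + relative)
            }
            Self::Provisional { .. } => None,
        }
    }
}

/// Decide the availability of the next snapshot before it is published.
/// `window_heights`: heights present in the trimmed window.
/// `fs_tip`: finalized-DB tip as (height, absolute chainwork), if any.
pub fn next_availability(
    window_heights: &[u32],
    fs_tip: Option<(u32, u128)>,
    validator_finalized_height: u32,
) -> SnapshotAvailability {
    match fs_tip {
        // Seam overlap: the window still holds a block at the finalized-DB tip.
        Some((height, chainwork)) if window_heights.contains(&height) => {
            SnapshotAvailability::Resolved { cumulative_chainwork_base: chainwork }
        }
        _ => SnapshotAvailability::Provisional { validator_finalized_height },
    }
}
```

Computing the value first and storing it in the snapshot that gets swapped in is what lets block contents and resolution state become visible together.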

  Foundation only: the availability fields are written but not yet read (a
  transient dead_code "field never read"), consumed by the following
  enum-collapse step that makes the query surface availability-aware.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…te now

  always exists; a snapshot always carries blocks plus a SnapshotAvailability
  saying whether the finalized DB has caught up to its seam.

  Data model:
  - ChainIndexSnapshot: enum -> struct { non_finalized_snapshot }. The
    availability rides inside NonfinalizedBlockCacheSnapshot, so it flips with the
    snapshot's CAS (compare-and-swap) publish in `update` — resolution state and
    block contents become visible in one indivisible step.
  - SnapshotAvailability { Provisional { validator_finalized_height } | Resolved }.
    from_initial_block seeds Provisional; update() flips to Resolved when the
    trimmed window still contains a block at the finalized-DB tip (seam overlap).
    Resolved is a unit variant for now — the seam's absolute-chainwork base (and
    its reader) return with the resolution-promotion step.
  - non_finalized_state: Arc<ArcSwapOption<NFS>> -> Arc<NFS>, constructed eagerly
    in new_with_sync_timings. The sync loop's lazy `.expect("todo")` init and its
    dead `network` captures are gone; it just calls nfs.sync(...).
  - snapshot_nonfinalized_state is now infallible (the NFS is always present).

  Accessors / consumers:
  - get_nfs_snapshot() -> &NonfinalizedBlockCacheSnapshot (non-optional);
    is_resolved(); availability(); resolved_nfs_snapshot() -> Option<&...> (Some
    only when Resolved) for the "bail unless authoritative" call sites.
  - The ~16 match arms are availability-keyed: Resolved reads the NFS, Provisional
    keeps the validator passthrough. Arms that returned None *only* because the
    old design lacked an NFS (get_compact_block, get_mempool_height) now serve
    from the always-present NFS in both states; the finalized-gap remains the
    #1066 passthrough site.
  - best_chaintip reads best_tip from the snapshot directly in every state — no
    validator round-trip, no database_hole. The former #1096 RED driver is
    rewritten as a passing regression (best_chaintip_derives_tip_from_nfs_snapshot).
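
The accessor shape listed above might look like the sketch below. The accessor names and the enum-to-struct collapse come from the commit message; the snapshot's fields, the `new` constructor, and `best_chaintip_height` are simplified assumptions.

```rust
/// Unit `Resolved`, as in this commit; the seam base returns in a later step.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SnapshotAvailability {
    Provisional { validator_finalized_height: u32 },
    Resolved,
}

/// Simplified snapshot: only the fields this sketch needs.
#[derive(Debug)]
pub struct NonfinalizedBlockCacheSnapshot {
    pub best_tip_height: u32,
    pub availability: SnapshotAvailability,
}

/// A struct rather than an enum: the NFS snapshot is always present.
#[derive(Debug)]
pub struct ChainIndexSnapshot {
    non_finalized_snapshot: NonfinalizedBlockCacheSnapshot,
}

impl ChainIndexSnapshot {
    pub fn new(non_finalized_snapshot: NonfinalizedBlockCacheSnapshot) -> Self {
        Self { non_finalized_snapshot }
    }

    /// Non-optional access to the snapshot.
    pub fn get_nfs_snapshot(&self) -> &NonfinalizedBlockCacheSnapshot {
        &self.non_finalized_snapshot
    }

    pub fn availability(&self) -> SnapshotAvailability {
        self.non_finalized_snapshot.availability
    }

    pub fn is_resolved(&self) -> bool {
        matches!(self.availability(), SnapshotAvailability::Resolved)
    }

    /// `Some` only when Resolved, for "bail unless authoritative" call sites.
    pub fn resolved_nfs_snapshot(&self) -> Option<&NonfinalizedBlockCacheSnapshot> {
        if self.is_resolved() {
            Some(&self.non_finalized_snapshot)
        } else {
            None
        }
    }

    /// Reads the tip from the snapshot in every state (hypothetical helper).
    pub fn best_chaintip_height(&self) -> u32 {
        self.non_finalized_snapshot.best_tip_height
    }
}
```

Call sites that must not serve provisional data use `resolved_nfs_snapshot()?`, while tip reads go through `get_nfs_snapshot()` unconditionally.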

  The pub ChainIndex trait's surface references the in-crate ChainBlock /
  NonFinalizedSnapshot types; that lint is #[allow]ed (interim) rather than
  narrowing the public trait, since it's a re-exported API reachable by crates
  outside this workspace.

  Deferred: resolution promotion — re-add the seam base + into_indexed so
  get_indexed_block_by_*/ChainBlock return IndexedBlock when Resolved.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lization ceiling (#1096)

  The non-finalized state (NFS) is now validator-sourced and always leads the
  finalized DB. Instead of being seeded from genesis and re-seeded from the
  finalized-DB tip, it anchors its window at the finalization ceiling
  (best_tip - NON_FINALIZED_DEPTH), fetching the seam block from the source, so
  it only ever walks the non-finalized window and never depends on finalized-DB
  progress. The sync loop runs nfs.sync before fs.sync_to_height; the finalized
  DB catches up to the ceiling, and a snapshot is Provisional until it does.

  Boundary helpers:
  - Replace the `max_serviceable_height` snapshot method and the
    `validator_finalized_height`-carrying `SnapshotAvailability::Provisional`
    field with two free functions: `finalization_ceiling(best_tip) -> Height`
    and `is_finalized(best_tip, height) -> bool`. Remove `finalized_height_floor`
    (subsumed by `finalization_ceiling`).
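
A minimal sketch of the two free functions, assuming `Height` is a plain `u32`, that the subtraction saturates at genesis, and that `NON_FINALIZED_DEPTH` is 100 (a value consistent with the "synced to 150, seam 50" harness mentioned later on this page, but not confirmed here).

```rust
/// Assumed reorg-window depth; the real constant lives in the crate.
pub const NON_FINALIZED_DEPTH: u32 = 100;

/// Highest height considered finalized for a given best tip.
/// Saturates at 0 so short chains anchor at genesis (an assumption).
pub fn finalization_ceiling(best_tip: u32) -> u32 {
    best_tip.saturating_sub(NON_FINALIZED_DEPTH)
}

/// A height is finalized when it is at or below the ceiling.
pub fn is_finalized(best_tip: u32, height: u32) -> bool {
    height <= finalization_ceiling(best_tip)
}
```

Both functions depend only on the best tip, which is why they can replace a per-snapshot field and a snapshot method.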

  Query surface:
  - Restore reorg-safe validator passthrough for the catch-up gap in
    get_block_height / get_block_hash / find_fork_point / get_transaction_status:
    serve NFS-own-data ∪ finalized DB first, then passthrough for heights at or
    below the finalization ceiling (is_finalized), never above it.
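
The lookup order described above (local data first, validator passthrough only for finalized heights) can be sketched like this. The maps, the closure standing in for the validator, the `ServedBy` enum, and the depth value are all illustrative assumptions, not the crate's types.

```rust
use std::collections::HashMap;

pub type Hash = [u8; 32];

/// Assumed reorg-window depth (illustrative value).
const NON_FINALIZED_DEPTH: u32 = 100;

/// Which layer answered the lookup.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ServedBy {
    Nfs,
    FinalizedDb,
    ValidatorPassthrough,
}

fn is_finalized(best_tip: u32, height: u32) -> bool {
    height <= best_tip.saturating_sub(NON_FINALIZED_DEPTH)
}

/// Serve from the NFS window or the finalized DB first; fall back to the
/// validator only for heights at or below the finalization ceiling.
pub fn get_block_hash(
    height: u32,
    best_tip: u32,
    nfs: &HashMap<u32, Hash>,
    finalized_db: &HashMap<u32, Hash>,
    validator: impl Fn(u32) -> Option<Hash>,
) -> Option<(Hash, ServedBy)> {
    if let Some(hash) = nfs.get(&height) {
        return Some((*hash, ServedBy::Nfs));
    }
    if let Some(hash) = finalized_db.get(&height) {
        return Some((*hash, ServedBy::FinalizedDb));
    }
    if is_finalized(best_tip, height) {
        // Catch-up gap: finalized on the validator, not yet in the finalized DB.
        return validator(height).map(|hash| (hash, ServedBy::ValidatorPassthrough));
    }
    // Above the ceiling the validator is never consulted: its answer could be reorged.
    None
}
```

Restricting the passthrough to finalized heights is what keeps it reorg-safe: an answer from the validator for a height inside the window could disagree with the snapshot.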

  update() / listener:
  - Trim the window to the seam (finalization_ceiling(best_tip)) rather than the
    lagging finalized-DB tip, so the floor advances with the chain (also fixes
    the #1126 eviction regression test).
  - Skip sub-seam blocks in handle_nfs_change_listener; they are finalized, and
    processing them recursed past genesis and errored with MissingBlockError.

  Tests:
  - Add a `BlockchainSource::finalized_sync_cap` hook (default: no cap) so the
    proptest mock holds the finalized DB below the ceiling deterministically,
    replacing the per-call sleep with a gate. The passthrough_* tests run with no
    artificial delays and assert every block is served (NFS window ∪ passthrough
    gap). The mock's get_commitment_tree_roots no longer rebuilds the commitment
    tree from genesis per call (the O(N²) hashing the routing tests don't need).

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ommitment roots

  Add `resolved_snapshot_serves_every_block` (non_finalised_state.rs), the
  Resolved counterpart to the Provisional `passthrough_*` tests. On the
  real-vector harness (Active mode → synced to 150, seam 50) the snapshot
  resolves with a genuine finalized-DB ∪ NFS-window split, and the test asserts
  every height 0..=150 round-trips through get_block_hash / get_block_height /
  find_fork_point with no gap.

  It lives on the real-vector source, not the proptest mock: the proptest mock
  generates arbitrary blocks that aren't a valid UTXO chain, so a real finalized
  sync over them fails the finalized DB's txout-set accumulator (and its
  commitment-tree checks). The proptest `passthrough_*` tests only work because
  they keep the finalized DB empty.

  Cache the proptest mock's `get_commitment_tree_roots`: the NFS sync calls it
  once per block while walking the window, and folding the Sapling/Orchard
  frontier from genesis on every call was O(N) hashing per call — O(N²) across
  the walk, the dominant cost of these tests. Precompute every block's roots in
  one incremental pass, keyed by hash, for O(1) lookups.
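
The caching change can be illustrated with a toy accumulator. `fold_root` is a stand-in for appending a block's commitments to the Sapling/Orchard frontier, and the `u64` root and function names are assumptions; only the "one incremental pass, keyed by hash" structure reflects the commit message.

```rust
use std::collections::HashMap;

pub type BlockHash = [u8; 32];
/// Toy root type; real roots are tree hashes.
pub type Root = u64;

/// Toy stand-in for folding one block's commitments into the running root.
fn fold_root(prev: Root, block_commitment: u64) -> Root {
    prev.wrapping_mul(31).wrapping_add(block_commitment)
}

/// Previous shape: refold from genesis on every call, O(N) per lookup.
pub fn root_from_genesis(chain: &[(BlockHash, u64)], upto: usize) -> Root {
    chain[..=upto]
        .iter()
        .fold(0, |acc, (_, commitment)| fold_root(acc, *commitment))
}

/// Cached shape: one incremental pass, then O(1) lookups by block hash.
pub fn precompute_roots(chain: &[(BlockHash, u64)]) -> HashMap<BlockHash, Root> {
    let mut acc: Root = 0;
    chain
        .iter()
        .map(|(hash, commitment)| {
            acc = fold_root(acc, *commitment);
            (*hash, acc)
        })
        .collect()
}
```

Both functions produce the same root for every block; the cached version pays the folding cost once instead of once per lookup.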

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ly when Resolved

  The LightWallet (fetch.rs) and State (state.rs) backends guarded chain-height
  reads on `resolved_nfs_snapshot()`, returning `UnavailableNotSyncedEnough`
  (or, in get_latest_block, matching `SnapshotAvailability`) while the snapshot
  was Provisional. The NFS is always present and validator-sourced (#1096), so
  read its `best_tip` directly via `get_nfs_snapshot()`:

  - state.rs: get_block_range, get_block_count, get_latest_block, get_block
    (Ok(None) and Err branches), and get_mempool_stream no longer error during
    Provisional.
  - fetch.rs: get_latest_block drops its `SnapshotAvailability` match; the six
    `resolved_nfs_snapshot()` guards become `get_nfs_snapshot()`; the now-unused
    `SnapshotAvailability` import is removed.
  - get_chain_tips (both backends) keeps `resolved_nfs_snapshot()` — it has a
    legitimate fallback to the RPC client when Provisional.

  zaino-testutils: the three `PollableTip::tip_height` impls return 0 on error
  instead of `.expect()`-panicking, so startup polling tolerates a not-yet-ready
  backend rather than crashing.
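
A sketch of the polling behavior described above, assuming a synchronous `PollableTip` trait and a hypothetical `MockBackend` and `wait_for_height` loop (the real trait in zaino-testutils may differ, for example by being async).

```rust
/// Simplified, synchronous stand-in for the test-utility trait.
pub trait PollableTip {
    fn tip_height(&self) -> u64;
}

/// Hypothetical backend whose height query can fail while starting up.
pub struct MockBackend {
    pub response: Result<u64, String>,
}

impl PollableTip for MockBackend {
    fn tip_height(&self) -> u64 {
        // Report 0 instead of panicking, so a not-yet-ready backend
        // just looks like "no blocks yet" to the poller.
        match &self.response {
            Ok(height) => *height,
            Err(_) => 0,
        }
    }
}

/// Poll up to `max_polls` times; true once the tip reaches `target`.
pub fn wait_for_height(backend: &impl PollableTip, target: u64, max_polls: usize) -> bool {
    (0..max_polls).any(|_| backend.tip_height() >= target)
}
```

The trade-off is that a persistent error is indistinguishable from an empty chain, so the poller needs its own bound (here `max_polls`) to avoid waiting forever.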

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@AloeareV AloeareV force-pushed the reify_NFS_when_FS_synced branch from 59b7424 to 27cc7c1 Compare June 19, 2026 18:12
zancas added a commit that referenced this pull request Jun 22, 2026
…ceiling

Both the init path and sync()'s re-anchor used to seed/re-anchor the NFS from genesis (on a finalised-DB miss) or from the lagging finalised tip, so a far-behind cold start crawled the whole chain one block at a time toward the tip and never converged (#1261). They now seed the seam block from the source at the finalization ceiling (chain_tip - NON_FINALIZED_DEPTH) via a shared get_seam_indexed_block helper, so the NFS holds only the reorg window and never waits for the finalised DB. Adds a narrow regression test asserting initialize(.., None) anchors at the ceiling, not genesis.

Approach mirrors the reify_NFS_when_FS_synced draft (#1208).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
zancas added a commit that referenced this pull request Jun 22, 2026
…ceiling

Behaviour-preserving rename to the name (and param) used by the reify_NFS_when_FS_synced draft (#1208), so the function and its call sites converge with it on merge instead of conflicting. Adopts that draft's clearer 'ceiling of the finalized chain' framing in the doc, but keeps the correct reorg-asymmetry note (#1128) rather than the draft's inaccurate 'monotonically increases' claim. Adds CONTEXT.md recording the canonical term.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@AloeareV AloeareV marked this pull request as ready for review July 1, 2026 20:43
@zancas zancas requested a review from idky137 July 2, 2026 06:03
3 participants