Epic: ION fast-sync overlay — trustless anchor bootstrap + content-availability gossip #111

@LiranCohen

Problem

A fresh ion-node's bootstrap cost is dominated by downloading every full block from ION activation (mainnet height 667,000) to the tip and scanning each one for OP_RETURN anchors. BIP158 basic filters exclude OP_RETURN outputs, so neutrino/compact-filters can't help. That's hundreds of GB and, worse, much of it is no longer readily available: most Bitcoin peers prune, so historical full-block serving is poor and rate-limited. ION also has a second availability weakness: anchored content can disappear from IPFS when it is unpinned or garbage-collected (we observe this live as pendingRetryable > 0).

This epic builds an ION-native overlay so nodes can discover each other and bootstrap a new node from a peer's already-computed anchor index + content — fast, and resilient to both Bitcoin pruning and IPFS unpinning.

The trust reframe (what's verifiable vs. trusted)

An anchor decomposes into three claims with very different trust properties:

| Claim | Trustless? | How |
|---|---|---|
| Anchor tx exists at block H, index I | ✅ yes, cheap | Merkle inclusion proof against the node's own PoW-verified header chain (~12 hashes + raw tx) |
| Anchored content (Sidetree files) is authentic | ✅ yes, trivial | CID is the content hash: content-addressed, self-verifying |
| The peer sent all anchors (no omissions) | ❌ no | Would require seeing every tx in every block, i.e. the download we're avoiding |

So a peer cannot forge a DID — inclusion proofs + content-addressing make it impossible, and the node applies its own validity rules (#99 op-count, #69 writer-drop) to the provided txs, so it is its own validity authority. The only attack is omission (hiding an anchor to fake non-existence or roll back an update/deactivate).

This is the same trust model the codebase already uses for --esplora-api: untrusted hint, verified against our own PoW chain, fails closed. Fast-sync generalizes it from one tx to the whole anchor set.
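
To make the first claim concrete, here is a minimal Go sketch of computing and verifying a Bitcoin merkle branch. It is an illustration under stated assumptions, not the planned internal/chain API: the names `Hash`, `BuildBranch` and `VerifyBranch` are hypothetical, `rawTx` is assumed to be the serialization that hashes to the txid (no witness data), and the merkle root is assumed to come from a header the node has already PoW-verified.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Hash is a 32-byte double-SHA256 digest in Bitcoin's internal byte order.
type Hash [32]byte

// doubleSHA256 is the hash Bitcoin uses for txids and merkle nodes.
func doubleSHA256(b []byte) Hash {
	first := sha256.Sum256(b)
	return sha256.Sum256(first[:])
}

// hashPair hashes the concatenation left||right into the parent node.
func hashPair(l, r Hash) Hash {
	return doubleSHA256(append(l[:], r[:]...))
}

// BuildBranch returns the sibling hashes from leaf to root for the tx at
// index, plus the merkle root. Assumes 0 <= index < len(txids).
func BuildBranch(txids []Hash, index int) (branch []Hash, root Hash) {
	level := append([]Hash(nil), txids...)
	for len(level) > 1 {
		if len(level)%2 == 1 {
			// Bitcoin duplicates the last node of an odd-length level.
			level = append(level, level[len(level)-1])
		}
		branch = append(branch, level[index^1])
		next := make([]Hash, 0, len(level)/2)
		for i := 0; i < len(level); i += 2 {
			next = append(next, hashPair(level[i], level[i+1]))
		}
		level = next
		index >>= 1
	}
	return branch, level[0]
}

// VerifyBranch recomputes the root from rawTx and its branch and compares
// it with the root taken from the node's own header chain.
func VerifyBranch(rawTx []byte, index int, branch []Hash, root Hash) bool {
	h := doubleSHA256(rawTx)
	for _, sibling := range branch {
		if index&1 == 1 {
			h = hashPair(sibling, h) // we are the right child
		} else {
			h = hashPair(h, sibling) // we are the left child
		}
		index >>= 1
	}
	// A leftover index means the claimed position does not fit the branch.
	return index == 0 && h == root
}

func main() {
	raw := [][]byte{[]byte("tx0"), []byte("tx1"), []byte("anchor-tx")}
	txids := make([]Hash, len(raw))
	for i, r := range raw {
		txids[i] = doubleSHA256(r)
	}
	branch, root := BuildBranch(txids, 2)
	fmt.Println(VerifyBranch(raw[2], 2, branch, root))           // true
	fmt.Println(VerifyBranch([]byte("forged"), 2, branch, root)) // false
}
```

The branch length is the tree depth, which is where the "~12 hashes" figure comes from for a block with a few thousand transactions. A production verifier would also need to consider known merkle-tree quirks, such as the duplicated last node, which this sketch does not defend against.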

Omission defenses (layered)

  1. Trust-then-verify (primary). Serve immediately but mark answers provisional (published:false — the existing late-publishing model), and run the slow full-block scan in the background, reconciling. A discovered omission → ban the peer, correct state. When background scan catches up → drop provisional = fully trustless, just eventually.
  2. Union over diverse peers. Each anchor is inclusion-proven, so the union of several ION peers' anchor sets is safe to take — omission then needs all of them to collude on the same anchor. Reuse netgroup diversity.
  3. Always full-verify the tip. Recent blocks are downloaded in full anyway (live anchor detection), so the trust window is historical only.
  4. Random sampling audits. Fully download a random sample of historical blocks and check the peer reported the same anchors — bounds how much a peer can omit before being caught.
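
A rough way to quantify defense 4, as a sketch rather than a tuned policy: assume there are $N$ historical blocks, the peer omitted at least one anchor in $m$ of them, and the node audits $s$ blocks chosen uniformly at random without replacement. The symbols $N$, $m$ and $s$ are introduced here for illustration only.

$$
P_{\text{detect}} \;=\; 1 - \frac{\binom{N-m}{s}}{\binom{N}{s}} \;\ge\; 1 - \left(1 - \frac{m}{N}\right)^{s}
$$

For the special case $m = 1$ this reduces to $P_{\text{detect}} = s/N$. With illustrative round numbers $N = 200{,}000$ and $s = 2{,}000$, a peer that omits in $m = 100$ blocks is caught with probability at least about $0.63$, while a single targeted omission ($m = 1$) is caught with probability only $0.01$. Sampling therefore bounds bulk omission cheaply, and the background full scan in defense 1 remains what closes the gap for a single hidden anchor.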

The trust anchor under all of it is the node's own PoW header chain (synced from diverse peers, hardened by #1/#4). The overlay can only omit, never forge — so eclipse of the overlay degrades to "slow/incomplete," not "wrong."
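
Defenses 1 and 2 can be sketched together in Go. This is a simplified illustration, not the project's code: `AnchorKey`, `Union` and `Omitters` are hypothetical names, every anchor is assumed to have already passed inclusion-proof verification, and every peer is assumed to have claimed coverage of the same block range that the local scan has completed.

```go
package main

import (
	"fmt"
	"sort"
)

// AnchorKey identifies an anchor by block height and tx index in the block.
type AnchorKey struct {
	Height  uint32
	TxIndex uint32
}

// Union merges the anchor sets reported by several peers. Taking the union
// is safe only because each anchor is individually inclusion-proven.
func Union(byPeer map[string][]AnchorKey) map[AnchorKey]struct{} {
	out := make(map[AnchorKey]struct{})
	for _, anchors := range byPeer {
		for _, a := range anchors {
			out[a] = struct{}{}
		}
	}
	return out
}

// Omitters returns, in sorted order, the peers whose reported set lacks at
// least one anchor that the local full-block scan found. These are the
// candidates for a ban once the background scan has covered the range.
func Omitters(byPeer map[string][]AnchorKey, scanned []AnchorKey) []string {
	var bad []string
	for peer, anchors := range byPeer {
		have := make(map[AnchorKey]struct{}, len(anchors))
		for _, a := range anchors {
			have[a] = struct{}{}
		}
		for _, a := range scanned {
			if _, ok := have[a]; !ok {
				bad = append(bad, peer)
				break
			}
		}
	}
	sort.Strings(bad)
	return bad
}

func main() {
	a1 := AnchorKey{Height: 667000, TxIndex: 5}
	a2 := AnchorKey{Height: 667010, TxIndex: 2}
	byPeer := map[string][]AnchorKey{
		"peerA": {a1, a2},
		"peerB": {a1}, // omits a2
	}
	fmt.Println(len(Union(byPeer)))                    // 2
	fmt.Println(Omitters(byPeer, []AnchorKey{a1, a2})) // [peerB]
}
```

In this toy run the union still contains the anchor that peerB hid, because peerA reported it; the later scan then identifies peerB as the omitter.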

Locked decisions

  • Trust posture: eventually-fully-trustless via trust-then-verify (matches the provisional model).
  • Goal: both speed and content availability; build the foundational trustless slice (verifiable anchor bundles) first.
  • Scope: bootstrap pull first; architecture leaves room for ongoing anchor/content gossip between running nodes.

Architecture / work breakdown (child issues)

  1. Merkle inclusion proofs (internal/chain) — the pure primitive: compute + verify a Bitcoin merkle branch against a header's merkle root. No deps. [first]
  2. Verifiable anchor bundles (store + indexer) — retain the merkle branch (+ anchor tx + its first-input prevout tx + that prevout's branch) at scan time, so a node can later serve a self-contained, PoW-verifiable bundle (~10–20 MB for the whole index). Opt-in "serve fast-sync."
  3. ION overlay discovery & handshake (p2p) — an ION service bit (advertised in version/addrv2), a capability handshake gating the custom messages, and bootstrap/seed nodes (don't rely on vanilla Bitcoin gossip alone).
  4. Fast-sync wire protocol (p2p) — getanchors/anchors (range → verifiable bundles) and getcas/cas (CID → content blob).
  5. Fast-sync client + provisional serving + background reconciliation — pull bundles, verify inclusion against own headers, apply own validity rules, project as provisional; background full-scan reconciles & drops provisional; omission → ban.
  6. CAS content gossip (anti-unpinning) — serve/fetch content-addressed blobs between ION peers, self-verifying by CID; cures IPFS unavailability. Arguably the biggest standalone win.
  7. Preferential two-tier peering — a preferred ION-peer pool layered over a diverse Bitcoin-peer pool, without shrinking header-chain diversity (the eclipse trust anchor). Two pools, two purposes.

Fit / prior art

No anchor-gossip overlay exists in reference ION (it re-scans Bitcoin), so this is net-new. It reuses the codebase's existing trust philosophy (untrusted-hint-verified-against-PoW), the provisional/published:false model, and the netgroup-diversity peer logic.

Child issues will be linked below.
