Split data and control planes, implement daemon, works with smelt#1

Open
frrist wants to merge 8 commits into main from split-planes-edge-client-daemon
@frrist frrist commented Jun 3, 2026

Member

This branch turns ingot from an in-sprue prototype into a standalone,
Forge-native S3 gateway. It does four related things; the details live in the
docs linked below.

What's in this branch

  • Split the data and control planes. Object bodies (raw chunks) and the
    per-bucket MST catalog (MST nodes, manifests, chunk indexes) are now two
    independent pipelines — each an instance of one logstore.PlaneLog module
    under a thin Store coordinator, with its own seal trigger, ship transport, and
    retention window. So the catalog can be kept local-only while the data plane
    ships to Forge, or any other permutation.

  • Became a Forge edge client. The old "ingot orchestrates allocate/PUT/accept
    directly against piri" uploader is replaced by the guppy-style flow —
    /blob/add → PUT → /ucan/conclude → /blob/accept → /index/add against
    sprue — carried into forgeclient/ + tokenstore/ (ingot can't import
    guppy/sprue: guppy embeds ingot).

  • Added a daemon. A cobra/viper/fx ingot binary (cmd/): ingot serve in
    standalone (in-memory, no Forge) or forge (Postgres + sprue) mode, plus
    ingot login <email>, ingot space generate (self-provision), and whoami.
    Docker-native; runs as a smelt system.

  • Self-provisioning, no guppy. ingot login + ingot space generate --provision-to <email> mints/provisions/grants ingot's own space against sprue
    (mirrors guppy space generate), so the bootstrap no longer needs guppy.

Docs (start here)

  • README.md — what ingot is, deploy modes, how to run.
  • DESIGN_NOTES.md: how the whole system operates today (write/read paths, the two planes, the edge-client ship, identity/login/provision, and known gaps).
  • logstore/README.md — the two-plane log internals: lifecycle diagrams, crash recovery, config.
  • CLAUDE.md — building/testing + conventions for working in the repo.

Verification

  • GOWORK=off go build / vet / test ./... green; in-memory S3 smoke suite ~66 pass.
  • Live e2e: smelt's TestIngotNativeProvision cold-boots the Forge stack,
    drives ingot login + ingot space generate --provision-to (no guppy), and
    round-trips an S3 PUT/GET through sprue + piri + indexing-service on a
    self-provisioned space.

Known gaps (multipart, conditional requests, multi-tenancy, a shared forge-client
lib, GC) are tracked at the bottom of DESIGN_NOTES.md.

🤖 Generated with Claude Code

frrist added 6 commits June 3, 2026 12:15
1. Split the data plane (raw object-body chunks) from the catalog plane
   (dag-cbor MST nodes, manifests, chunk indexes) into two CAR files per
   segment, classified by CID codec at OpStaging.Put. Each plane ships and
   retires through an *independent* in-daemon pipeline with per-plane on/off
   + retention; a never-ship plane is retained on local disk forever.
   forge_root_cid advances only when the catalog plane ships. Postgres,
   the logstore segment/flush machinery, and recovery are reworked per
   plane; Log.AppendBatch and logstore.Meta change shape.

2. Replace the sprue-orchestrator uploader with a guppy-style edge client:
   /blob/add -> PUT -> /ucan/conclude -> /blob/accept -> /index/add against
   the upload service (sprue). Ports guppy's client + tokenstore into ingot
   as carried copies (forgeclient/, tokenstore/), reusing internal/ucanexec.
   Deletes the direct-piri allocate/accept path (piri.go/indexer.go/
   provider.go). uploader.SubmitShard ships one plane's CAR + its
   sharded-dag-index.

3. Add a cobra/viper/fx daemon (cmd/ingot) with serve|login|space|whoami and
   two modes: standalone (in-memory inmem package, both planes retained
   locally) and forge (Postgres + sprue + login). Dockerfile + smelt-ready.

Standalone mode verified end-to-end (live aws-s3 PUT/GET, two CARs on disk);
the forge network path is compiled but needs live piri+sprue+indexer.
- seedSpaceDelegations now self-issues space->agent delegations for
  /blob/allocate and /blob/accept (in addition to /blob/add, /index/add,
  /content/retrieve). The upload service re-invokes allocate/accept against
  piri on the space's behalf, so without these the shipped blob/add fails
  with "not issued by subject and has no proofs". Sentinel now keys on
  /blob/allocate so a store seeded before this upgrades on restart.
- Dockerfile ENTRYPOINT drops the "serve" subcommand; the compose `command:`
  supplies it (smelt house style), avoiding "ingot serve serve".

Verified end-to-end against the live smelt Forge stack: aws-s3 PUT ->
data/catalog blob/add + index/add -> aws-s3 GET served from the network
(indexer locate + ranged piri retrieve).
testing/external_test.go (build tag e2e):
- TestExternalVersity runs the versitygw integration Suite against an
  external INGOT_E2E_ENDPOINT (e.g. an ingot daemon deployed in the smelt
  Forge stack), with configurable creds/region/suite.
- TestInProcessVersityBaseline runs the same suite against the in-memory
  harness, so external/forge-mode results can be compared to a clean-ingot
  baseline.

Used to prove forge-mode ingot is S3-conformant to the same degree as the
in-memory build: the Smoke suite is 66 pass / 53 xfail in both. Excluded
from normal `go test` by the e2e tag.
…pace generate`)

ingot could already `login`, but provisioning a space still required shelling
out to `guppy space provision`. Mirror guppy's `space generate` so ingot needs
no guppy: `ingot space generate --provision-to <email>` mints/reuses
<DataDir>/space.key, provisions it to the account on sprue, and grants access.

forgeclient (carried copies from guppy/pkg/client):
- provideradd.go    ProviderAdd     (/provider/add)
- accessdelegate.go AccessDelegate  (/access/delegate)
- accounts.go       Accounts        (logged-in did:mailto accounts from the token store)
- spaces.go         SpaceNameMetadata for --name

cmd/space.go: rewrite `space generate` with guppy's flags
(--name/--grant-to/--provision-to/--output-key/-k, plus --force); resolve
accounts via pickAccount (errors if not logged in, like guppy); ProviderAdd then
grant() (space->agent + space->account Top delegations stored locally +
AccessDelegate to sprue). ingot keeps persisting space.key and reuses it unless
--force (single-space tool); no JSON output mode.
Document the LSM journal: the data/catalog plane split, on-disk layout
(seg-N.{data,cat}.car/.idx + shared .ops), and the open->append->seal->ship
->retain lifecycle. Includes a Mermaid sequenceDiagram (with the three-way
fsync durability barrier and the per-plane guppy-style ship) and a
stateDiagram-v2 (the two planes as concurrent regions), plus the invariants,
read tiers, crash-recovery reconciliation table, config, and a key-symbol map.
@frrist frrist changed the title Split planes and implement daemon Split data and control planes, implement daemon, works with smelt Jun 3, 2026
@frrist frrist self-assigned this Jun 3, 2026
…pipelines

The data and catalog planes were split inside a single shared Segment (two CARs,
one .ops, a shared seal trigger, one Meta row, one seq), so sealing and the
Size() trigger coupled the two planes. Extract a single-plane PlaneLog module and
make Store a thin coordinator over two of them, so each plane seals, ships, and
retains independently. This removes the segPlane / combined-Size() coupling and
sets up the data plane's transport to change later (e.g. co-located
direct-to-piri) without touching the catalog pipeline or the read path.

- logstore: new PlaneLog (per-plane LSM pipeline); Store becomes a coordinator
  whose AppendBatch fsyncs data BEFORE catalog (crash-safety) and routes Get by
  codec; Segment is single-plane (drops segPlane/planeRef and the combined Size).
- config: SealBytes/SealAge move into PlaneConfig; Config gains data_plane /
  catalog_plane yaml blocks with top-level defaults.
- Meta + schema: plane-scoped Meta interface + single-plane SegmentMeta; the
  segments table gains a `plane` column and collapses data_*/cat_* into
  size_bytes/sha256/shipped_at; one shared segment_seq (ids globally unique).
- on-disk: per-plane subdirs segments/{data,catalog}/seg-N.{car,idx,ops}.
- inmem.MemStore, registry/segments.go, server.go (per-plane ServerConfig +
  newPlaneFlushFunc), testing/harness.go updated; per-plane + independent-seal
  logstore tests added.

The write path (OpStaging / bucketop / s3frontend) is untouched: Store keeps
blockstore.Log.AppendBatch's signature.

Docs refreshed to describe the system as it operates today: DESIGN_NOTES.md
rewritten as a concise "how it works" (write/read paths, the two planes,
edge-client ship, identity/login/provision, known gaps); logstore/README.md
rewritten for PlaneLog; README.md + CLAUDE.md updated (library + daemon, two-plane
log, edge client, dropped ProviderSelector/HomeProvider).
@frrist frrist requested review from Peeja and alanshaw June 3, 2026 22:13

@alanshaw alanshaw left a comment

Member

Not finished with the review, but leaving the feedback so far as I have to dash.

Comment thread blockstore/log.go
//
// - PlaneData: the data plane — raw-codec object-body chunks. The
// actual bytes a client GETs.
// - PlaneCatalog: the control plane — the dag-cbor MST nodes,

Member

Suggested change
// - PlaneCatalog: the control plane — the dag-cbor MST nodes,
// - PlaneCatalog: the catalog plane — the dag-cbor MST nodes,

Comment thread blockstore/log.go
// bytes live and how to reconstruct an object.
//
// Block classification is by CID codec: cid.Raw → PlaneData, anything
// else (dag-cbor) → PlaneCatalog. See OpStaging.Put.

Member

Not sure about overloading the encoding codec for this...why not have a CatalogBlock and DataBlock?... or a type PlaneBlock struct { block.Block; Plane Plane }?
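
To make the two options in this thread concrete, here is a small sketch that puts them side by side. It uses only the standard library: the codec constants are the multicodec codes for raw (0x55) and dag-cbor (0x71), while planeForCodec and the fields of PlaneBlock are hypothetical simplifications (the suggested struct embeds block.Block, which is reduced here to a codec and a byte slice).

```go
package main

import "fmt"

// Multicodec codes for the two codecs mentioned in the comment under review.
const (
	codecRaw     uint64 = 0x55
	codecDagCBOR uint64 = 0x71
)

// Plane identifies which pipeline a block belongs to.
type Plane int

const (
	PlaneData Plane = iota
	PlaneCatalog
)

func (p Plane) String() string {
	switch p {
	case PlaneData:
		return "data"
	case PlaneCatalog:
		return "catalog"
	default:
		return "unknown"
	}
}

// planeForCodec is the implicit rule: raw goes to the data plane and every
// other codec goes to the catalog plane.
func planeForCodec(codec uint64) Plane {
	if codec == codecRaw {
		return PlaneData
	}
	return PlaneCatalog
}

// PlaneBlock is the explicit alternative: the producer tags each block, so
// the plane no longer depends on how the block happens to be encoded.
type PlaneBlock struct {
	Codec uint64 // stand-in for the embedded block.Block
	Data  []byte
	Plane Plane
}

func main() {
	fmt.Println(planeForCodec(codecRaw))     // data
	fmt.Println(planeForCodec(codecDagCBOR)) // catalog

	// With explicit tagging, a raw-codec block can still be routed to the
	// catalog plane if the caller decides it belongs there.
	b := PlaneBlock{Codec: codecRaw, Data: []byte("index bytes"), Plane: PlaneCatalog}
	fmt.Println(b.Plane) // catalog
}
```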

Comment thread cmd/login.go

Member

Instead of replicating a lot of the guppy functionality here, why not use the guppy CLI to log in and create a space, and then use guppy delegation create ... to delegate usage to Ingot? Then you only need an ingot proof add command here, and you get to remove a lot of code associated with logging in, claiming delegations, etc.

Member Author

For now I'd prefer to live with this duplication as it simplifies deployment coordination, namely in smelt. There is an implicit task in the DESIGN_NOTES.md wrt this:

Carried Forge-client copies: forgeclient/, tokenstore/,
blockstore/locator/, internal/ucanexec/ duplicate guppy/sprue code to stay
cycle-free (ingot must never import guppy/sprue — guppy embeds ingot). A shared
forge-client library would remove them.

My ideal end state here is a shared library we import in both ingot and guppy. Alright with you if we punt this until later?

Member

Should this be called sprueclient?

Member Author

Yeah that's probably a better name, will change. (This was initially called the "forge client" due to the service running within sprue)

Comment thread inmem/store.go
return nil
}

func (m *MemStore) SetForgeRoot(_ context.Context, name string, root cid.Cid) error {

Member

Can we not include "forge" in the name? What is the difference between CAS root and Forge root?

Member Author

Yes will drop forge from the name.
The CAS root is the current local root of the MST. It changes for each mutation of the tree/bucket.
The Forge root is the root of the MST that has been shipped to the network for storage. It changes only when local state is shipped.

I am not tied to these names, suggestions welcome.
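
A tiny sketch of the distinction described above, with strings standing in for cid.Cid. The type and method names (bucketRoots, mutate, catalogShipped, unshipped) are hypothetical and chosen only to show when each root moves.

```go
package main

import "fmt"

// bucketRoots tracks the two roots for one bucket.
type bucketRoots struct {
	Local   string // current local MST root; moves on every mutation
	Shipped string // last MST root shipped to the network
}

// mutate records a new local root after a write to the bucket.
func (b *bucketRoots) mutate(newRoot string) { b.Local = newRoot }

// catalogShipped records that the catalog up to root is now on the network.
func (b *bucketRoots) catalogShipped(root string) { b.Shipped = root }

// unshipped reports whether local state is ahead of the shipped state.
func (b *bucketRoots) unshipped() bool { return b.Local != b.Shipped }

func main() {
	var b bucketRoots
	b.mutate("root-1")
	b.mutate("root-2")
	fmt.Println(b.unshipped()) // true: two local mutations, nothing shipped

	b.catalogShipped("root-2")
	fmt.Println(b.unshipped()) // false: the shipped root caught up
}
```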

Comment thread DESIGN_NOTES.md
```

A PUT is **acked** once both planes are fsynced locally and the root CAS lands in
Postgres; it is **durable on Forge** only after the background ship. Blocks split

Member

> it is durable on Forge only after the background ship

Do you mean the MST data or the object data also?

Member Author

Currently all data. Right now both planes are buffered into CARs until they ship to Piri.

Comment thread DESIGN_NOTES.md

- `Store.AppendBatch` fsyncs **data before catalog**, so a crash never leaves a
durable catalog entry referencing non-durable data.
- **`forge_root_cid` advances only when the catalog plane ships** — catalog roots

Member

Why do we need to track this?

Comment thread DESIGN_NOTES.md
handler, sprue does; this is the step that lets `/blob/accept` resolve.
4. poll `/blob/accept` → the `/assert/location` commitment.
5. build a 1-shard sharded-dag-index, `/blob/add` it, then `/index/add` it (sprue
republishes to the indexing-service).

Member

Interested how this works for multipart. I feel it cannot be within SubmitShard.

Comment thread DESIGN_NOTES.md
→ stream body via blockstore.Layered:
data / catalog PlaneLog (local segments, newest-first)
→ blockstore.Cached (byte-bounded LRU, Config.ReadCacheBytes)
→ blockstore.Forge: indexer locate(digest) → ranged piri /content/retrieve

Member

This is where, for multipart we need the nodes property to exist in the index so that we can serve the shards in the right order.

https://github.com/fil-one/RFC/blob/main/rfcs/2026-04-forge-s3-flat-file-sharding-strategy.md
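
The tiered read in the excerpt above (local segments, then the cache, then the network) can be pictured as a simple fall-through. This is a sketch under assumptions: the tier interface, mapTier, and layered types are hypothetical, string keys stand in for block CIDs, and cache population on a network hit is left out.

```go
package main

import "fmt"

// tier is a hypothetical minimal read interface.
type tier interface {
	Get(key string) ([]byte, bool)
}

// mapTier is an in-memory tier used to stand in for each real backend.
type mapTier map[string][]byte

func (m mapTier) Get(key string) ([]byte, bool) {
	b, ok := m[key]
	return b, ok
}

// namedTier pairs a tier with a label so the caller can see who served a read.
type namedTier struct {
	name string
	src  tier
}

// layered tries each tier in order and stops at the first hit.
type layered []namedTier

func (l layered) Get(key string) ([]byte, string, bool) {
	for _, t := range l {
		if b, ok := t.src.Get(key); ok {
			return b, t.name, true
		}
	}
	return nil, "", false
}

func main() {
	store := layered{
		{name: "local segments", src: mapTier{"a": []byte("A")}},
		{name: "read cache", src: mapTier{"b": []byte("B")}},
		{name: "network", src: mapTier{"c": []byte("C")}},
	}
	for _, k := range []string{"a", "b", "c", "missing"} {
		if b, from, ok := store.Get(k); ok {
			fmt.Printf("%s: %q served by %s\n", k, b, from)
		} else {
			fmt.Printf("%s: not found\n", k)
		}
	}
}
```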

Comment thread DESIGN_NOTES.md
└─ OpStaging.Commit classify staged blocks by codec, then:
└─ logstore.Store.AppendBatch(dataBlocks, catalogBlocks, opRoot)
├─ data PlaneLog.Append → fsync segments/data/seg-N.car
└─ catalog PlaneLog.Append → fsync segments/catalog/seg-N.car + .ops

Member

I guess for MVP we are storing all mutations? Really shipping changes to the catalog should not be a shard fullness thing but a time/number of mutations based thing.
