Storage-primary Anvil release #7

Merged. zcourts merged 27 commits into main from internalised-meta on Jun 29, 2026.

zcourts commented on Jun 29, 2026

Contributor

Summary

This PR lands the storage-primary Anvil rewrite and prepares the release branch for publication. Anvil is now presented and packaged as an object-native storage platform where object bytes, metadata, indexes, watches, relationship authorisation, source artefacts, model artefacts, and PersonalDB witnessing are all part of the same storage system.

The release surfaces are deliberately split:

  • the Anvil server ships as a Docker image and release binaries;
  • the public Rust client ships as the anvil-storage crate;
  • documentation ships as a Fission static site through an independent GitHub Pages workflow;
  • GitHub releases are finalised after the Docker image, documentation site, and Rust client crate are published.

Fundamental Architecture Shift

The central change is the move away from using an external relational metadata store as Anvil's source of truth. Anvil-owned state is now modelled as object-native storage state: journals, manifests, signed records, segments, fences, watch logs, derived indexes, checkpoints, certificates, and repair evidence.

This gives Anvil a single persistence model for:

  • bucket and object metadata;
  • path and prefix navigation;
  • object versions and delete markers;
  • full text and vector index state;
  • relationship authorisation tuples and derived usersets;
  • watch cursors and replay windows;
  • PersonalDB groups, changesets, certificates, heads, snapshots, projections, and repair findings;
  • git source indexes, source pack records, model artefacts, Hugging Face ingestion records, and media extraction diagnostics.

High-Level Features In This Release

Object Storage

  • Bucket lifecycle APIs.
  • Object put/get/head/delete/copy/compose flows.
  • Object versions, checksums, range reads, metadata, and retained object state.
  • Multipart-compatible behaviour and retained upload state.
  • S3-compatible gateway backed by the same native metadata and authorisation model as gRPC.

Object-Native Metadata and Indexes

  • Storage-native metadata journals and directory segments.
  • Prefix navigation and metadata filtering.
  • Derived index records and diagnostic records.
  • Watch-backed maintenance paths for derived data.
  • Storage-native recovery evidence and compaction checkpoints.

Full Text Search

  • Full text segment storage.
  • Token and phrase query support.
  • Ranking and match metadata.
  • Authorisation-safe result filtering.
  • Derived content extraction paths for searchable payloads.

Vector Search

  • Rust-native vector segment storage.
  • HNSW graph validation and query paths.
  • Text, image, audio, and video modality support.
  • Authorisation filtering over vector candidates.
  • Query support over derived embeddings produced from stored objects.
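
One way to read "authorisation filtering over vector candidates" is that unauthorised hits are dropped before the result set is truncated, so they neither leak nor consume top-k slots. The sketch below illustrates that ordering only. `Candidate`, `authorised_top_k`, and the `can_read` callback are hypothetical names, and the real query path (HNSW traversal, tuple checks) is not shown.

```rust
/// Hypothetical nearest-neighbour candidate; not Anvil's real type.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub object_id: String,
    pub distance: f32,
}

/// Keep only candidates the caller may read, then return the `k` closest.
/// Filtering happens before truncation so that denied objects cannot
/// push permitted ones out of the result set.
pub fn authorised_top_k<F>(mut candidates: Vec<Candidate>, k: usize, can_read: F) -> Vec<Candidate>
where
    F: Fn(&str) -> bool,
{
    // Drop everything the caller is not allowed to see.
    candidates.retain(|c| can_read(&c.object_id));
    // Sort ascending by distance; total_cmp gives a total order over f32.
    candidates.sort_by(|a, b| a.distance.total_cmp(&b.distance));
    candidates.truncate(k);
    candidates
}
```

In practice the candidate pool would need to be larger than `k` for this to return a full page after filtering; how Anvil sizes that pool is not stated in this PR.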

Relationship Authorisation

  • Zanzibar-style namespace definitions and relationship tuples.
  • Caveat hash validation.
  • Derived userset indexes.
  • Watch streams for tuple and derived state.
  • Fail-closed reserved internal namespaces.
  • Permission checks across object APIs, S3, indexes, source artefacts, and PersonalDB APIs.
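
As a rough illustration of relationship tuples and fail-closed reserved namespaces, the sketch below stores direct tuples in a set and denies every check against a reserved namespace, whatever tuples exist. All names here (`Tuple`, `TupleStore`, the `_internal` prefix) are assumptions made for the example. Userset rewrites, caveats, and derived indexes are deliberately left out.

```rust
use std::collections::HashSet;

/// Hypothetical Zanzibar-style tuple: `namespace:object#relation@subject`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Tuple {
    pub namespace: String,
    pub object: String,
    pub relation: String,
    pub subject: String,
}

impl Tuple {
    pub fn new(namespace: &str, object: &str, relation: &str, subject: &str) -> Self {
        Self {
            namespace: namespace.to_string(),
            object: object.to_string(),
            relation: relation.to_string(),
            subject: subject.to_string(),
        }
    }
}

/// Assumed marker for reserved internal namespaces; the real naming is not given in this PR.
pub const RESERVED_PREFIX: &str = "_internal";

/// In-memory store of direct tuples only (no userset expansion).
pub struct TupleStore {
    tuples: HashSet<Tuple>,
}

impl TupleStore {
    pub fn new() -> Self {
        Self { tuples: HashSet::new() }
    }

    pub fn write(&mut self, tuple: Tuple) {
        self.tuples.insert(tuple);
    }

    /// External permission check. Reserved namespaces fail closed:
    /// the answer is always "deny", even if a matching tuple is stored.
    pub fn check(&self, tuple: &Tuple) -> bool {
        if tuple.namespace.starts_with(RESERVED_PREFIX) {
            return false;
        }
        self.tuples.contains(tuple)
    }
}
```

The point of the early return is that a missing or unexpected rule for an internal namespace results in denial rather than accidental access.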

Watch Streams

  • Durable cursor-based watch streams for object, bucket, index, authz, source, and PersonalDB state.
  • Watch checkpoints and replay windows.
  • Resume validation and stale cursor behaviour.
  • Watch-driven maintenance for derived state.
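
Resume validation against a replay window can be sketched as below. This is an assumption-laden model, not the Anvil implementation: it treats a cursor as the sequence number of the last event the client saw, and `ReplayWindow`, `Resume`, and their fields are hypothetical names.

```rust
/// Hypothetical outcome of validating a resume cursor.
#[derive(Debug, PartialEq, Eq)]
pub enum Resume {
    /// Replay events with sequence numbers greater than the cursor.
    From(u64),
    /// The next event after the cursor is no longer retained; the client must resynchronise.
    Stale { oldest_retained: u64 },
    /// The cursor is ahead of the log head, so this stream cannot have issued it.
    Invalid,
}

/// Hypothetical replay window: events `oldest..=head` are still retained.
pub struct ReplayWindow {
    pub oldest: u64,
    pub head: u64,
}

impl ReplayWindow {
    /// `cursor` is the sequence number of the last event the client processed.
    pub fn validate(&self, cursor: u64) -> Resume {
        if cursor > self.head {
            Resume::Invalid
        } else if cursor.saturating_add(1) < self.oldest {
            // Event `cursor + 1` has already been compacted away.
            Resume::Stale { oldest_retained: self.oldest }
        } else {
            Resume::From(cursor)
        }
    }
}
```

With a window of `10..=20`, a cursor of 9 can still resume (event 10 is retained), a cursor of 8 is stale, and a cursor of 21 is rejected.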

PersonalDB Witnessing

  • SQLite changeset validation and canonical envelope generation.
  • Commit certificate sealing and verification.
  • Group heads, snapshots, catch-up, and repair evidence.
  • Row metadata and row indexes.
  • Authorised projections and projection writeback.
  • Watch streams for PersonalDB groups and projections.

Source and Model Artefacts

  • Git pack/source index parsing and query support.
  • Source tree/blob lookup APIs.
  • Hugging Face model ingestion flows.
  • Model manifests, tensor metadata, and ingestion item state.
  • Media extraction records and diagnostics.

Rust Client

  • Public Rust client crate renamed and prepared as anvil-storage.
  • Public gRPC service clients exposed through the crate.
  • Internal node-to-node shard service and shard messages excluded from the public client package.
  • Bearer token metadata marked sensitive and redacted from debug output.
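
Redaction from debug output is typically done with a hand-written `Debug` implementation. The sketch below shows the pattern using only the standard library. `BearerToken` and `expose` are hypothetical names, not the anvil-storage API, and marking the gRPC metadata value itself as sensitive is not shown.

```rust
use std::fmt;

/// Hypothetical wrapper around a bearer token; not the crate's real type.
pub struct BearerToken(String);

impl BearerToken {
    pub fn new(token: impl Into<String>) -> Self {
        Self(token.into())
    }

    /// Explicit accessor, so reading the secret is always a deliberate call.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

// Manual Debug impl: a derived one would print the secret verbatim.
impl fmt::Debug for BearerToken {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("BearerToken(<redacted>)")
    }
}
```

Any struct that embeds such a wrapper and derives `Debug` inherits the redaction, which keeps tokens out of logs and panic messages.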

Documentation

  • README rewritten to match the current architecture and release surfaces.
  • Detailed release notes added under docs/releases/.
  • Fission documentation site checked and built from documentation/.
  • Independent GitHub Pages workflow added for docs-only publication.

CI and Publication

  • CI validates the publishable Rust client crate with cargo publish --dry-run -p anvil-storage.
  • CI builds the server workspace and Docker image.
  • Main-branch CI publishes the Docker image to GitHub Container Registry.
  • GitHub release creation is removed from CI so releases can be finalised only after the docs and Rust client are published.

Verification

Local verification before opening this PR:

cargo fmt --all -- --check
cargo test --workspace
cargo publish --dry-run --allow-dirty -p anvil-storage
cargo run -p anvil-documentation -- check --project-dir documentation --release
cargo run -p anvil-documentation -- build --project-dir documentation --release
git diff --check

Independent security review confirmed the release-blocking findings are addressed:

  • bearer tokens are redacted and marked sensitive in the Rust client;
  • public Rust client API/proto no longer exposes internal node-to-node shard service messages;
  • SigV4 requests have bounded freshness checks;
  • private object GET and HEAD paths check read authorisation before metadata lookup.
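
A bounded freshness check for signed requests can be sketched as a symmetric clock-skew test on the signing timestamp. The 15-minute bound below mirrors the tolerance AWS S3 applies to SigV4 requests; the bound Anvil actually uses is not stated in this PR, and `MAX_SKEW` and `is_fresh` are hypothetical names.

```rust
use std::time::{Duration, SystemTime};

/// Assumed tolerance (AWS S3 uses 15 minutes); Anvil's real bound may differ.
pub const MAX_SKEW: Duration = Duration::from_secs(15 * 60);

/// Returns true when the signing time is within `max_skew` of `now`,
/// in either direction, so both replayed and future-dated requests are rejected.
pub fn is_fresh(signed_at: SystemTime, now: SystemTime, max_skew: Duration) -> bool {
    let skew = match now.duration_since(signed_at) {
        // Request was signed in the past: skew is its age.
        Ok(age) => age,
        // Request is dated in the future: the error carries the distance.
        Err(err) => err.duration(),
    };
    skew <= max_skew
}
```

A gateway would run a check of this shape before signature verification and reject stale requests outright, which limits how long a captured signed request can be replayed.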

zcourts merged commit ac4513c into main on Jun 29, 2026
1 check failed