[spine] Promote TripletGraph + AriGraph tissue to contract surface; add Spine trait #331

@AdaWorldAPI

Worker: A1 (ensemble: ontology spine, Phase 1 entry point)
Mnemonic IDs blocked by this: B1, B2, B3, B4, C1

Why

lance-graph is the obligatory spine (CLAUDE.md, line 4: "The obligatory spine — query engine, codec stack, semantic transformer, and orchestration contract"). The AriGraph tissue (TripletGraph + EpisodicMemory + GraphSensorium + NARS deduction) is the canonical knowledge representation of the spine — but today it lives in crates/lance-graph/src/graph/arigraph/ (the implementation crate), not in crates/lance-graph-contract/ (the contract crate that downstream consumers depend on).

Consequence: downstream consumers (q2, MedCare-rs, smb-office, lance-graph-callcenter) cannot speak the spine's native language without depending on the heavy lance-graph crate. Today this manifests concretely as q2 carrying its own duplicate NARS implementation — the "duplicate-thinking trap" called out in MedCare-rs CLAUDE.md, Architectural Commitment #4: "thinking only in lance-graph."

The contract crate already speaks about arigraph (contract::sensorium::GraphSignals is documented as "Produced by arigraph::sensorium::GraphSensorium::from_graph()") — but the producer types themselves are not on the contract surface. This issue closes that gap.

What

Promote the AriGraph tissue to the contract surface as canonical types and define a Spine trait that downstream crates implement in their own crates.

Concrete moves / re-exports into lance-graph-contract

Confirmed canonical names from crates/lance-graph/src/graph/arigraph/:

| Type | Currently lives in | Proposed contract module |
|---|---|---|
| Triplet | lance-graph::graph::arigraph::triplet_graph | lance_graph_contract::triplet |
| TripletGraph | lance-graph::graph::arigraph::triplet_graph | lance_graph_contract::triplet |
| Episode | lance-graph::graph::arigraph::episodic | lance_graph_contract::episodic |
| EpisodicMemory | lance-graph::graph::arigraph::episodic | lance_graph_contract::episodic |
| GraphSensorium (concrete) | lance-graph::graph::arigraph::sensorium | merge into contract::sensorium alongside existing GraphSignals DTO |
| NodeId, EdgeRef | (TBD; likely synthesized, see Open Q below) | lance_graph_contract::triplet |

Two viable shapes (pick one in design pass, not blocking):

  1. Full move — types live in contract; lance-graph re-exports for compat.
  2. Mirror + From/Into — contract owns wire-stable DTOs; lance-graph keeps richer impls and round-trips via From/Into. Aligns better with zero-dep contract goal in lance-graph-contract/src/lib.rs.
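A minimal sketch of shape 2, to make the trade-off concrete. Every name here (TripletDto, RichTriplet, the scratch field) is hypothetical and not taken from the existing crates; it only illustrates how a slim contract DTO and a richer implementation type could round-trip via From/Into.

```rust
use std::collections::HashMap;

/// Hypothetical wire-stable DTO that would live in the contract crate.
#[derive(Debug, Clone, PartialEq)]
pub struct TripletDto {
    pub subject: String,
    pub predicate: String,
    pub object: String,
}

/// Hypothetical richer type that would stay in the implementation crate.
#[derive(Debug, Clone, PartialEq)]
pub struct RichTriplet {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    /// Implementation-only state that never crosses the contract boundary.
    pub scratch: HashMap<String, String>,
}

impl From<RichTriplet> for TripletDto {
    fn from(t: RichTriplet) -> Self {
        // Implementation-only state is dropped on the way out.
        TripletDto {
            subject: t.subject,
            predicate: t.predicate,
            object: t.object,
        }
    }
}

impl From<TripletDto> for RichTriplet {
    fn from(d: TripletDto) -> Self {
        // Implementation-only state starts empty on the way in.
        RichTriplet {
            subject: d.subject,
            predicate: d.predicate,
            object: d.object,
            scratch: HashMap::new(),
        }
    }
}
```

Under this shape the contract crate compiles only TripletDto; the cost is that any implementation-only field (scratch here) is lost across the boundary and has to be documented as such.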

Define Spine trait

// lance_graph_contract::spine
pub trait Spine {
    /// Project this tissue's facts into the canonical TripletGraph spine.
    /// Lossless for facts the spine knows how to represent.
    fn project_into(&self, target: &mut TripletGraph) -> Result<(), SpineError>;

    /// Round-trip self-check: project_into + reconstruct must equal self
    /// for the subset of state the contract cares about.
    fn verify_roundtrip(&self) -> Result<(), SpineError>;
}

Minimum surface — only those two methods. Per-flesh extensions (e.g. clinical hooks, billing hooks, transcode hooks) are added in the impl crates, not the contract.
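For illustration, here is a sketch of what a downstream impl could look like. The Triplet, TripletGraph and SpineError definitions below are simplified stand-ins so the example compiles on its own; the real contract types may differ. InvoiceLedger, the "has_invoice" predicate and the RoundtripMismatch variant are hypothetical.

```rust
// Stand-ins for the proposed contract types; the real definitions may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Triplet {
    pub subject: String,
    pub predicate: String,
    pub object: String,
}

#[derive(Debug, Default, Clone, PartialEq)]
pub struct TripletGraph {
    pub triplets: Vec<Triplet>,
}

#[derive(Debug, PartialEq)]
pub enum SpineError {
    /// Hypothetical variant: reconstruction did not match the original.
    RoundtripMismatch,
}

pub trait Spine {
    fn project_into(&self, target: &mut TripletGraph) -> Result<(), SpineError>;
    fn verify_roundtrip(&self) -> Result<(), SpineError>;
}

/// Hypothetical flesh: a list of (customer, invoice id) pairs.
pub struct InvoiceLedger {
    pub invoices: Vec<(String, String)>,
}

impl InvoiceLedger {
    /// Rebuild the contract-relevant state from a projected graph.
    fn reconstruct(graph: &TripletGraph) -> Vec<(String, String)> {
        graph
            .triplets
            .iter()
            .filter(|t| t.predicate == "has_invoice")
            .map(|t| (t.subject.clone(), t.object.clone()))
            .collect()
    }
}

impl Spine for InvoiceLedger {
    fn project_into(&self, target: &mut TripletGraph) -> Result<(), SpineError> {
        // The flesh decides which of its facts become triplets.
        for (customer, invoice) in &self.invoices {
            target.triplets.push(Triplet {
                subject: customer.clone(),
                predicate: "has_invoice".to_string(),
                object: invoice.clone(),
            });
        }
        Ok(())
    }

    fn verify_roundtrip(&self) -> Result<(), SpineError> {
        // Project into a fresh graph, reconstruct, and compare.
        let mut graph = TripletGraph::default();
        self.project_into(&mut graph)?;
        if Self::reconstruct(&graph) == self.invoices {
            Ok(())
        } else {
            Err(SpineError::RoundtripMismatch)
        }
    }
}
```

The invoicing-specific logic stays entirely in the impl crate; the contract sees only the two trait methods.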

SoA DTO surface

Triplet shape is the canonical read shape. Columnar materialization (Arrow RecordBatch-aligned, zero-copy) is the canonical wire. Specifically:

  • Default DTO is SoA columnar (subject column, predicate column, object column, truth-frequency column, truth-confidence column, timestamp column).
  • No Vec<Triplet> (AoS) on the contract surface for bulk reads: that is a perf regression and breaks Arrow zero-copy.
  • Triplet (struct, AoS) is kept as the single-fact convenience shape; bulk operations go through SoA.
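The layout described above can be sketched with plain vectors. This is only an illustration of the column split: TripletColumns, its methods, and the field types (f32 truth values, u64 timestamp) are assumptions, and a real implementation would presumably back the columns with Arrow arrays rather than Vec to get zero-copy behavior.

```rust
/// Single-fact convenience shape (AoS). Field names and types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Triplet {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    pub truth_freq: f32,
    pub truth_conf: f32,
    pub timestamp: u64,
}

/// Hypothetical bulk shape (SoA): one vector per column, all the same length.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct TripletColumns {
    pub subject: Vec<String>,
    pub predicate: Vec<String>,
    pub object: Vec<String>,
    pub truth_freq: Vec<f32>,
    pub truth_conf: Vec<f32>,
    pub timestamp: Vec<u64>,
}

impl TripletColumns {
    /// Number of rows (facts) stored.
    pub fn len(&self) -> usize {
        self.subject.len()
    }

    /// Append one fact, splitting it across the columns.
    pub fn push(&mut self, t: Triplet) {
        self.subject.push(t.subject);
        self.predicate.push(t.predicate);
        self.object.push(t.object);
        self.truth_freq.push(t.truth_freq);
        self.truth_conf.push(t.truth_conf);
        self.timestamp.push(t.timestamp);
    }

    /// Read one row back as the single-fact shape, or None if out of range.
    pub fn row(&self, i: usize) -> Option<Triplet> {
        if i >= self.len() {
            return None;
        }
        Some(Triplet {
            subject: self.subject[i].clone(),
            predicate: self.predicate[i].clone(),
            object: self.object[i].clone(),
            truth_freq: self.truth_freq[i],
            truth_conf: self.truth_conf[i],
            timestamp: self.timestamp[i],
        })
    }
}
```

A scan over one attribute (for example, all truth_conf values) then touches a single contiguous column instead of striding through whole structs.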

Architecture

                 ┌─── lance-graph-contract ───────────────────────┐
                 │  TripletGraph, Triplet, NodeId, EdgeRef        │
                 │  EpisodicMemory, Episode                       │
                 │  GraphSensorium (concrete) + GraphSignals (DTO)│
                 │  Spine trait { project_into, verify_roundtrip }│
                 └──┬──────────────┬──────────────┬───────────────┘
                    │              │              │
       impl Spine for     impl Spine for     impl Spine for
       MedCareSpine       SmbOfficeSpine     CallcenterSpine
       (medcare-rs)       (smb-office)       (lance-graph-callcenter)
       — clinical flesh   — invoicing flesh   — Supabase realtime flesh
                    │              │              │
                    └──────────────┴──────────────┘
                                   │
                          ┌────────▼─────────┐
                          │ medcare-bridge   │ ← B4 (registry+orchestrator,
                          │ (unified bridge) │    NOT a god crate; contract
                          └──────────────────┘    only knows Spine impls exist)

The contract knows that Spine impls exist and round-trip. It does not know what flesh is attached. This is the inversion that unblocks every downstream consumer:

  • Inner ontology (medcare-rs clinical, smb-office invoicing) → impl Spine in their own crates.
  • Outer ontology (lance-graph-callcenter Supabase realtime transcode) → impl Spine in its own crate.
  • Unified bridge (B4) → registry+orchestrator over Box<dyn Spine>, not a monolith.
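A rough sketch of the registry+orchestrator idea, assuming the Spine trait stays object-safe as written. SpineRegistry, StaticFacts, and the simplified Triplet/TripletGraph/SpineError stand-ins are all hypothetical; the actual B4 design is out of scope here.

```rust
// Stand-ins for the proposed contract types; the real definitions may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Triplet {
    pub subject: String,
    pub predicate: String,
    pub object: String,
}

#[derive(Debug, Default, Clone, PartialEq)]
pub struct TripletGraph {
    pub triplets: Vec<Triplet>,
}

#[derive(Debug, PartialEq)]
pub enum SpineError {
    /// Hypothetical variant, unused in this sketch.
    RoundtripMismatch,
}

pub trait Spine {
    fn project_into(&self, target: &mut TripletGraph) -> Result<(), SpineError>;
    fn verify_roundtrip(&self) -> Result<(), SpineError>;
}

/// Hypothetical trivial flesh: a fixed list of facts.
pub struct StaticFacts(pub Vec<Triplet>);

impl Spine for StaticFacts {
    fn project_into(&self, target: &mut TripletGraph) -> Result<(), SpineError> {
        target.triplets.extend(self.0.iter().cloned());
        Ok(())
    }

    fn verify_roundtrip(&self) -> Result<(), SpineError> {
        // Nothing beyond the triplets themselves, so nothing can be lost.
        Ok(())
    }
}

/// Hypothetical registry: knows only that named Spine impls exist.
#[derive(Default)]
pub struct SpineRegistry {
    entries: Vec<(String, Box<dyn Spine>)>,
}

impl SpineRegistry {
    pub fn register(&mut self, name: &str, spine: Box<dyn Spine>) {
        self.entries.push((name.to_string(), spine));
    }

    /// Project every registered impl into one graph, in registration order.
    /// On failure, report which impl failed alongside the error.
    pub fn project_all(&self, target: &mut TripletGraph) -> Result<(), (String, SpineError)> {
        for (name, spine) in &self.entries {
            spine.project_into(target).map_err(|e| (name.clone(), e))?;
        }
        Ok(())
    }
}
```

The registry never names a concrete flesh type, which is the property that keeps it from becoming a god crate.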

Acceptance criteria

  • cargo check -p lance-graph-contract passes; the contract crate exports Triplet, TripletGraph, Episode, EpisodicMemory, GraphSensorium, Spine.
  • cargo check -p lance-graph passes after the move/re-export; arigraph tissue continues to compile against the contract types (proves single source of truth).
  • Round-trip test: TripletGraph populated → projected into a fresh TripletGraph via Spine::project_into → equality holds modulo documented lossy fields.
  • verify_roundtrip returns Ok(()) for TripletGraph's identity impl.
  • SoA columnar DTO has Arrow schema spec written down (subject/predicate/object/truth_freq/truth_conf/timestamp). No Vec<Triplet> in any new bulk-read API.
  • Downstream proof: q2 can delete its local NARS impl on the strength of this PR (only the import path changes; semantics unchanged). Verified by a follow-up dry-run patch in q2 referenced from B1.
  • Downstream proof: medcare-bridge (B4) can be specified on top of Spine without further contract changes.
  • contract::sensorium::GraphSignals doc-comment "Produced by arigraph::sensorium::GraphSensorium::from_graph()" is rewritten to point at the now-contract GraphSensorium.
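The round-trip criteria above could be exercised along these lines. This is a sketch of one possible identity impl, not the intended implementation: the types are simplified stand-ins, "documented lossy fields" are assumed to be empty, and RoundtripMismatch is a hypothetical error variant.

```rust
// Stand-ins for the proposed contract types; the real definitions may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Triplet {
    pub subject: String,
    pub predicate: String,
    pub object: String,
}

#[derive(Debug, Default, Clone, PartialEq)]
pub struct TripletGraph {
    pub triplets: Vec<Triplet>,
}

#[derive(Debug, PartialEq)]
pub enum SpineError {
    /// Hypothetical variant: the projected graph differs from the source.
    RoundtripMismatch,
}

pub trait Spine {
    fn project_into(&self, target: &mut TripletGraph) -> Result<(), SpineError>;
    fn verify_roundtrip(&self) -> Result<(), SpineError>;
}

/// Identity impl: a TripletGraph projects to a copy of itself.
impl Spine for TripletGraph {
    fn project_into(&self, target: &mut TripletGraph) -> Result<(), SpineError> {
        target.triplets.extend(self.triplets.iter().cloned());
        Ok(())
    }

    fn verify_roundtrip(&self) -> Result<(), SpineError> {
        // Project into a fresh graph and require exact equality.
        let mut fresh = TripletGraph::default();
        self.project_into(&mut fresh)?;
        if fresh == *self {
            Ok(())
        } else {
            Err(SpineError::RoundtripMismatch)
        }
    }
}
```

If lossy fields are later documented, the equality check in verify_roundtrip would need to compare only the fields the contract cares about.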

Out of scope

  • Actual q2 refactor to drop its local NARS impl → tracked as B1.
  • impl Spine for MedCareSpine (clinical flesh) → tracked as B2.
  • impl Spine for SmbOfficeSpine / CallcenterSpine (invoicing / transcode flesh) → tracked as B3.
  • Unified bridge crate (medcare-bridge registry+orchestrator) → tracked as B4.
  • Per-consumer wire format / transport (gRPC, JSON, Arrow Flight) → tracked as C1.

Dependencies

  • Blocks: B1 (q2 NARS deletion), B2 (medcare Spine impl), B3 (smb-office + callcenter Spine impls), B4 (unified bridge), C1 (wire transport).
  • Blocked by: nothing in this ensemble — Phase 1 entry point.

Open questions for reviewers

  1. NodeId / EdgeRef — these names appear in the architectural brief but I did not find them as free-standing types in arigraph::triplet_graph (entities are addressed by String name; indices into triplets: Vec<Triplet> serve as EdgeRef-equivalent). Reviewers: should we synthesize first-class NodeId(Fingerprint?)/EdgeRef types as part of this promotion, or leave the contract surface using String + usize as today and revisit in B-phase?
  2. Full move vs mirror + From/Into: lance-graph-contract/src/lib.rs advertises itself as a "zero-dependency trait crate." A full move of TripletGraph (a HashMap-heavy concrete struct) into the contract drags std::collections into the contract, which is fine, but it also means every consumer compiles the concrete data structure. Mirror + From/Into keeps the contract slim. Recommend the mirror route, but this needs an explicit decision.
  3. Sensorium overlap: contract::sensorium::GraphSignals already exists (normalized DTO). The concrete arigraph::sensorium::GraphSensorium has the same fields plus raw counts (active_triplets, total_entities, contradictions). Should we promote the concrete one and deprecate GraphSignals, or keep both with a From<GraphSensorium> for GraphSignals impl? Recommend the From impl path.
  4. Scope of Spine::project_into semantics — the brief specifies signature only. What does "project" mean for a MedCareSpine whose flesh is mostly not triplets (e.g. structured FHIR observations)? Proposal: project means "emit the triplet view of self" — flesh chooses what becomes a triplet. Reviewers should confirm this informal semantics is enough for the contract surface, or if we need a stricter formal spec.
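To make the recommendation in question 3 concrete, here is a sketch of the From impl path. The raw count names (active_triplets, total_entities, contradictions) come from the description above; the shared normalized field contradiction_rate is a hypothetical placeholder, since the actual GraphSignals fields are not listed in this issue.

```rust
/// Stand-in for the concrete sensorium: shared fields plus raw counts.
#[derive(Debug, Clone, PartialEq)]
pub struct GraphSensorium {
    /// Hypothetical normalized field shared with GraphSignals.
    pub contradiction_rate: f32,
    pub active_triplets: usize,
    pub total_entities: usize,
    pub contradictions: usize,
}

/// Stand-in for the normalized DTO already on the contract surface.
#[derive(Debug, Clone, PartialEq)]
pub struct GraphSignals {
    /// Hypothetical normalized field; the real DTO may carry others.
    pub contradiction_rate: f32,
}

impl From<GraphSensorium> for GraphSignals {
    fn from(s: GraphSensorium) -> Self {
        // Shared fields are copied; raw counts are dropped.
        GraphSignals {
            contradiction_rate: s.contradiction_rate,
        }
    }
}
```

With this shape existing GraphSignals consumers keep working unchanged, and only callers that need raw counts depend on the concrete GraphSensorium.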
