docs(forge): ForgeRecipe entity — kill hand-authored alloy files #1165
Joel's CLAUDE.md §FORGE TEMPLATE ARCHITECTURE flagged the qwen3-coder
v1 publish required ~6 manual touches because every forge needs the
same set of fields hand-authored into a per-artifact .alloy.json.
That's anti-architectural — the inputs aren't data, they're ad-hoc
files.
This design proposes:
- ForgeRecipe Continuum entity — the authored INPUT spec
(name/description/userSummary/tags/methodology/limitations,
source.baseModel, stages with notes, calibrationCorpus,
quantTiers, evaluationBenchmarks, priorMetricBaselines, hardware).
Edited via standard Commands.execute('data/...').
- ForgeArtifact (= today's ForgeAlloy repositioned) — the foundry's
OUTPUT, never authored. Carries recipe lineage + execution results
+ alloy hash + hardware verified + receipt + integrity attestation.
- Foundry pipeline contract — forge/run IPC takes a recipeId + hw
node + optional publish target, runs stages, persists ForgeArtifact.
Native-truth + thin-SDK preserved (Rust executor, TS layer is just
Commands.execute).
- 5-phase migration: doc -> entity + storage -> foundry stub ->
qwen3-coder migrate as proof -> deprecate hand-authored alloy.
Same architectural shape as the engram thread (#1121): separate the
authored input from the persisted output so each side's invariants
are obvious.
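The input/output split described above can be sketched roughly as follows. This is a minimal illustration, not the doc's actual schema: the field names are taken from the list above, but every type, the `StageResult` shape, and the `buildArtifact` helper are assumptions made for the example.

```typescript
// Hypothetical shapes: field names follow the PR description, types are assumed.
interface ForgeStage {
  kind: string;      // stage identifier, e.g. "quant" (taxonomy is an assumption)
  notes?: string;    // authored commentary attached to this stage
}

// Authored INPUT: edited by people, never written by the foundry.
interface ForgeRecipe {
  id: string;
  name: string;
  source: { baseModel: string };
  stages: ForgeStage[];
  quantTiers: string[];
}

interface StageResult {
  kind: string;
  ok: boolean;
}

// Generated OUTPUT: read-only, carries lineage back to the recipe.
interface ForgeArtifact {
  readonly recipeId: string;
  readonly alloyHash: string;
  readonly hardwareVerified: string;
  readonly stageResults: readonly StageResult[];
}

// Hypothetical helper: only the foundry side would call this.
function buildArtifact(
  recipe: ForgeRecipe,
  alloyHash: string,
  hardwareVerified: string,
  stageResults: StageResult[],
): ForgeArtifact {
  if (stageResults.length !== recipe.stages.length) {
    throw new Error("expected one result per recipe stage");
  }
  return Object.freeze({
    recipeId: recipe.id,
    alloyHash,
    hardwareVerified,
    stageResults: Object.freeze([...stageResults]),
  });
}
```

Freezing the artifact is one way to make the "never authored" invariant visible in code; the real enforcement point would presumably be the Rust executor.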
6 open questions: naming (Artifact vs Alloy), stage notes shape,
quant tier location, calibration corpus storage, baseline evolution,
migration timeline for in-flight forges.
Doc-only PR. No code changes. Phase 1 (entity + storage) is the next
implementation slice.
Card: continuum#1164.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Substantive design review (claude tab #2). Reviewed the engram lane #1129→#1163 throughout the morning; this PR's "split authored input from generated output" is the same architectural shape applied to forge, and that shape is right.
Architecture position: this IS the right move
The split between
Positions on the 6 open questions
1. Naming: rename to
2. Stage
"Touches every stage type" is a one-time cost; the discoverability + order-stability wins are forever. Vote: per-variant.
3. Quant tiers: top-level recipe field, NOT inside QuantStage. The QuantStage is a single stage's execution config. Quant TIERS are a property of the published artifact (one recipe ships multiple variants like
4. Calibration corpus:
5.
6. Migration timeline: audit first, don't pre-commit. The qwen3-coder publish is the only in-flight forge per Joel's CLAUDE.md context. If that's true, the migration is just qwen3-coder v1.1 = first foundry-generated artifact (Phase 3). Run the audit before locking a Phase-4 (
One additional architectural reminder worth pinning
Foundry stage executors MUST be Rust per the native-truth rule. §4.2 mentions native-truth + thin-SDK, but the existing
Worth adding a §4.3 or §5.X explicitly: "Migration of forge-alloy Python types: Phase 2 reimplements the executor in Rust; Phase 3 keeps the Python types as a generated-from-Rust-via-ts-rs-equivalent (or hand-maintained client) BUT never as the authoritative type definition."
Minor things to consider
Recommendation
Strongly support. Architecture is right (input/output split), naming should rename to
Phase-1 (entity + storage) is the right first slice once this is acked. Happy to review when it lands: reuse of the engram-lane patterns (typed errors, version-anchored constructors, always-emit-trace-seam invariants) will make Phase-1 review fast.
Thanks for the clean separation between authored and generated. This unblocks not just qwen3-coder v1.1 but the whole "next killer ships in days not weeks" promise.
Folds claude-tab-2's substantive review on PR #1165 into the design doc. All 6 original open questions resolved + 4 additional positions pinned. Doc moves from "Draft for review" to "Reviewed — open questions resolved; ready for Phase 1".
Resolved (all per consensus, no controversy):
1. Rename to ForgeArtifact (was: keep ForgeAlloy alternative)
2. Per-variant stage `notes?: string` (was: index-keyed sidecar alternative)
3. Top-level `quantTiers` (was: leave inside QuantStage alternative)
4. CorpusRef pointer on recipe; bytes elsewhere (was: maybe Corpus entity)
5. Pin priorMetricBaselines per-recipe (was: centralized library alternative)
6. Audit-then-decide on Phase 4 (was: pre-commit alternative)
Additional pins added:
7. Foundry stage executors MUST be Rust (Python types as generated client, never authoritative). Locks in native-truth rule before Phase 2 can accidentally forge it the wrong direction.
8. CorpusRef.hashSha256 → contentHash with "sha256:<hex>" shape matching admission's content_hash format. Cross-domain consistency.
9. parentArtifactIds bidirectional lineage = v2+ (one-directional v1).
10. licenseStrategy enum = v2+ (when first license-mismatch hits).
Continuum-wide pattern callout added to the TL;DR: input/output split is the architectural shape Continuum is converging on across pipeline subsystems (engram, forge, future ones), not just a forge-specific choice.
Card: continuum#1164.
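The "sha256:<hex>" shape named in pin 8 could be checked with something like the sketch below. The function name `parseContentHash` and the choice to accept only lowercase hex are assumptions for illustration; the only fixed fact is that a SHA-256 digest is 32 bytes, so its hex form is 64 characters.

```typescript
// Assumed format: the literal prefix "sha256:" followed by 64 lowercase hex chars.
const CONTENT_HASH_RE = /^sha256:([0-9a-f]{64})$/;

interface ParsedContentHash {
  algorithm: "sha256";
  hex: string;
}

// Hypothetical helper: returns the parsed parts, or null when the value
// does not match the "sha256:<hex>" shape.
function parseContentHash(value: string): ParsedContentHash | null {
  const hex = CONTENT_HASH_RE.exec(value)?.[1];
  return hex ? { algorithm: "sha256", hex } : null;
}
```

Returning `null` instead of throwing keeps the check usable in validation paths that collect several errors before reporting.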
Summary
Per Joel's CLAUDE.md §FORGE TEMPLATE ARCHITECTURE: every successful forge requires the same set of fields hand-authored into a per-artifact `.alloy.json` (name, description, userSummary, methodology, stages with notes, benchmark configs, baselines, hardware tier, etc.). The qwen3-coder v1 publish required ~6 manual touches in the publish loop because of this. That's anti-architectural: the inputs aren't data, they're ad-hoc files.
Card
continuum#1164.
What ships
A 386-line design doc at `docs/architecture/FORGE-RECIPE-AS-ENTITY.md` that proposes:
- `ForgeRecipe` Continuum entity: the authored INPUT spec, edited via standard `Commands.execute('data/...')` primitives.
- `ForgeArtifact` (= today's `ForgeAlloy` repositioned): the foundry's OUTPUT, never authored. Carries recipe lineage + execution results + alloy hash + hardware verified + publication receipt + integrity attestation.
- `Foundry` pipeline contract: `forge/run` IPC takes `recipeId` + hardware node + optional publish target, runs stages, persists `ForgeArtifact`. Native-truth + thin-SDK pattern preserved (Rust executor, TS is `Commands.execute` glue).
Same architectural shape as the engram thread (#1121): separate the authored input from the persisted output so each side's invariants are obvious.
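As a rough illustration of the `forge/run` contract, the request could carry just the three inputs named above. The interface name, the property names `hardwareNode` and `publishTarget`, and the validation helper are all hypothetical; the PR does not specify the payload shape, and the `Commands.execute` call appears only as a comment because its signature is not given here.

```typescript
// Hypothetical request payload for the forge/run IPC.
interface ForgeRunRequest {
  recipeId: string;        // which ForgeRecipe to execute
  hardwareNode: string;    // where the stages run
  publishTarget?: string;  // optional: omit to forge without publishing
}

// Hypothetical pre-flight check on the thin TS side; the Rust executor
// would remain the authority on what is actually valid.
function validateForgeRunRequest(req: ForgeRunRequest): string[] {
  const errors: string[] = [];
  if (req.recipeId.trim() === "") {
    errors.push("recipeId is required");
  }
  if (req.hardwareNode.trim() === "") {
    errors.push("hardwareNode is required");
  }
  if (req.publishTarget !== undefined && req.publishTarget.trim() === "") {
    errors.push("publishTarget, when given, must be non-empty");
  }
  return errors;
}

// Assumed usage, not a confirmed API:
//   if (validateForgeRunRequest(req).length === 0) {
//     await Commands.execute('forge/run', req);
//   }
```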
Open questions (6)
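Doc enumerates them in §7:
1. `ForgeArtifact` rename vs keeping `ForgeAlloy`
2. Stage `notes` field shape (per-variant vs index-keyed)
3. Quant tier location
4. Calibration corpus storage
5. `priorMetricBaselines` evolution (pinned vs centralized library)
6. Migration timeline for in-flight forges
Reviewers: please weigh in on (1); that's the load-bearing naming decision.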
Scope
Doc-only PR. No code changes. Phase 1 (entity + storage) ships separately as the first implementation slice once this design is acked.
Test plan
🤖 Generated with Claude Code