feat(taxonomy): versioned artifact + lineage + handshake groundwork (item 4 phase 1) #49
Merged
…undwork (item 4 phase 1)
Establishes the versioned shared-taxonomy artifact contract (the safe,
additive half of item 4). The content reconciliation between Local's
curated 26-skill set and Cloud's 618-skill O*NET superset is product-
shaping and is GATED on a maintainer decision — see
docs/taxonomy-artifact.md §6.
- taxonomy.json is now a versioned envelope { version, lineage, skills }
(v1, lineage 'local-curated'). The loader accepts both the envelope and
the legacy bare array (reports version 0), so it's backward-compatible.
- load_taxonomy_version() + TAXONOMY_LINEAGE expose the client side of the
version handshake. Lineage discriminates WHICH taxonomy this is, so a
client never mistakes two lineages that share a version number for being
aligned (Local 'local-curated' v1 != Cloud 'cloud-onet' v1).
- docs/taxonomy-artifact.md specifies the format, the hybrid
distribution + handshake, the measured reconciliation analysis
(26 vs 618, 24 overlap, 2 naming mismatches, edge-model + UUID-preserve
constraints), and the phased adoption plan with the human gate at the
content flip.
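A minimal sketch of the dual-format parse described in the first bullet, for readers who want to see the shape of the contract. The `Taxonomy` tuple, the `parse_taxonomy` name, the `{"name": ...}` skill entries, and the choice to report `None` as the lineage of a legacy array are all assumptions made for illustration; the actual loader in this PR may differ.

```python
import json
from typing import Any, NamedTuple, Optional


class Taxonomy(NamedTuple):
    """Hypothetical in-memory form of a parsed taxonomy.json."""

    version: int
    lineage: Optional[str]
    skills: list[Any]


def parse_taxonomy(text: str) -> Taxonomy:
    """Accept either the versioned envelope or the legacy bare array."""
    raw: Any = json.loads(text)
    if isinstance(raw, list):
        # Legacy bare array: no envelope, so it reports version 0.
        # Using None for the lineage here is an assumption of this sketch.
        return Taxonomy(version=0, lineage=None, skills=raw)
    if isinstance(raw, dict):
        # Versioned envelope: { version, lineage, skills }.
        return Taxonomy(
            version=int(raw["version"]),
            lineage=raw.get("lineage"),
            skills=list(raw["skills"]),
        )
    raise ValueError("taxonomy.json must be an envelope object or a bare array")


envelope = parse_taxonomy(
    '{"version": 1, "lineage": "local-curated", "skills": [{"name": "Python"}]}'
)
legacy = parse_taxonomy('[{"name": "Python"}]')
```

Both calls yield the same `skills` list; only the reported version and lineage differ, which is what keeps the change additive for existing files.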
Cloud's matching server-side handshake (meta.taxonomy) ships separately.
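To illustrate why the handshake compares lineage as well as version, here is a hedged sketch of a client-side check. It assumes the server response carries `meta.taxonomy` as a `{ version, lineage }` object; the `taxonomy_aligned` function and the `CLIENT_*` constants are hypothetical names, and what a client does on a mismatch is not specified here.

```python
from typing import Any

# Hypothetical client-side values; the real ones would come from
# load_taxonomy_version() and TAXONOMY_LINEAGE.
CLIENT_VERSION = 1
CLIENT_LINEAGE = "local-curated"


def taxonomy_aligned(meta: dict[str, Any]) -> bool:
    """Return True only when both lineage and version match the client's."""
    server = meta.get("taxonomy")
    if not isinstance(server, dict):
        # No handshake data (e.g. an older server): treat as not aligned.
        return False
    return (
        server.get("lineage") == CLIENT_LINEAGE
        and server.get("version") == CLIENT_VERSION
    )


same = taxonomy_aligned({"taxonomy": {"version": 1, "lineage": "local-curated"}})
other_lineage = taxonomy_aligned({"taxonomy": {"version": 1, "lineage": "cloud-onet"}})
```

Here `other_lineage` is `False` even though the version numbers are equal, which is the Local v1 != Cloud v1 case called out above.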
Tests: 3 new (envelope + legacy-array parse); full suite 625 passed; ruff clean.
CI's mypy (2.1.0, strict generics) rejected the bare `list | dict` annotations on the new _load_raw/_entries_from_raw/_version_from_raw helpers (Missing type arguments). Parsed JSON is genuinely unknown-shape, so Any is the correct annotation — the isinstance checks narrow it. No runtime change. (I'd skipped mypy locally for the prior commit; CI caught it.)
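The annotation change can be pictured with a small sketch. Under mypy's strict settings, a bare `list | dict` parameter annotation is reported as missing type arguments, whereas `Any` is accepted and each `isinstance` check narrows it inside its branch. The function below is a hypothetical stand-in, not the PR's `_version_from_raw`, and treating a missing or non-integer version as `0` is an assumption of this sketch.

```python
from typing import Any


def sketch_version_from_raw(raw: Any) -> int:
    """Derive a version number from parsed JSON of unknown shape."""
    if isinstance(raw, dict):
        # Inside this branch the checker treats `raw` as a dict.
        version = raw.get("version", 0)
        # Assumption: anything that is not an int falls back to 0.
        return version if isinstance(version, int) else 0
    # Legacy bare array (or anything else) carries no version field.
    return 0
```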
Summary
Item 4, phase 1: the safe, additive half of the versioned shared-taxonomy artifact. It establishes the format, the versioning, and the client side of the version handshake. The content reconciliation between Local's curated 26-skill set and Cloud's 618-skill O*NET superset is product-shaping and explicitly gated on a maintainer decision; see docs/taxonomy-artifact.md §6.

What ships here (no skill-set change)
- `taxonomy.json` is now a versioned envelope `{ version, lineage, skills }` (v1, `local-curated`). The loader accepts both the envelope and the legacy bare array (reports version `0`), so it is backward-compatible.
- `load_taxonomy_version()` + `TAXONOMY_LINEAGE` expose the client side of the handshake. Lineage discriminates which taxonomy this is, so a client never mistakes two lineages sharing a version number for being aligned (Local `local-curated` v1 ≠ Cloud `cloud-onet` v1).
- `docs/taxonomy-artifact.md`: the format spec, the hybrid distribution + handshake protocol, the measured reconciliation analysis (26 vs 618 skills, 24 overlapping names, 2 naming mismatches `AWS`/`Vue`, the edge-model difference, and the Local-UUID-preservation constraint for existing vaults), and a phased adoption plan whose content-flip step is the human gate.

What's gated (not in this PR)
The canonical-artifact generator and the flip of production consumers, because that changes the shipped Local package's skill set and Cloud's production `skill_taxonomy`, which moves matching scores and touches existing data. The recommended approach is in the doc; it needs a maintainer to review the generated skill set before adoption.

Cloud's matching server-side handshake (`meta.taxonomy = { version, lineage }`) ships as a paired PR in traitprint-cloud.

Testing
cli.py:38 baseline error only).

https://claude.ai/code/session_01PsAQUnoLH94f2cbK2dSpox
Generated by Claude Code