Fix Populace release and export gates #37

Merged: MaxGhenis merged 4 commits into main from codex/fix-open-populace-issues-20260614 on Jun 14, 2026
@MaxGhenis

Summary

This is the live-Populace cleanup after retiring Microplex/eCPS from the production repo surface. It folds the release contract/latest-pointer/calibration-diagnostics draft PRs into one branch and adds release/export gates that should block bad artifacts before publication.

Fixes #9, #10, #11, #18, #24, #25, and #36.
Refs #27, #28, #29, #33, and #34.
Supersedes draft PRs #12, #13, and #14 once merged.

What changed

  • require and validate build_manifest.json, release_manifest.json, and calibration_diagnostics.json before publication
  • publish latest.json last, validate exact contract paths on read, and reject partial/invalid uploads before touching the Hub
  • reject non-standard JSON constants like NaN in release contract files
  • add calibration diagnostics serialization for per-target rows, loss trajectory, skipped targets, and options
  • make PolicyEngine-US export fail before writing when formula-owned variables reach the output surface
  • add enum-domain validation so PE enum inputs cannot ship raw source codes
  • add export/source-coverage/SPM diagnostic gate helpers for the US target surface
  • support schema-v1 release manifests in TRACE, including per-artifact repo_id
  • update current PE-US support to policyengine-us==1.729.0; keep policyengine-us-data<1.115.4 because 1.115.4 still pins the older engine metadata
  • keep eCPS comparison/parity work out of live Populace; that belongs in PolicyEngine/populace-benchmarks
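
As an illustration of the "reject non-standard JSON constants" item above, here is a minimal sketch of a strict reader. It is not the code from this PR: `load_strict_json` and `_reject_constant` are hypothetical names, and the sketch only assumes Python's standard `json` module, whose `parse_constant` hook is called for the literals `NaN`, `Infinity`, and `-Infinity`.

```python
import json


def _reject_constant(name: str) -> float:
    # json.loads calls this hook only for the literals NaN, Infinity, -Infinity.
    raise ValueError(f"non-standard JSON constant: {name}")


def load_strict_json(text: str) -> object:
    """Parse JSON text, refusing constants that strict JSON does not allow.

    Hypothetical helper for illustration; the real contract reader may differ.
    """
    return json.loads(text, parse_constant=_reject_constant)


if __name__ == "__main__":
    print(load_strict_json('{"loss": 0.25}'))
    try:
        load_strict_json('{"loss": NaN}')
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Raising from the hook, rather than mapping the constant to `None`, keeps the read side a gate: a contract file containing `NaN` is treated as invalid instead of being silently repaired.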

Validation

  • focused release/TRACE tests: 47 passed
  • Python 3.13: uv sync --python 3.13 --all-packages --all-extras --dev; ruff check .; pytest => 459 passed, 2 skipped
  • Python 3.14: same => 459 passed, 2 skipped
  • installed-wheel CI gate for Python 3.13 and 3.14 => 410 passed, 6 skipped each
  • git diff --check
  • no live references from rg -i "microplex|\\becps\\b|enhanced_cps|enhanced CPS|sound_ecps|mp[-_]?ecps|populace-score-work|score_driver" .
  • /cycle read-only review: final pass had no actionable findings

Notes

This intentionally does not merge PR #21's eCPS parity runner or PR #22's old build_us_candidate.py patch. #21 is benchmark-harness material now, and #22 targets a deleted historical build path; the partnership/S-corp values work should be ported to the current Populace build/profile surface separately.

PavelMakarchuk and others added 4 commits June 14, 2026 14:46
…e producer

The releases already on policyengine/populace-us disagree with each other:
1abddeb has no build_manifest.json and an unversioned release_manifest
schema, while later releases carry schema_version 1. Every consumer was
left to implement its own defensive filter.

validate_release_dir() is the single gate: required files, manifest schema
version, and build-id agreement between both manifests and the directory
name — raising ReleaseContractError that names every violation at once.
Verified against the real 9f1260b release (passes) and the real 1abddeb
release (rejected with six named failures).
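
The "name every violation at once" behaviour described above can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's implementation: the function is named `validate_release_dir_sketch` to mark it as hypothetical, the `build_id` key is an assumed field name, and applying `schema_version` to both manifests is also an assumption.

```python
import json
from pathlib import Path

# Assumed subset of the required files, for illustration only.
REQUIRED_FILES = ("build_manifest.json", "release_manifest.json")


class ReleaseContractError(Exception):
    """Carries every contract violation found, not just the first."""

    def __init__(self, violations):
        self.violations = list(violations)
        super().__init__("; ".join(self.violations))


def validate_release_dir_sketch(release_dir: Path) -> None:
    violations = []
    for name in REQUIRED_FILES:
        path = release_dir / name
        if not path.is_file():
            violations.append(f"missing {name}")
            continue
        try:
            manifest = json.loads(path.read_text())
        except json.JSONDecodeError:
            violations.append(f"{name}: not valid JSON")
            continue
        if not isinstance(manifest, dict):
            violations.append(f"{name}: top level is not an object")
            continue
        if manifest.get("schema_version") != 1:
            violations.append(f"{name}: schema_version is not 1")
        # "build_id" is an assumed key; it must agree with the directory name.
        if manifest.get("build_id") != release_dir.name:
            violations.append(f"{name}: build_id does not match directory name")
    if violations:
        # Raise once, after all checks, so the caller sees the full list.
        raise ReleaseContractError(violations)
```

Collecting into a list and raising once at the end is what lets a single run report several failures for one bad release, instead of forcing a fix-and-rerun loop per violation.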

Fixes #11.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The Hub repo had no way to say which release is current: consumers listed
the releases/ tree and guessed, and same-day builds (the ids end in a
date) have no defined ordering.

publish_release() validates the release directory against the contract
(an invalid release uploads nothing), uploads its files, and writes the
latest.json pointer LAST so a reader never sees a pointer to files that
are not there yet. latest_release() is the consumer side: one call that
returns the typed pointer. The Hub client is injected so the suite tests
the real ordering against a fake, network-free.
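
A rough sketch of the pointer-last ordering with an injected client, under stated assumptions: `publish_release_sketch`, `FakeHub`, the `upload(path, data)` method, the `releases/<build_id>/` layout, and the pointer's JSON shape are all hypothetical stand-ins, not the PR's actual API.

```python
import json
from typing import Callable, Dict, List


class FakeHub:
    """Network-free stand-in that records upload order (hypothetical)."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def upload(self, path: str, data: bytes) -> None:
        self.calls.append(path)


def publish_release_sketch(
    hub,
    build_id: str,
    files: Dict[str, bytes],
    validate: Callable[[Dict[str, bytes]], None],
) -> None:
    # Validate first: if this raises, nothing has been uploaded yet.
    validate(files)
    for name in sorted(files):
        hub.upload(f"releases/{build_id}/{name}", files[name])
    # The pointer goes last, so a reader never follows it to missing files.
    pointer = json.dumps({"build_id": build_id}).encode("utf-8")
    hub.upload("latest.json", pointer)


if __name__ == "__main__":
    hub = FakeHub()
    files = {"build_manifest.json": b"{}", "release_manifest.json": b"{}"}
    publish_release_sketch(hub, "example-build", files, lambda f: None)
    print(hub.calls)
```

Because the client is passed in, a test can assert on `hub.calls` directly: the last entry should be the pointer, and a failing validator should leave the list empty.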

Fixes #9. Stacked on #11's contract branch.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…he artifact

CalibrationResult already carries everything an auditor needs — per-target
initial/final estimates and tolerance verdicts, the per-epoch loss
trajectory, every skipped target with its reason, and the solver options
actually used — but none of it left the build machine: the build pushed
diagnostics to telemetry and dropped them, and the published .npz kept
only closing scalars. "Skipped and reported, never dropped silently" is
only true if the report ships.

diagnostics_payload() renders the result as strict JSON (non-finite
floats become null; the writer passes allow_nan=False so anything that
escapes the scrub fails loudly), and write_calibration_diagnostics()
writes the calibration_diagnostics.json a release directory publishes
next to its manifests.
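
The scrub-then-strict-write pattern described above might look like the following sketch. `scrub_non_finite` and `dumps_strict` are hypothetical names and the payload keys are invented for the example; only `json.dumps(..., allow_nan=False)` and `math.isfinite` are standard-library behaviour.

```python
import json
import math


def scrub_non_finite(value):
    """Recursively replace NaN and +/-inf floats with None (JSON null)."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: scrub_non_finite(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [scrub_non_finite(item) for item in value]
    return value


def dumps_strict(payload) -> str:
    # allow_nan=False makes json.dumps raise ValueError if any non-finite
    # float survives the scrub, so a bug fails loudly instead of writing NaN.
    return json.dumps(scrub_non_finite(payload), allow_nan=False, sort_keys=True)


if __name__ == "__main__":
    example = {"loss_trajectory": [1.5, float("nan"), 0.25], "final_loss": float("inf")}
    print(dumps_strict(example))
    # prints {"final_loss": null, "loss_trajectory": [1.5, null, 0.25]}
```

The two layers are deliberately redundant: the scrub handles the expected cases, and `allow_nan=False` catches values in container types the scrub does not walk.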

Fixes #10.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>