Fix PUF real-half QRF imputation #1170
Rebased onto current main and integrated with the Forbes/top-tail training exclusion that landed since the original branch point:
- X_train_real now flows through the same non_forbes_mask filtering as X_train_full/X_train_override, so synthetic Forbes and metadata-missing top-tail donors stay out of the weighted real-half training set; the positive-weight filter moves below the Forbes block to keep the person-level mask index-aligned.
- TestForbesTrainingExclusion fakes and assertions updated for the four sequential-QRF calls (clone full/override + real full/override).
- Adds the towncrier fragment.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
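The ordering point above (Forbes mask first, positive-weight filter second) can be illustrated with a minimal sketch. This is not the code in puf_impute.py: the data values and the apply_mask helper are hypothetical, and only the names non_forbes_mask and X_train_real echo the commit message. The idea is that a person-level mask only lines up while the arrays still have one entry per person.

```python
# Hypothetical person-level training data; values are illustrative only.
income = [10.0, 20.0, 30.0, 40.0, 50.0]
weight = [1.0, 0.0, 2.0, 3.0, 0.0]
# Person-level mask: True for donors that are not Forbes/top-tail rows.
non_forbes_mask = [True, True, False, True, True]


def apply_mask(values, mask):
    """Keep values where mask is True; refuse masks of a different length."""
    if len(values) != len(mask):
        raise ValueError("mask and values are not index-aligned")
    return [v for v, keep in zip(values, mask) if keep]


# Step 1: apply the person-level mask while there is still one row per person.
income_kept = apply_mask(income, non_forbes_mask)
weight_kept = apply_mask(weight, non_forbes_mask)

# Step 2: only then drop rows with non-positive weight.
positive = [w > 0 for w in weight_kept]
x_train_real = apply_mask(income_kept, positive)
w_train_real = apply_mask(weight_kept, positive)

# Reversed order: filtering by weight first shortens the data, so the
# full-length person-level mask can no longer be applied.
income_positive_first = apply_mask(income, [w > 0 for w in weight])
try:
    apply_mask(income_positive_first, non_forbes_mask)
    reversed_order_ok = True
except ValueError:
    reversed_order_ok = False
```

With array libraries, a length mismatch like this typically surfaces as an indexing error, which is presumably why the positive-weight filter was moved below the Forbes block.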
fa0a141 to 82cb3a4
Satisfies the PolicyEngine US freshness gate, which fails repo-wide while the lock pins 1.715.3.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Summary
Validation
- uv run ruff check policyengine_us_data/calibration/puf_impute.py tests/unit/calibration/test_calibration_puf_impute.py validation/puf_qrf_real_half_smoke.py
- uv run pytest tests/unit/calibration/test_calibration_puf_impute.py
- uv run pytest tests/unit/test_puf_impute.py tests/unit/test_extended_cps.py tests/unit/calibration/test_retirement_imputation.py
- uv run python -m compileall -q policyengine_us_data/calibration/puf_impute.py tests/unit/calibration/test_calibration_puf_impute.py validation/puf_qrf_real_half_smoke.py
- uv run python validation/puf_qrf_real_half_smoke.py

Smoke result: raw PUF weighted combined charitable is $236.4B; old unweighted demographic-only QRF gives $15,523.1B (65.7x); fixed weighted income-conditioned QRF gives $357.0B (1.5x).
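The multipliers in the smoke result are each imputed total divided by the raw PUF weighted total. A small sketch of that arithmetic, using only the three dollar figures quoted above (the variable and function names are illustrative and do not come from the smoke script):

```python
# Totals in billions of dollars, taken from the smoke result above.
raw_puf_total = 236.4
old_unweighted_qrf_total = 15_523.1
fixed_weighted_qrf_total = 357.0


def ratio_to_raw(total):
    """Imputed total as a multiple of the raw PUF weighted total, to one decimal."""
    return round(total / raw_puf_total, 1)


old_ratio = ratio_to_raw(old_unweighted_qrf_total)
fixed_ratio = ratio_to_raw(fixed_weighted_qrf_total)
print(f"old: {old_ratio}x, fixed: {fixed_ratio}x")
```

Rounded to one decimal place, this reproduces the 65.7x and 1.5x figures reported in the smoke result.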
Notes
puf_2015.csv, demographics_2015.csv, and cps_2024.h5 because the local dataset-backed Microsimulation path currently fails on stale CPS schema/shape issues.