Formula-owned SSI export pins SSI in PolicyEngine simulations #18

@MaxGhenis

Summary

The published US populace artifact exports person.ssi, even though ssi is formula-owned in PolicyEngine-US. When PolicyEngine loads the HDF5, USSingleYearDataset flattens every table column into inputs and Microsimulation sets any column that matches a PolicyEngine variable. That turns ssi from a computed output into a pinned input, so SSI reforms can fail to change final SSI.
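To illustrate the mechanism, here is a minimal toy sketch. It is not the PolicyEngine API: ToySimulation, ssi_formula, total_ssi, and all of the numbers are hypothetical stand-ins. It only shows how a loader that treats every dataset column as an input can mask a formula, so that a parameter reform has no effect.

```python
def ssi_formula(sim):
    # Toy rule (an assumption for illustration): a person receives the base
    # benefit only if their assets are at or below the asset limit.
    return [
        benefit if assets <= sim.asset_limit else 0.0
        for assets, benefit in zip(
            sim.calculate("assets"), sim.calculate("base_benefit")
        )
    ]


class ToySimulation:
    """Hypothetical stand-in for a rules engine; not the PolicyEngine API."""

    def __init__(self, dataset, asset_limit):
        self.asset_limit = asset_limit
        self.formulas = {"ssi": ssi_formula}
        # Mimic the loader behavior described above: every column becomes an
        # input, whether or not the variable has a formula.
        self.inputs = {name: list(values) for name, values in dataset.items()}

    def calculate(self, name):
        # A stored input always wins over the formula.
        if name in self.inputs:
            return self.inputs[name]
        return self.formulas[name](self)


def total_ssi(dataset, asset_limit):
    return sum(ToySimulation(dataset, asset_limit).calculate("ssi"))


source_inputs = {"assets": [1_000.0, 5_000.0], "base_benefit": [900.0, 900.0]}
# An export that also persists the computed ssi column.
exported = dict(source_inputs, ssi=[900.0, 0.0])

# Reform: raise the asset limit from 2,000 to 10,000.
pinned_cost = total_ssi(exported, 10_000.0) - total_ssi(exported, 2_000.0)
computed_cost = total_ssi(source_inputs, 10_000.0) - total_ssi(source_inputs, 2_000.0)
```

With the ssi column present, pinned_cost is zero because the formula never runs. Without it, the second person becomes eligible under the reform and computed_cost is positive.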

Why this matters

This surfaced while rerunning an SSI asset-limit analysis against populace_us_2024.h5 from policyengine/populace-us. The reform appeared to have zero cost because the published ssi column overrode the formula path.

A direct check confirmed the issue:

  • with the published artifact as-is, the SSI reform cost was 0.0
  • after clearing the loaded ssi input arrays in the simulation, the same reform cost was about $8.7B

So this is not just an unused extra column; it changes reform behavior.

Expected behavior

Populace exports should not persist non-structural PolicyEngine variables that are owned by formulas. Those variables should be omitted so the rules engine calculates them from source inputs under baseline or reform policy.

A static formula_owned_excluded list is not enough. The export path should check the installed PolicyEngine variable registry at export time and either drop any formula-owned variable that is present or fail the export, while preserving structural ID and membership columns.
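A rough sketch of the dynamic check follows, under stated assumptions. ToyVariable, STRUCTURAL_COLUMNS, and formula_owned_columns are hypothetical names, and the registry here is a plain dict. In real code the registry would come from the installed PolicyEngine package, and the test for "has a formula" would need to match however the engine actually exposes that.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVariable:
    """Hypothetical stand-in for an engine variable definition."""

    name: str
    # Assumption: an empty mapping means the variable is input-only.
    formulas: dict = field(default_factory=dict)


# Assumption: illustrative structural columns, exempt from the check.
STRUCTURAL_COLUMNS = frozenset({"person_id", "household_id", "person_household_id"})


def formula_owned_columns(columns, registry, structural=STRUCTURAL_COLUMNS):
    """Return the exported columns that the registry says are formula-owned."""
    owned = []
    for column in columns:
        if column in structural:
            continue  # IDs and memberships are always preserved
        variable = registry.get(column)
        if variable is not None and variable.formulas:
            owned.append(column)
    return sorted(owned)


registry = {
    "ssi": ToyVariable("ssi", {"2024": object()}),
    "is_adult": ToyVariable("is_adult", {"2024": object()}),
    "employment_income": ToyVariable("employment_income"),
}
columns = ["person_id", "employment_income", "ssi", "is_adult", "custom_flag"]
offenders = formula_owned_columns(columns, registry)
```

Because the list is derived from the registry at export time, a variable that gains a formula in a later engine release would be caught without editing a hand-maintained exclusion list.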

Observed affected US columns

In the published populace_us_2024.h5, the following non-structural exported columns are formula-owned under the current PE-US registry:

  • employment_income_last_year
  • has_itin
  • has_tin
  • household_size
  • in_nyc
  • is_adult
  • is_child
  • is_senior
  • spm_unit_capped_work_childcare_expenses
  • ssi
  • taxable_unemployment_compensation
  • traditional_ira_contributions
  • weeks_worked

The critical regression identified so far is ssi.

eCPS comparison

I checked local eCPS-style files as references. ssi was not present in either checked eCPS input file, so this specific bug is not caused by eCPS parity.

However, the published populace US artifact also contains many extra known PE variables relative to those eCPS files, which suggests the current export path does not enforce a closed eCPS input surface.

Related UK finding

A local UK populace candidate artifact has the same class of problem: non-structural formula-owned PE-UK variables were present in the saved H5. Examples include current_education, state_pension_reported, student_loan_repayments, and rail_subsidy_spending.

Fix direction

Add export-time guards in populace so that:

  1. formula-owned engine variables are dynamically identified from the live PolicyEngine registry and omitted/rejected before save;
  2. structural columns such as entity IDs and memberships are exempted;
  3. contracts can optionally enforce a closed allowed column surface for eCPS-style exports;
  4. regression tests cover ssi specifically.
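The four points above could be sketched roughly as follows. This is an illustration, not populace's actual export code: guard_export, ExportContractError, and the on_violation and allowed parameters are hypothetical, and the formula-owned set is passed in rather than read from a live registry.

```python
class ExportContractError(ValueError):
    """Raised when an export contains columns the contract forbids."""


def guard_export(columns, formula_owned, structural, allowed=None, on_violation="drop"):
    """Return the columns that may be saved, dropping or rejecting the rest."""
    if on_violation not in ("drop", "raise"):
        raise ValueError("on_violation must be 'drop' or 'raise'")
    kept, violations = [], []
    for column in columns:
        if column in structural:
            kept.append(column)  # point 2: structural columns are exempt
        elif column in formula_owned:
            violations.append(column)  # point 1: formula-owned variables
        elif allowed is not None and column not in allowed:
            violations.append(column)  # point 3: optional closed surface
        else:
            kept.append(column)
    if violations and on_violation == "raise":
        raise ExportContractError(
            f"columns not allowed in export: {sorted(violations)}"
        )
    return kept


# Point 4: regression tests that name ssi explicitly.
def test_ssi_is_dropped_by_default():
    kept = guard_export(
        ["person_id", "employment_income", "ssi"],
        formula_owned={"ssi"},
        structural={"person_id"},
    )
    assert kept == ["person_id", "employment_income"]


def test_ssi_is_rejected_in_strict_mode():
    try:
        guard_export(
            ["person_id", "ssi"], {"ssi"}, {"person_id"}, on_violation="raise"
        )
    except ExportContractError as error:
        assert "ssi" in str(error)
    else:
        raise AssertionError("expected ExportContractError")
```

Whether the default should be to drop or to raise is a design choice. Raising makes a regression visible at build time, while dropping keeps existing pipelines running.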
