Child weights inflated ~46% in June 15 us_2024 build (0cdbb27) — breaks EITC/CTC, concentrated in EITC plateau #64

@PavelMakarchuk

Summary

The June 15, 2026 populace_us_2024 rebuild (release populace-us-2024-0cdbb27, vended via policyengine 4.17.3 / policyengine-us 1.729.0) over-weights children under 18 by ~46%, which inflates every child-linked tax credit. The June 11 build (populace-us-2024-5da5a95) had the child population correct, so this is a regression introduced in the latest reweighting.

This was diagnosed by comparing the materialized 2026 datasets of the two builds. Both builds contain the same 160,858 person records, so this is purely a weighting problem, not a sample or schema change.
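One way to confirm that two builds differ only in weights is to compare every non-weight column record by record. The sketch below is a minimal illustration, assuming rows are available as dictionaries in the same order in both builds; the helper name `differs_only_in_weights` and the sample rows are hypothetical, and only the `person_weight` and `age` column names come from this issue.

```python
def differs_only_in_weights(old_rows, new_rows, weight_key="person_weight"):
    """True if both builds hold the same records and only the weight may differ.

    Rows are dicts keyed by column name, assumed to be in the same order.
    """
    if len(old_rows) != len(new_rows):
        return False

    def without_weight(row):
        return {k: v for k, v in row.items() if k != weight_key}

    return all(
        without_weight(a) == without_weight(b)
        for a, b in zip(old_rows, new_rows)
    )


# Tiny synthetic rows (not real data): same people, different weights.
old = [
    {"person_id": 1, "age": 7, "person_weight": 900.0},
    {"person_id": 2, "age": 41, "person_weight": 1100.0},
]
new = [
    {"person_id": 1, "age": 7, "person_weight": 1300.0},
    {"person_id": 2, "age": 41, "person_weight": 1050.0},
]
print(differs_only_in_weights(old, new))
```

If this check passes on the real person tables, any difference in weighted aggregates has to come from the reweighting step.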

Evidence — weighted population (2026)

| Group | June 11 (5da5a95) | June 15 (0cdbb27) | Census reality |
| --- | --- | --- | --- |
| Total population | 338.3M | 365.6M | ~335M |
| Children <18 | 73.75M | 107.63M | ~73M |
| Adults 18–65 | 206.0M | 201.7M | ~205M |
| Seniors 65+ | 58.5M | 56.3M | ~58M |

The new build has ~34M phantom children (+46%), and total population is ~27M too high. Adults and seniors are slightly under-weighted, so the reweighting shifted mass into the child bucket.
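The shift can be checked directly from the table values. The sketch below decomposes the net population change by age group; the dictionary layout and the helper name `decompose_change` are illustrative and not part of the build pipeline.

```python
# Weighted population in millions, copied from the table above.
june11 = {"children_under_18": 73.75, "adults_18_65": 206.0, "seniors_65_plus": 58.5}
june15 = {"children_under_18": 107.63, "adults_18_65": 201.7, "seniors_65_plus": 56.3}


def decompose_change(old, new):
    """Return {group: (absolute change, percent change)} between two builds."""
    return {
        group: (new[group] - old[group], 100.0 * (new[group] - old[group]) / old[group])
        for group in old
    }


changes = decompose_change(june11, june15)
for group, (delta, pct) in changes.items():
    print(f"{group:18s} {delta:+7.2f}M ({pct:+.1f}%)")

# Net change across the three groups, in millions.
net = sum(delta for delta, _ in changes.values())
print(f"{'net change':18s} {net:+7.2f}M")
```

The child group's increase is larger than the net increase, which is what "mass shifted into the child bucket" means in practice: the adult and senior groups both shrink.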

The excess children are concentrated in the EITC plateau

Children by household earned-income band:

| HH earned income | June 11 | June 15 | Δ |
| --- | --- | --- | --- |
| $0–15k | 7.83M | 9.35M | +1.5M |
| $15–30k | 10.53M | 37.43M | +26.9M (+255%) |
| $30–50k | 7.95M | 5.89M | −2.1M |
| $50k+ | 47.39M | 54.94M | +7.6M |

Household earnings of $15–30k with 2–3 kids sit on the EITC maximum plateau (~$6–7k/family). Piling ~27M extra children there inflates EITC far more than a uniform child increase would.
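A banded breakdown like the one above can be produced by summing child weights per household earned-income band. This is a minimal sketch, not the code used for the table: the `(age, person_weight, hh_earned_income)` record layout, the half-open band boundaries, and the sample records are all assumptions for illustration.

```python
from bisect import bisect_right

# Band lower edges in dollars, matching the table above; the last band is open-ended.
BAND_EDGES = [0, 15_000, 30_000, 50_000]
BAND_LABELS = ["$0-15k", "$15-30k", "$30-50k", "$50k+"]


def children_by_band(persons):
    """Sum person weights of children under 18 by household earned-income band.

    `persons` is an iterable of (age, person_weight, hh_earned_income) tuples.
    Bands are treated as [lower, upper); negative income goes to the lowest band.
    """
    totals = {label: 0.0 for label in BAND_LABELS}
    for age, weight, hh_income in persons:
        if age >= 18:
            continue
        # Index of the last edge that is <= income, clamped at the first band.
        idx = max(bisect_right(BAND_EDGES, hh_income) - 1, 0)
        totals[BAND_LABELS[idx]] += weight
    return totals


# Tiny synthetic example (not real data).
sample = [
    (5, 1200.0, 22_000),   # child in a plateau-band household
    (9, 800.0, 22_000),    # sibling in the same household
    (40, 1000.0, 22_000),  # adult, ignored
    (12, 500.0, 75_000),   # child in a higher-income household
]
band_totals = children_by_band(sample)
print(band_totals)
```

Running the same function on both builds and differencing the results would give a Δ column of the kind shown in the table.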

Downstream impact (isolated repeal scores, calendar 2026)

Measured on a refundable-credit-conversion benchmark; each is an isolated provision repeal:

| Provision | June 11 build | June 15 build | vs JCT FY26 |
| --- | --- | --- | --- |
| EITC | $72.6B | $99.7B (+37%) | now +48% vs $67.2B (was +8%) |
| CTC (+ODC) | $107.9B | $124.8B (+16%) | now −3% vs $128.4B |
| CDCC | $4.3B | $5.0B (+16%) | overshoot widens |

EITC rises proportionally more than CTC precisely because the extra children landed at the EITC peak rather than spread across the income distribution.
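The percentages in the table follow from simple ratios. The sketch below recomputes them from the dollar figures reported above; the `scores` layout and the `pct_diff` helper are illustrative names, not part of the benchmark code.

```python
# Isolated repeal scores in $B, copied from the table above.
scores = {
    "EITC": {"june11": 72.6, "june15": 99.7, "jct_fy26": 67.2},
    "CTC (+ODC)": {"june11": 107.9, "june15": 124.8, "jct_fy26": 128.4},
}


def pct_diff(value, baseline):
    """Percent difference of value relative to baseline."""
    return 100.0 * (value - baseline) / baseline


for name, s in scores.items():
    build_over_build = pct_diff(s["june15"], s["june11"])
    old_vs_jct = pct_diff(s["june11"], s["jct_fy26"])
    new_vs_jct = pct_diff(s["june15"], s["jct_fy26"])
    print(
        f"{name}: build-over-build {build_over_build:+.1f}%, "
        f"vs JCT {old_vs_jct:+.1f}% -> {new_vs_jct:+.1f}%"
    )
```

Note that a score moving closer to the JCT figure is not evidence that the build improved: the CTC gap narrows here while the underlying child count gets worse.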

Likely mechanism

The signature (weight piled onto low-income families with children) is consistent with over-fitting an EITC-by-AGI-and-children claim-control target at the expense of the child-population control. The eitc_by_agi_and_children / eitc_claim_controls targets exist in the calibration target surface; satisfying the EITC-claims constraint by over-weighting $15–30k families with kids would reproduce exactly this pattern.
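The trade-off behind this hypothesis can be illustrated with a toy model. The sketch below is not the populace calibration: it has a single free weight (millions of plateau families), two conflicting targets, and invented constants chosen only so that the EITC target needs more plateau families than the child target allows. It shows how raising the loss weight on the EITC target pulls the child total away from its own target.

```python
# Toy numbers, invented for illustration only (not from the calibration).
EITC_PER_FAMILY = 6.0   # $B of EITC per million plateau families (~$6k each)
KIDS_PER_FAMILY = 2.0   # million children per million plateau families
EITC_FIXED = 40.0       # $B claimed by all other (fixed-weight) families
KIDS_FIXED = 60.0       # million children in all other families
EITC_TARGET = 112.0     # $B, deliberately set so the two targets conflict
KIDS_TARGET = 73.0      # million children


def fit_plateau_weight(eitc_loss_weight, child_loss_weight):
    """Minimize a*e_eitc**2 + b*e_child**2 over one weight w (closed form).

    Each relative error is linear in w: e = p*w + q. Setting the derivative
    to zero gives w = -(a*p1*q1 + b*p2*q2) / (a*p1**2 + b*p2**2).
    Returns (w, resulting child total in millions).
    """
    p1, q1 = EITC_PER_FAMILY / EITC_TARGET, EITC_FIXED / EITC_TARGET - 1.0
    p2, q2 = KIDS_PER_FAMILY / KIDS_TARGET, KIDS_FIXED / KIDS_TARGET - 1.0
    a, b = eitc_loss_weight, child_loss_weight
    w = -(a * p1 * q1 + b * p2 * q2) / (a * p1**2 + b * p2**2)
    return w, KIDS_FIXED + KIDS_PER_FAMILY * w


for a in (0.0, 1.0, 10.0):
    w, kids = fit_plateau_weight(a, 1.0)
    print(f"EITC loss weight {a:>4}: plateau families {w:.2f}M, children {kids:.1f}M")
```

With the EITC loss weight at zero the child target is met exactly; as that weight grows, the fitted solution moves toward the EITC target and the child total overshoots. Whether anything like this happened between 5da5a95 and 0cdbb27 can only be settled by diffing the actual target set and loss weights.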

Suggested checks

  1. Compare the calibration target set + weights between 5da5a95 and 0cdbb27 — was an EITC/child target added, reweighted, or its loss term changed?
  2. Confirm the under-18 population total (~73M) and children-by-AGI distribution are active calibration constraints with adequate weight; the June 11 build satisfied them, the June 15 build does not.
  3. Add a regression guard on aggregate demographics (total population, children <18) so a build that lands 8%/46% off these is flagged before release.
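The third check could be as small as a function that compares weighted aggregates to external benchmarks and reports anything outside a tolerance. This is a sketch under stated assumptions: the function name `check_demographics`, the 3% tolerance, and the use of the approximate Census figures from the evidence table as benchmarks are all illustrative choices, not an existing guard in the build.

```python
# Approximate benchmarks from the evidence table (persons).
BENCHMARKS = {"total_population": 335e6, "children_under_18": 73e6}


def check_demographics(totals, benchmarks=BENCHMARKS, tolerance=0.03):
    """Return a list of failure messages for aggregates outside the tolerance."""
    failures = []
    for name, expected in benchmarks.items():
        actual = totals[name]
        rel_err = (actual - expected) / expected
        if abs(rel_err) > tolerance:
            failures.append(
                f"{name}: {actual / 1e6:.1f}M vs benchmark "
                f"{expected / 1e6:.1f}M ({rel_err:+.1%})"
            )
    return failures


# Weighted totals reported in this issue for the two builds.
june11_totals = {"total_population": 338.3e6, "children_under_18": 73.75e6}
june15_totals = {"total_population": 365.6e6, "children_under_18": 107.63e6}

for label, totals in (("June 11", june11_totals), ("June 15", june15_totals)):
    problems = check_demographics(totals)
    print(label, "OK" if not problems else problems)
```

In a release pipeline the non-empty failure list would block publication; the tolerance should be set by whoever owns the calibration, since a value that is too tight would also flag legitimate year-to-year changes.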

Reproduction

import pandas as pd
import policyengine as pe

# Materialize the new build (the 4.17.3 default) for 2026.
new = pe.us.ensure_datasets(datasets=["populace_us_2024"], years=[2026], data_folder="./data_new")
# Load the person table of the first returned dataset.
pe_new = pd.read_hdf(list(new.values())[0].filepath, "person")
w, age = pe_new["person_weight"].values, pe_new["age"].values
# Weighted totals in millions: everyone, and children under 18.
print("total pop (M):", w.sum()/1e6, " children <18 (M):", (w*(age<18)).sum()/1e6)
# -> ~365.6M total, ~107.6M children

Compare against the June 11 release (populace-us-2024-5da5a95-20260611): ~338.3M total, ~73.75M children.

Recommendation

Hold consumers on the June 11 build (5da5a95, policyengine 4.16.1) until the child-weight regression is resolved.


Filed from the refundable-credit-conversion validation work. Happy to share the full materialized-dataset diff or rerun any decomposition.

🤖 Generated with Claude Code

Labels: bug
