Impute stochastic qualified-dividend shares for unsplit dividend totals #203

Follow-up to #202 and part of the broader target-surface cleanup in #200.

#202 fixes the immediate inversion by applying a documented constant fallback: when a record has only a dividend total and no observed qualified/non-qualified components, split it 78% qualified / 22% non-qualified, based on the 2015 PUF aggregate E00650/E00600 share. That is a good first-order patch, but it gives every unsplit CPS dividend row the same qualified share, so it carries none of the row-to-row variation in dividend composition.
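For reference, a minimal sketch of what the constant fallback amounts to. The function name `split_constant` and the example amounts are made up for illustration and are not taken from #202; only the 78% share comes from the description above.

```python
# Hypothetical illustration of the constant fallback described for #202.
QUALIFIED_SHARE = 0.78  # 2015 PUF aggregate E00650/E00600 share, per the text above


def split_constant(dividend_total, share=QUALIFIED_SHARE):
    """Split a dividend total into (qualified, non_qualified) at a fixed share."""
    qualified = dividend_total * share
    # Take non-qualified as the residual so the two parts add back to the total.
    non_qualified = dividend_total - qualified
    return qualified, non_qualified


# Made-up unsplit totals: every row ends up with the same qualified share.
example_totals = [100.0, 2_500.0, 40_000.0]
example_splits = [split_constant(total) for total in example_totals]
example_shares = [q / total for (q, _), total in zip(example_splits, example_totals)]
```

However large or small the total, `example_shares` is the same value on every row, which is the limitation this issue is about.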

We should replace that constant fallback with a stochastic or modeled `qualified_dividend_share` imputation learned from PUF rows with observed dividend composition.

Suggested shape:

- Train/impute `qualified_dividend_share = qualified_dividend_income / ordinary_dividend_income` from PUF donor rows where ordinary dividends are positive and the qualified/non-qualified split is observed.
- Apply the imputed share only to rows with an unsplit positive dividend total and no observed components, e.g. CPS `DIV_VAL`-only rows.
- Preserve each row's total dividend exactly: `qualified + non_qualified == ordinary_dividend_income == dividend_income` up to numerical tolerance.
- Keep observed PUF component rows unchanged.
- Make the stochastic draw reproducible via the pipeline seed/checkpoint metadata.
- Prefer conditioning on relevant predictors if available, such as dividend amount, income/AGI proxies, age, filing/tax-unit features, and asset/investment indicators.
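One possible shape for this, sketched as an unconditional weighted hot-deck draw rather than a fitted model. This is an assumption-heavy illustration, not the pipeline's implementation: the row layout (lists of dicts), the `weight` key, the helper names `donor_shares` and `impute_split`, and the way the seed is passed in are all hypothetical. It does not condition on predictors; a conditional version could draw within donor cells (for example dividend-amount bins) in the same way.

```python
import random


def donor_shares(puf_rows):
    """Collect qualified shares and weights from donors with an observed split."""
    shares, weights = [], []
    for row in puf_rows:
        ordinary = row.get("ordinary_dividend_income") or 0.0
        qualified = row.get("qualified_dividend_income")
        if ordinary > 0 and qualified is not None:
            # Clip to [0, 1] in case of reporting noise in the donor data.
            shares.append(min(max(qualified / ordinary, 0.0), 1.0))
            weights.append(row.get("weight", 1.0))
    return shares, weights


def impute_split(rows, puf_rows, seed):
    """Return copies of rows with a drawn split on unsplit positive totals only."""
    shares, weights = donor_shares(puf_rows)
    rng = random.Random(seed)  # fixed seed makes the draw reproducible
    out = []
    for row in rows:
        row = dict(row)  # do not mutate the caller's records
        unsplit = (row.get("qualified_dividend_income") is None
                   and row.get("non_qualified_dividend_income") is None)
        total = row.get("ordinary_dividend_income") or 0.0
        if unsplit and total > 0:
            share = rng.choices(shares, weights=weights, k=1)[0]
            qualified = total * share
            row["qualified_dividend_share"] = share
            row["qualified_dividend_income"] = qualified
            # Residual keeps qualified + non_qualified equal to the total.
            row["non_qualified_dividend_income"] = total - qualified
        out.append(row)
    return out


# Made-up donor and recipient rows for illustration.
puf = [
    {"ordinary_dividend_income": 1000.0, "qualified_dividend_income": 900.0, "weight": 2.0},
    {"ordinary_dividend_income": 500.0, "qualified_dividend_income": 100.0, "weight": 1.0},
    {"ordinary_dividend_income": 0.0, "qualified_dividend_income": 0.0, "weight": 1.0},
]
cps = [
    {"ordinary_dividend_income": 200.0},  # unsplit, positive: gets a draw
    {"ordinary_dividend_income": 0.0},    # no dividends: left alone
    {"ordinary_dividend_income": 300.0,   # observed split: left alone
     "qualified_dividend_income": 240.0,
     "non_qualified_dividend_income": 60.0},
]
imputed = impute_split(cps, puf, seed=12345)
```

Re-running `impute_split` with the same seed and inputs gives the same draws, and rows with an observed split or a zero total pass through unchanged.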

Validation target:

- Rebuild, or run a focused diagnostic, showing that the qualified/non-qualified split moves toward the SOI/eCPS evidence without breaking export support parity.
- Report national weighted totals and filer counts for `qualified_dividend_income`, `non_qualified_dividend_income`, and total dividends before/after.
- Confirm this does not reintroduce the old all-non-qualified CPS-spine failure.
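A focused diagnostic along these lines could be sketched as below. The helper name `weighted_summary`, the `weight` key, and the example rows are hypothetical, and "filer count" is approximated here as the weighted number of rows with a positive amount, which is an assumption about how the counts are defined.

```python
def weighted_summary(rows, variables, weight_key="weight"):
    """Weighted total and weighted count of positive rows for each variable."""
    summary = {}
    for var in variables:
        total = sum(r[weight_key] * (r.get(var) or 0.0) for r in rows)
        count = sum(r[weight_key] for r in rows if (r.get(var) or 0.0) > 0)
        summary[var] = {"weighted_total": total, "weighted_count": count}
    return summary


VARIABLES = [
    "qualified_dividend_income",
    "non_qualified_dividend_income",
    "ordinary_dividend_income",
]

# Made-up rows: "before" mimics the old all-non-qualified failure,
# "after" carries a split on the same totals.
before = [
    {"weight": 10.0, "ordinary_dividend_income": 100.0,
     "qualified_dividend_income": 0.0, "non_qualified_dividend_income": 100.0},
    {"weight": 5.0, "ordinary_dividend_income": 200.0,
     "qualified_dividend_income": 0.0, "non_qualified_dividend_income": 200.0},
]
after = [
    {"weight": 10.0, "ordinary_dividend_income": 100.0,
     "qualified_dividend_income": 78.0, "non_qualified_dividend_income": 22.0},
    {"weight": 5.0, "ordinary_dividend_income": 200.0,
     "qualified_dividend_income": 156.0, "non_qualified_dividend_income": 44.0},
]

summary_before = weighted_summary(before, VARIABLES)
summary_after = weighted_summary(after, VARIABLES)

for var in VARIABLES:
    b, a = summary_before[var], summary_after[var]
    print(f"{var}: total {b['weighted_total']:.0f} -> {a['weighted_total']:.0f}, "
          f"count {b['weighted_count']:.0f} -> {a['weighted_count']:.0f}")
```

Two checks fall out of the summary directly: the total-dividend line should be identical before and after, and a zero weighted total for `qualified_dividend_income` in the "after" run would signal the all-non-qualified failure returning.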

This should be treated as a quality improvement after #202, not a reason to block the constant-share bug fix.

