statscan profile is a config dead-drop; imputation_health target; universe-nested predictors #8

@DougManuel

From the PR #6 silent-failure review — three compounding problems with the statscan profile:

  1. config/statscan.yml is loaded into cfg$local_config, which nothing reads — every key in it (RDC paths, cycles) is silently ignored.
  2. The profile never sets data_source: master, so survey_var() falls back to pumf resolution at the RDC: PUMF variable names, PUMF bounds (age max 85 not 110), PUMF rows of the variables sheet.
  3. On a machine where the default paths exist, the statscan profile runs PUMF data while claiming to be a Master configuration, with zero complaint.
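
A minimal sketch of how the three problems above compound. The loader and `survey_var()` are not shown in this issue, so `resolve_data_source()`, the `data_dir` key, and the placeholder paths are assumptions made only to illustrate the shape of the failure:

```r
# Hypothetical parsed contents of config/statscan.yml (placeholder values).
statscan_yml <- list(data_dir = "rdc/placeholder", cycles = c("cycle_a", "cycle_b"))

# Assumed project defaults that point at PUMF data.
defaults <- list(data_dir = "data/pumf")

# The profile file is parked under `local_config` instead of being merged.
cfg <- c(defaults, list(profile = "statscan", local_config = statscan_yml))

# Stand-in for the lookup inside survey_var(): it reads the top level only
# and falls back to "pumf" when data_source is unset.
resolve_data_source <- function(cfg) {
  if (is.null(cfg$data_source)) "pumf" else cfg$data_source
}

resolve_data_source(cfg)  # "pumf", although the profile is "statscan"
cfg$data_dir              # "data/pumf"; the profile's data_dir is never seen
cfg$cycles                # NULL; the key exists only under cfg$local_config
```

Nothing in this path errors or warns, which is what makes the mismatch invisible when the default paths happen to exist.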

Fix:

  1. Set data_source: master in the profile.
  2. Merge the statscan.yml keys to the top level (or read local_config explicitly).
  3. Add a startup assertion that profile name and data_source agree.

Also from the review:

  1. Add a failing imputation_health target that consumes analysis_data$logged_events. A warning shown once under {targets} caching is effectively silent; a red target is not.
  2. Consider the universe-nested smoking predictor refinement for the imputation predictor matrix. Structural-NA variables are currently excluded as predictors entirely; nesting would recover information, e.g. quit-time predicting initiation age among former smokers only.
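
A rough sketch of what the config fix and the health check could look like. All function names here are illustrative and not taken from the repo. It assumes `cfg` is a plain named list, that the only profile-to-source rule is statscan requiring master, and that `logged_events` is either `NULL` or a data frame with one row per event:

```r
# Lift keys from cfg$local_config to the top level so consumers see them.
# Profile keys override defaults with the same name.
merge_local_config <- function(cfg) {
  local <- cfg$local_config
  if (is.null(local)) return(cfg)
  cfg$local_config <- NULL
  utils::modifyList(cfg, local)
}

# Startup assertion: stop if the profile name and data_source disagree.
assert_profile_matches_source <- function(cfg) {
  expected <- list(statscan = "master")  # assumed mapping
  want <- expected[[cfg$profile]]
  got <- cfg$data_source
  if (!is.null(want) && !identical(got, want)) {
    stop(sprintf("profile '%s' requires data_source '%s' but found '%s'",
                 cfg$profile, want, if (is.null(got)) "<unset>" else got),
         call. = FALSE)
  }
  invisible(cfg)
}

# Health check: error (so the target goes red) if any events were logged.
check_imputation_health <- function(logged_events) {
  n <- if (is.null(logged_events)) 0L else nrow(logged_events)
  if (n > 0L) {
    stop(sprintf("imputation logged %d event(s); see analysis_data$logged_events", n),
         call. = FALSE)
  }
  TRUE
}

# Usage with a profile that sets data_source as proposed:
cfg <- merge_local_config(
  list(profile = "statscan", local_config = list(data_source = "master"))
)
assert_profile_matches_source(cfg)

# Possible wiring in the pipeline (assumed, not from the repo):
# targets::tar_target(imputation_health,
#                     check_imputation_health(analysis_data$logged_events))
```

Raising an error rather than a warning is the point of the health check: a failed target stays visibly failed across cached runs, whereas a warning is only emitted when the target actually rebuilds.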
