Feat/cmip7 awiesm3 veg hr #266
Open
JanStreffing wants to merge 310 commits into prep-release from feat/cmip7-awiesm3-veg-hr
310 commits
ac18e11
Add SLURM batch script for core2 ocean test
JanStreffing ca35a25
Update XIOS XML: add u10m/v10m aliases, move evspsblpot/sbl to daily
JanStreffing da64d75
Add scale_and_integrate_pipeline, rules for hfyint, hfy, hfxint, hfx.
nwieters a317c28
Fix YAML configs: GLB casing, regex patterns, XIOS time_dimname
JanStreffing dfca6fe
Fix fx variable handling, XIOS time in secondary inputs, add shuffle …
JanStreffing 59df215
Add TCo95 core_atm test config and SLURM batch script
JanStreffing 7afa539
Added rule for 3hourly variable tauuo and tauvo in cap7_ocean.
nwieters cc90635
Update cap7_ocean todo file.
nwieters 073c355
Fix units='-' parsing, time dimension handling, add year filtering an…
JanStreffing 0d94c38
Add core_seaice test config and fix simass unit conversion
JanStreffing e8c9f28
Added msftm, msftmmpa (depth, density) variables to lrcs_ocean.
nwieters 7d4de98
Added sfx, sfy variables.
nwieters ea2b9e5
Update lrcs_ocean todo.
nwieters 3ed2fa1
Add rule for daily variable hfx and hfy.
nwieters cc900d8
Add lrcs_seaice test config, fix OOM and siflcondtop time mismatch
JanStreffing 534ff40
Add lrcs_ocean CORE2 test config, fix msftbarot name bug and PSU units
JanStreffing aae7e72
lrcs_land test passes 6/6, update all test configs to cmip7_output_006
JanStreffing 6245794
Add cap7_seaice test config (9/9 pass), fix unit conversions
JanStreffing f14bf13
Add veg_seaice test config (1/1 pass), fix sisnhc compound name
JanStreffing 94480ba
Add veg_land test config (56/56 pass), fix LPJ-GUESS yearly save and …
JanStreffing 1d6017a
Add veg_atm (26/26) and extra_land (13/13) test configs, fix bugs
JanStreffing d3434b1
Add extra_atm test config (21/21 pass), fix second_input_pattern glob…
JanStreffing c069007
Add lazy_write + streaming-write optimization, fix dask-backed time c…
JanStreffing c1d6eaf
Add cap7_atm CMIP7 config and lazy_write across 3D rules
JanStreffing 2f3fff1
Added rule for siconca (daily, monthly). It uses oifs output for this…
nwieters 8269e99
Added rule/pipeline for Snow Sublimation over sea ice. Uses oifs data…
nwieters fa28ea4
Add rule for daily siarea (north/south).
nwieters 54e9a30
Added daily siextent (north, south) rule. Added threshold option in i…
nwieters 286331b
Add rule for constant variable sidragtop.
nwieters d213b4d
Add rule for sifllattop, siflsenstop.
nwieters 00fc070
Add rule for variables daily, monthly sisnmass (North, South)
nwieters 582ba98
Add rules for daily sivol (North, South) variables.
nwieters 9a5bf81
Add rule for monthly rlds. Using oifs variable, regrid to fesom2 and …
nwieters 62eadae
Add rules for rlus, rsds, rsus. Take variable from oifs, regrid and ap…
nwieters 51b95c8
Add table about status of cmorization coverage.
nwieters 6afc320
Add cap7_ocean test config (9/9 pass), implement heat_transport pipeline
JanStreffing cbf604f
Standardize test config/script names, remove ad-hoc tests
JanStreffing db25714
Enable FESOM tracer flux diagnostics (utemp, vtemp) for heat transport
JanStreffing 008489b
Move time_dimname into inherit in veg_land HR config
JanStreffing f861eff
Add clear-sky radiation and snm rules to LR tests
JanStreffing 91f127d
Add opottemprmadvect rule to lrcs_ocean LR test
JanStreffing aeb2f7f
Add hfx.day/hfy.day rules to lrcs_ocean LR test
JanStreffing 3620076
Add cap7_aerosol/cap7_land tests, fix LPJ-GUESS unit parsing
JanStreffing 44a3fb3
Fix thdgrsn -> thdgrsnw, document FESOM io_meandata patches
JanStreffing 970a899
Fix FESOM sign conventions for evspsbl and mlotst
JanStreffing 62f063a
Drop od550aer: MACv2-SP is anthropogenic-only, not total AOD
JanStreffing f8c37de
Point LR tests at cmip7_output_013; update HR comments for single-str…
JanStreffing dc6f0de
Add tripyview to [fesom] extras for MOC/MHT diagnostics
JanStreffing 9e1d9e2
Convert XIOS field_def_cmip7.xml to Jinja2 template; align spinup fil…
JanStreffing 974e653
Update variable coverage table.
nwieters ecc1efa
Correct status of atmos variables in variable coverage table.
nwieters 434f58c
Added rules for some GHG variables.
nwieters db580bb
Add rule for CH4 GHG.
nwieters 77f7f65
Update CAP7_aerosol todo, and coverage.md
nwieters 3d62f3e
Update variable_coverage.md
nwieters 81bc62a
Update variable_coverage.md
nwieters 34a21a2
Update variable_coverage.md and aerosol todo.
nwieters 6f7be8b
Drop deaccumulate_ifs: flux inflation was a XIOS freq_op misconfig, f…
JanStreffing d87c373
cap7_atm: reconstruct hur on model levels from ta+hus+pfull (OpenIFS …
JanStreffing efc8b86
Point LR tests at cmip7_output_017 / year 1909
JanStreffing 19b052d
Move basin/msftmz/hfbasin/sltbasin rules into core_ocean and lrcs_oce…
JanStreffing 87aec33
Fix log-file race; repoint test patterns to year 1909; add repoint_te…
JanStreffing 61be313
Update todos and coverage file.
nwieters c342f5d
Fix pint scaling-factor units (sos/so), non-monotonic time resample, …
JanStreffing e567bcc
compute_msftmz: align w vertical dim with tripyview diag file convention
JanStreffing 441b66d
CMIP7 QC compliance: emit branded globals, DRS filename, clean encodings
JanStreffing 0b86d3f
doc: design note for pipeline-integrated CMIP7 QC workflow
JanStreffing 57dcbca
CMIP7 QC: emit lat/lon bounds, cell_measures, midpoint timestamp, DRS…
JanStreffing ec78f24
xios cmip7 XMLs: add ta/hus on p19, switch to reduced Gaussian, aggre…
JanStreffing b9c94f3
Attach lat/lon bounds from mesh for unstructured output
JanStreffing d9f7dec
pycmor output: CF-compliant bounds encoding and time:units_metadata
JanStreffing 4db542e
pycmor CMIP7 global attributes: emit CV-valid values
JanStreffing a430fde
pycmor output: emit external_variables for cell_measures (CF 7.2)
JanStreffing d5e3ed5
pycmor std_lib: reusable cell_measures steps + fx FrozenPipelines
JanStreffing 9b191ee
std_lib cell_measures: AreacellaFxPipeline reads model output, not mesh
JanStreffing 266e472
configs: migrate areacello/areacella pipelines to std_lib FrozenPipel…
JanStreffing b505886
HR/LR yaml sync + repoint for test runs
JanStreffing 0a1f79b
HR core_atm sbatch wrapper + CAP7 data volume estimate
JanStreffing 9bff49e
std_lib + configs: CMIP7 fx/flag/unstructured-grid fixes
JanStreffing 0700690
timeaverage: default flox engine to numpy, override knob
JanStreffing 975ea11
std_lib: pluggable netCDF codec + write scheduler knobs
JanStreffing 1fba182
pycmor output: three HR-atmos fixes (dask _FillValue, XIOS bounds, co…
JanStreffing 00478b0
bounds recovery: re-read XIOS bounds from input files at save time
JanStreffing 29dc348
bounds variables: drop the spurious ``units`` attribute
JanStreffing 8443aeb
bit-level quantization: default BitGroom-5, skip bounds / coord vars
JanStreffing 6959c2e
add wip verification
JanStreffing a61be16
add full hr test runner script
JanStreffing 0c31f9b
update HR yamls with performance optims and removal of double pipelin…
JanStreffing 09c48bc
env checker to ensure we run on env with hdf5 and nc from hpc
JanStreffing ae3a92f
add step rechunk_time
JanStreffing a6b229e
update all example scripts for lr and hr benchmarking
JanStreffing 743aee7
script:// step paths: support env-var expansion, migrate yamls to $PY…
JanStreffing 1927d89
feat(land): add 75 LPJ-GUESS CAP7/VEG/EXTRA rules and depth/pool loaders
39afd78
ocean: rewire uos / vos onto unod_sfc / vnod_sfc, drop 3hr tauuo / ta…
JanStreffing fb695bd
oifs: sync xios xml from esm_tools (file_def, context_ifs, grid_def)
JanStreffing 57e4add
oifs: drop redundant + unrequested OIFS streams; retarget pycmor rule…
JanStreffing 6475250
extra_land: drop old composite c3PftFrac rule + lpjg_monthly_sum_pipe…
JanStreffing 47b1154
extra_land: drop 5 more old PFT-fraction rules colliding with the LPJ…
JanStreffing 65945ef
inherit: switch netcdf_write_scheduler from threads to synchronous on…
JanStreffing a0866d4
ocean: implement compute_msftm_density / compute_msftmmpa_{depth,dens…
JanStreffing afedbab
doc: add sanity-check ranges for 53 newly-active HR rules
JanStreffing d1ce806
cap7_aerosol: cmor 1 model year of GHG, not 273 forcing years
JanStreffing 59cfa1e
xios: drop duplicate _<suffix>_ infix from OIFS output filenames
JanStreffing 0b3501d
xios dedup followups: 11 second_input_pattern + 3 extra_land per-var …
JanStreffing 08d09c2
extra_land: fix singular pipeline: -> pipelines: on 6 LPJ PFT rules
JanStreffing f1d4c9d
tooling + cap7_atm fixes
JanStreffing cd5f5d2
fix: pipeline-key typos, prefix typos, generic-pattern updates across…
JanStreffing a103070
fix: 2 quoted-pattern dups + lrcs_land static per-var
JanStreffing 0551d1b
fix: FESOM *_file: regex-form -> glob-form + repoint glob support
JanStreffing 48fa47f
lrcs_seaice: replace regrid_regular_to_fesom with regrid_oifs_to_fesom
JanStreffing 2508561
naming: drop cap7/cmip7 from XIOS file_ids; merge collisions
JanStreffing ff183a1
examples LR test yamls: repoint to LR_run_test3
JanStreffing f3b45d5
tools: add sanity-check walker + report generators
JanStreffing 8e99422
issues_y1587_sanity.md: refresh with all 695/695 files
JanStreffing df15285
fix unit handling and compute bugs in fesom-derived rules
JanStreffing f98e394
sanity_check_ranges: widen bounds with literature support
JanStreffing 74a1616
disable fesom diagnostics that are dead under linfs ALE
JanStreffing a37195d
zg: do divide-by-g in xios, drop rule-side scale_factor
JanStreffing f61f01b
slab-loop save path + memory-pressure bench artifacts
JanStreffing 8aa8eb3
HANDOFF: update with pyconcat retry numbers + revised throughput math
JanStreffing a65cd13
HANDOFF: append mode is the recommended default
JanStreffing ebecb51
slab loop: fix multi-time-axis append, add chunk-count guard, input f…
JanStreffing f8bc0ae
slab loop: detect aux time dims by name/CF attrs, not size match
JanStreffing cd8341f
slab loop: skip when aux time dim has size != primary time
JanStreffing 3174c79
slab loop: tighten multi-time-axis guard to also skip parallel-axis case
JanStreffing 5d90ccd
slab loop: rebalance slab boundaries + override time-axis chunksize
JanStreffing 1c7f2d7
Revert slab-loop save path: append-mode silent truncation
JanStreffing 7338ba7
HANDOFF: investigation closed with negative result + cgroup-v2 watchdog
JanStreffing 0e76600
recipe: fix rtmt_mon formula and snd_mon time-coord alignment
JanStreffing 2cc43f6
README: align compute_rtmt docstring with the corrected formula
JanStreffing 1cb45e4
Round 1 closed (negative): h5netcdf and inline_array both regress
JanStreffing 3169ce5
Round 2 closed (negative): Prefect task collapse no measurable win
JanStreffing 39a0c79
Round 3 audit: all four candidates dead or unverifiable
JanStreffing a6ff822
Round 4: contention sweep on mini-cap7 finds 3x4x48 as new default
JanStreffing d280114
fix _HLGExprSequence pickle bug in save under prefect-dask
JanStreffing 8d5340d
Add save_dataset heartbeat + design proposal for subflow deadlock fix
JanStreffing a41103a
HR submit: new default 4x4x16 + collapse (27% wall reduction)
JanStreffing e8aa226
Production tuning: thread/mem-limit pass-through + parallel cleanup +…
JanStreffing 1d2e1b1
fix cap7_atm rule inputs: add 2d to file_def, year-lock secondary pat…
JanStreffing fcd43cc
Eliminate parent×subflow deadlock; fix gate-A infra flakes
JanStreffing 6773ea5
Throttle parent submission to W*TPW via as_completed rolling window
JanStreffing 90f382f
Throttle parent submission also in _parallel_process_prefect
JanStreffing a30a963
DESIGN_PROPOSAL §10.6: gate-A v2 final + new failure mode + throttle
JanStreffing 8046000
Add CLI overrides for run-root, year range, mesh, output, slurm memory
JanStreffing 55bb37e
Fix CLI-override regressions vs repoint_hr_year.py (R1, R2)
JanStreffing 961c492
Migrate *_file: literal globs to *_path:/*_pattern: form
JanStreffing 42e8a6a
Recipe fixes post-CLI migration: F1+F3+F4+F5+F6 (cli5 545/11)
JanStreffing 6ef5eca
sifllattop / siflsenstop: switch to atmos_1h_sfc_hfls/hfss
JanStreffing 53eb778
mask_where_no_seaice: fix duplicate-time + bounded-memory broadcast
JanStreffing 7091733
sanity_check: add HTML report generator + Test_06 results
JanStreffing 5ba2f33
sanity_check: add per-variable map plots to HTML report
JanStreffing b885674
sanity_check/build_maps: fix dim detection + file picker
JanStreffing 46058d3
sanity_check/build_html_report: long names + description, fix PICONTR…
JanStreffing 9bdd870
sanity_check/build_maps: 3-panel maps + level-name fix
JanStreffing 54c7b28
sanity_check: split PICONTROL_NONZERO from PHYS_NEG_VALUES
JanStreffing 932d9a0
sanity_check: time-series plots for hemispheric/global scalars
JanStreffing cb0de4c
sanity_check html: widen layout so 3-panel maps render large
JanStreffing 799d9ab
fix snm sign / vsfcorr fill / siarea unit + handoff doc
JanStreffing 7aea762
build_maps: walk through level dim to find a non-NaN slice
JanStreffing 5170e3c
sanity_check: order-of-magnitude FAIL + Baltic-aware salinity bounds
JanStreffing 0644699
build_maps scatter: wrap lon to -180..180 before plotting
JanStreffing 803a150
sanity_check: PICONTROL classifier narrowed to anthropogenic-flux only
JanStreffing d72c0b2
Apply rho_0-family scale to depth-integrated ocean rules
JanStreffing 8bbeb68
sanity_check maps: regen with scatter + lon-wrap (333/335)
JanStreffing 7e5e3c4
sanity_check + dsn: EXTREME_OUTLIER tier; fix dsn double-conversion bug
JanStreffing 9fec294
fix sea-ice conductive/turbulent flux sign convention
JanStreffing c36726a
sanity_check: scale-too-small check (observed range << expected envel…
JanStreffing 1bcce1d
sanity_check: tighten scale-too-small threshold to 10x
JanStreffing 663401c
sanity_check html: rename Ice / move landIce -> Land page
JanStreffing a766437
reports: re-walk Test_06_cli_y1587_v7 after cli9 rule fixes
JanStreffing 1db0265
handoff: test06_cli9 results + 5 follow-up items
JanStreffing e0e93ee
sanity_check: per-file diagnosis + hemisphere-integral unit-conversion
JanStreffing 8005217
sanity_check index: report both per-variable and per-file counts
JanStreffing a53979f
sanity_check: per-file cards (one card per .nc file)
JanStreffing 2a789b7
sanity_check: per-file maps + compute-node SLURM script
JanStreffing 5e255a3
sanity_check html: re-render with per-file maps from compute node
JanStreffing ca6068a
sanity_check: widen areacella for HR, fix ua mean, skip checks for wi…
JanStreffing 08ad0ad
reports: refresh 17 stale Test_06 entries + reclassify areacella/ua
JanStreffing c07a81a
sanity_check ranges: widen fFire/fFireAll/fFireNat bounds
JanStreffing d90fea5
sanity_check: cadence-aware bounds (var_<cadence> fallback to plain var)
JanStreffing 563248e
sanity_check ranges: widen HR DGVM carbon-pool bounds with literature…
JanStreffing ad63293
sanity_check ranges: widen difvho/difvso bounds to include convective Kv
JanStreffing 244a6e1
sanity_check ranges: widen siflcondtop and siflfwbot for HR thin-ice …
JanStreffing 8507e69
hfbasin: replace per-element-area approximation with tripyview edge-c…
JanStreffing 9a6aec5
save_dataset: tmpfs-staged atomic write (Option A + A.5)
JanStreffing 87bc1fe
save_dataset: watchdog timeout + retry (Option E)
JanStreffing 7d49257
sisnmass NH/SH: apply rho_snow scaling before unit conversion
JanStreffing 9bf31f1
integrate_over_hemisphere: replace fancy isel with mask-and-multiply
JanStreffing 9ab2cd2
save_dataset watchdog: clarify the log message is informational
JanStreffing 17a4cf6
save_dataset: watchdog no longer raises -- diagnostic only
JanStreffing dbf3a08
sltbasin: port to tripyview edge-crossing integration
JanStreffing 6a5c67c
sanity_check ranges: hfx/hfy bounds reflect per-cell not basin-integr…
JanStreffing d6233d0
lrcs_seaice failure fixes: pipeline throttle_group + secondary-mf lru…
JanStreffing 3604c53
save_dataset: move compute off driver onto cluster workers (Fix #3)
JanStreffing 341ddee
shard isolation: SLURM-level per-shard yamls, 1 array per tier
JanStreffing 5973a3f
shard runner: 512G cgroup, fix3-off default, graph-size instrumentation
JanStreffing 82aea52
runner+submitter: per-tier --mem override, default back to --mem=0
JanStreffing e452940
submitter: per-tier Fix #3 override via SHARD_FIX3 env
JanStreffing fb639fa
_safe_to_netcdf: retry transient compute errors
JanStreffing cfb9320
_safe_to_netcdf: gc.collect + malloc_trim between saves to fight glib…
JanStreffing c855236
submitter: lrcs_seaice gets Fix #3 ON; LRCS_SEAICE_MEM env knob
JanStreffing cd99b57
Revert "submitter: lrcs_seaice gets Fix #3 ON; LRCS_SEAICE_MEM env knob"
JanStreffing a426963
Revert "_safe_to_netcdf: gc.collect + malloc_trim between saves to fi…
JanStreffing 7e2f49f
Revert "_safe_to_netcdf: retry transient compute errors"
JanStreffing 76b1d47
_process_rule: @task(retries=3, retry_delay_seconds=30)
JanStreffing ea564dd
_safe_to_netcdf: gc.collect + malloc_trim between saves
JanStreffing 0c4aff1
submitter: lrcs_seaice gets 6h walltime; others stay 3h
JanStreffing 821f0eb
Revert "_process_rule: @task(retries=3, retry_delay_seconds=30)"
JanStreffing 8815440
_process_rule: manual whole-rule retry on transient dask errors
JanStreffing 2d705fa
submitter: extra_atm runs with N_WORKERS=3 (was 4)
JanStreffing 5c955bf
lrcs_seaice runs under jemalloc to bound malloc fragmentation
JanStreffing 9dfcf24
submitter: jemalloc on for all tiers (was lrcs_seaice only)
JanStreffing cc19fc6
submitter: cap7_atm switches to SHARD_FIX3=off (regression A/B)
JanStreffing 338ec44
submitter: extra_atm flips to fix3=off; drop tier_workers override
JanStreffing 7900daa
load_lpjguess_*: vectorize df.iterrows() to fix dask heartbeat deadlock
JanStreffing fba441c
submitter: extra_atm joins lrcs_seaice on --mem=512G
JanStreffing 9f0e520
filecache: vectorize timestamp parsing in select_range/validate_range…
JanStreffing 9ba14cb
submitter: drop lrcs_seaice 6h walltime override, use global 3h
JanStreffing d7d375e
submitter/runner: SHARD_DRS=on emits full CMIP DRS output tree
JanStreffing fc0d215
sanity_check: fix per-file bounds display + recurse into nested cmori…
JanStreffing 900c575
sanity_check: report fixes + cli37 report; transport: kg/s edge width
JanStreffing 5667ab2
seaice: mask hxy-si rules, fix sisaltmass and sidmasstran[xy], attach…
JanStreffing d4ce88d
sanity_check: derive realm from file's :realm attr, split cards per r…
JanStreffing 775a1b0
cap7_land: disable 8 LPJ-GUESS rules that duplicated core compounds
JanStreffing e7306e2
ocean: fix hfds sign, msftbarot equator band, wmo vertical area, hfba…
JanStreffing e332044
sanity_check_ranges: tune difmxylo, sfdsi bounds (cli37 Christian rev…
JanStreffing b706941
cap7_land: add vegHeightGrass_mon rule (hxy-grs)
f87e49a
sanity_check_ranges: hfds bounds for post-sign-flip recipe
JanStreffing 80fe7db
sanity_check_ranges: widen hfbasin to ±10 PW for HR monthly extremes
JanStreffing 8b19d2c
sanity_check_ranges: relax 5 land bounds per Laszlo/Christian round 2
JanStreffing 51789a6
veg/cap7 land: clip-noise + clip-floor steps + treeFrac total yearly …
JanStreffing 2fd7241
veg_land sanity-check round 1: plan + D4 handoff + reviews
JanStreffing c88c0e2
wo: interface→midpoint averaging fixes "clean top, noisy below" pattern
JanStreffing 470641b
cap7_land: add yearly rules for per-PFT treeFrac{BdlDcd,BdlEvg,NdlDcd…
465afa4
sanity_check_ranges: loosen cli37 piControl bounds for fAnthDisturb, …
1dbb5e7
xios cmip7 RH: wire send_cmip7_rh.F90 outputs into field_def/file_def
JanStreffing 7557ac4
lrcs_seaice: remove redundant mask_where_no_seaice from 13 FESOM-nati…
JanStreffing 35735fc
CMIP7 DReq: bump v1.2.2.2 -> v1.2.2.3
JanStreffing d044b36
veg_land: drop 4 PFT-yearly treeFrac rules — not in CMIP7 data request
JanStreffing 07390ec
submit_hr_year_shards: extra_land shard size 20 → 5
JanStreffing 5a61ad3
lrcs_seaice, veg_land: force serial rule submission via throttle_group
JanStreffing 3299a99
throttle_caps: yaml path dead, use PYCMOR_THROTTLE_CAPS env var instead
JanStreffing 9d4579c
cmorizer: rule.throttle_group as fallback for unpipelined rules
JanStreffing 4fdade1
fesom-ingest tiers: tighten input patterns to native-only (\d{4})
JanStreffing 3d47d27
launcher: optional gr-grid variant via WITH_GR=yes
JanStreffing 1a3b3e6
core_seaice, cap7_seaice, veg_seaice: serial throttle
JanStreffing daef954
submit_hr_year_shards: fix WITH_GR fesom detection + gr throttle
JanStreffing 517aeae
generate_gr_yaml: drop rules using FESOM-mesh-dependent pipelines
JanStreffing 5779fcb
generate_gr_yaml: expand mesh-pipeline filter + drop rules with no gr…
JanStreffing 5ea55d8
generate_gr_yaml: fix step substrings, add pipeline-name filter
JanStreffing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,242 @@ | ||
| # Design proposal: drop literal-glob `*_file:` in favor of `*_pattern:` form | ||
|
|
||
| ## Context | ||
|
|
||
| R1 of [PLAN_cli_override_regressions.md](PLAN_cli_override_regressions.md) | ||
| added an `*` → `<year>` expansion inside `apply_overrides` so existing yaml | ||
| entries like | ||
|
|
||
| ```yaml | ||
| aice_file: /work/.../outdata/fesom/a_ice.fesom.*.nc | ||
| ``` | ||
|
|
||
| still resolve to a real file after we removed `repoint_hr_year.py`. The | ||
| expansion is a workaround, not a fix: it recreates repoint's regex rewrite | ||
| inside the CLI layer instead of removing the underlying yaml smell. | ||
|
|
||
| This proposal: migrate the ~10 affected entries to the regex-pattern form | ||
| that secondary inputs already use, and remove R1. | ||
|
|
||
| ### Forcing function: 1700-year cmorization in multi-year chunks | ||
|
|
||
| The upcoming workload is to cmorize **1700 simulation years**, processed in | ||
| chunks of multiple years per pycmor run (not one-year-at-a-time). With | ||
| FESOM's typical one-file-per-year naming (`<var>.fesom.<year>.nc`, | ||
| verified at e.g. | ||
| `/work/ab0246/a270092/runtime/fesom-2.7/ice_strength/run_19600101-19601231/` | ||
| and the existing y1587 archive), each chunk needs `open_mfdataset` over | ||
| N files. | ||
|
|
||
| **R1's literal-glob form structurally cannot represent this.** It | ||
| requires `--year-start == --year-end` and raises `OverrideError` | ||
| otherwise. The 1700-year run hits that error on the first chunk that | ||
| spans more than one year — which is most of them. The migration is a | ||
| hard prerequisite for the upcoming workload, not a stylistic cleanup. | ||
|
|
||
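To make the "N files per chunk" point concrete, here is a minimal sketch that lists the files one chunk would need under the one-file-per-year naming described above. The helper name `expected_fesom_files` and the directory `/path/to/outdata/fesom` are hypothetical and used only for illustration; they are not part of pycmor.

```python
from pathlib import PurePosixPath


def expected_fesom_files(outdata_dir, var, year_start, year_end):
    """List the per-year files a chunk would need (hypothetical helper).

    Assumes the `<var>.fesom.<year>.nc` naming convention; it does not
    check that the files exist on disk.
    """
    if year_end < year_start:
        raise ValueError("year_end must be >= year_start")
    base = PurePosixPath(outdata_dir)
    return [
        str(base / f"{var}.fesom.{year}.nc")
        for year in range(year_start, year_end + 1)
    ]


# A four-year chunk needs four files, which a single literal path cannot name.
files = expected_fesom_files("/path/to/outdata/fesom", "a_ice", 1900, 1903)
for path in files:
    print(path)
```

Any chunk longer than one year yields more than one path, which is the case the literal-glob form cannot express.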
| --- | ||
|
|
||
| ## Does globbing affect years? Yes — it locks single-year only | ||
|
|
||
| The R1 expansion takes `--year-start` and substitutes it for the literal | ||
| `*` in `*_file:` values. That's the **only** year handling these rules | ||
| get. Three properties fall out of that: | ||
|
|
||
| 1. **Year is injected, not detected.** The expansion blindly writes the | ||
| CLI year into the filename. There's no check that the resulting file | ||
| actually exists; runtime is the first place a typo or missing file | ||
| surfaces. | ||
|
|
||
| 2. **Multi-year is structurally impossible.** `xr.open_dataset(literal_path)` | ||
| takes one file. R1 raises `OverrideError` for `--year-start != | ||
| --year-end` because there's no way to expand a single literal `*` to | ||
| multiple files inside a single string. Users hitting a multi-year | ||
| range get a migration message — but that migration is exactly what | ||
| this proposal does. | ||
|
|
||
| 3. **`skip_input_year_filter` does nothing for `*_file:` consumers.** R2 | ||
| gates `_filter_files_by_year_range` at both call sites, but `*_file:` | ||
| resolution doesn't go through either path — it's a literal `open_dataset` | ||
| call. Centennial-forcing rules can't use a `*_file:` form. | ||
|
|
||
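The behavior described in points 1 and 2 can be sketched as follows. This is an illustration of the described semantics only, not the code in `overrides.py`: the function name `expand_year_in_file_value` and the exception class `OverrideErrorSketch` are hypothetical stand-ins.

```python
class OverrideErrorSketch(ValueError):
    """Hypothetical stand-in for the real OverrideError."""


def expand_year_in_file_value(value, year_start, year_end):
    """Substitute the CLI year for a literal `*` in a `*_file:` value.

    Mirrors the two properties described above: the year is injected
    without checking that the resulting file exists, and a multi-year
    range cannot be represented in a single path string.
    """
    if "*" not in value:
        return value  # static paths (e.g. mesh files) pass through unchanged
    if year_start != year_end:
        raise OverrideErrorSketch(
            f"cannot expand '*' in {value!r} for range {year_start}-{year_end}"
        )
    return value.replace("*", str(year_start), 1)


print(expand_year_in_file_value("/data/fesom/a_ice.fesom.*.nc", 1587, 1587))
```

Note that nothing in this path touches the filesystem, so a typo in the directory or a missing year only surfaces when the file is opened.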
| Compare with the pattern form: | ||
|
|
||
| ```yaml | ||
| aice_path: /work/.../outdata/fesom | ||
| aice_pattern: a_ice\.fesom\..*\.nc | ||
| aice_variable: a_ice | ||
| ``` | ||
|
|
||
| - File list comes from the directory + regex. | ||
| - `filter_files_by_year_range` narrows by `year_start`/`year_end` — | ||
| **range, not equality**. | ||
| - `open_mfdataset` handles 1+ files transparently. | ||
| - `skip_input_year_filter: true` opts out cleanly. | ||
|
|
||
| So globbing in `*_file:` form is a year-mangler in disguise. Removing it | ||
| removes a hidden coupling between yaml syntax and CLI semantics. | ||
|
|
||
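For contrast, a minimal sketch of the pattern-form selection: regex match on file names, then a range filter on the year. The function `select_files` is hypothetical, and two details are assumptions rather than facts about pycmor: that the pattern is applied as a full match on the base name, and that the year is the four-digit field directly before `.nc`.

```python
import re

# Assumption: the year is the 4-digit field directly before the .nc suffix.
_YEAR_RE = re.compile(r"\.(\d{4})\.nc$")


def select_files(filenames, pattern, year_start, year_end, skip_year_filter=False):
    """Pick files matching `pattern`, optionally narrowed to a year range."""
    regex = re.compile(pattern)
    matched = sorted(name for name in filenames if regex.fullmatch(name))
    if skip_year_filter:
        return matched  # e.g. centennial forcing: keep every matching file
    selected = []
    for name in matched:
        found = _YEAR_RE.search(name)
        if found and year_start <= int(found.group(1)) <= year_end:
            selected.append(name)
    return selected


listing = [
    "a_ice.fesom.1900.nc",
    "a_ice.fesom.1901.nc",
    "a_ice.fesom.1902.nc",
    "m_ice.fesom.1901.nc",
]
print(select_files(listing, r"a_ice\.fesom\..*\.nc", 1901, 1902))
```

The filter is an inclusive range, so a single-year run is just the special case `year_start == year_end`.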
| --- | ||
|
|
||
| ## Affected entries (audit) | ||
|
|
||
| ``` | ||
| $ grep -rn 'fesom\.\*\.nc' awi-esm3-veg-hr-variables/ | grep '_file:' | ||
| ``` | ||
|
|
||
| | Tier | Rule | Key(s) | | ||
| |---|---|---| | ||
| | `lrcs_seaice` | sispeed | `aice_file`, `vice_file`, V-component file | | ||
| | `lrcs_seaice` | sidmasstranx, sidmasstrany | `aice_file`, `vice_file` | | ||
| | `lrcs_seaice` | sistressave, sistressmax | `aice_file` (or similar) | | ||
| | `lrcs_seaice` | siflcondtop, sifb, sihc | (single fesom file each) | | ||
| | `lrcs_seaice` | simpeffconc | (single fesom file) | | ||
| | `lrcs_seaice` | sispeed_day | per-day equivalent | | ||
| | `core_ocean` | zostoga | (fesom 3D file) | | ||
|
|
||
| Exact key names per rule need a finer audit before migration. Static-mesh | ||
| keys (`grid_file`, `basin_mask_file`) and any `*_file:` value without a | ||
| literal `*` are unaffected. | ||
|
|
||
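The finer per-rule audit could be scripted along these lines. This is a sketch under stated assumptions: it scans YAML text line by line instead of parsing it, the helper name `find_glob_file_keys` is hypothetical, and the sample rule below is invented for illustration.

```python
import re

# Matches `<something>_file: <value containing a literal *>`, with an
# optional leading "- " for list items. Values without `*` are ignored.
_FILE_GLOB_LINE = re.compile(
    r"^\s*(?:-\s*)?(?P<key>\w+_file):\s*(?P<value>\S*\*\S*)\s*$"
)


def find_glob_file_keys(yaml_text):
    """Return (line number, key, value) for every literal-glob `*_file:` entry."""
    hits = []
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        found = _FILE_GLOB_LINE.match(line)
        if found:
            hits.append((lineno, found.group("key"), found.group("value")))
    return hits


sample = """\
rules:
  - name: sispeed
    aice_file: /data/fesom/a_ice.fesom.*.nc
    grid_file: /data/mesh/fesom.mesh.diag.nc
"""
hits = find_glob_file_keys(sample)
for lineno, key, value in hits:
    print(f"line {lineno}: {key} -> {value}")
```

Static keys such as `grid_file` are skipped automatically because their values contain no `*`.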
| --- | ||
|
|
||
| ## Migration mechanics | ||
|
|
||
| ### Yaml side | ||
|
|
||
| For each entry: | ||
|
|
||
| ```yaml | ||
| # before | ||
| aice_file: /work/.../outdata/fesom/a_ice.fesom.*.nc | ||
| aice_variable: a_ice | ||
| ``` | ||
|
|
||
| ```yaml | ||
| # after | ||
| aice_path: /work/.../outdata/fesom | ||
| aice_pattern: a_ice\.fesom\..*\.nc | ||
| aice_variable: a_ice | ||
| ``` | ||
|
|
||
| The key triplet matches `_load_secondary_mf`'s convention | ||
| ([custom_steps.py:2153](examples/custom_steps.py#L2153)). | ||
|
|
||
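The before/after rewrite is mechanical, so it can be sketched as a small conversion helper. The function `file_glob_to_path_and_pattern` is hypothetical and the example path is a placeholder; the sketch assumes the `*` appears only in the base name, as in the entries shown above.

```python
import posixpath
import re


def file_glob_to_path_and_pattern(file_value):
    """Split a literal-glob `*_file:` value into (`*_path`, `*_pattern`).

    The directory becomes the path; the base name becomes a regex in which
    every literal character is escaped and each `*` turns into `.*`.
    """
    directory, basename = posixpath.split(file_value)
    if "*" in directory:
        raise ValueError("glob in the directory part is not handled by this sketch")
    pattern = ".*".join(re.escape(part) for part in basename.split("*"))
    return directory, pattern


path, pattern = file_glob_to_path_and_pattern("/data/outdata/fesom/a_ice.fesom.*.nc")
print(path)     # directory for the *_path key
print(pattern)  # regex for the *_pattern key
```

Escaping the dots matters: an unescaped `.` in the regex would also match names such as `a_iceXfesom.1900.nc`.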
| ### Step function side | ||
|
|
||
| For each custom step that reads a `*_file:` attribute: | ||
|
|
||
| ```python | ||
| # before | ||
| ds = xr.open_dataset(rule.aice_file, use_cftime=True) | ||
| aice = ds[rule.get("aice_variable", "a_ice")] | ||
| ``` | ||
|
|
||
| ```python | ||
| # after | ||
| aice = _load_secondary_mf(rule, "aice_path", "aice_pattern", "aice_variable") | ||
| ``` | ||
|
|
||
| `_load_secondary_mf` already: | ||
| - regex-matches files in the directory; | ||
| - year-filters via `filter_files_by_year_range` (with the | ||
| `skip_input_year_filter` opt-out from R2); | ||
| - opens via `open_mfdataset` (handles 1+ files); | ||
| - renames `time_counter` → `time` if requested; | ||
| - drops residual XIOS time bounds vars; | ||
| - selects the variable by name or auto-picks. | ||
|
|
||
| Most call sites that read `*_file:` do those steps manually anyway — | ||
| this consolidates them. | ||
|
|
||
| ### CLI override side | ||
|
|
||
| Remove R1 entirely: | ||
|
|
||
| - delete `_FESOM_FILE_RE` and `_expand_year_in_file_keys` from | ||
| [overrides.py](src/pycmor/core/overrides.py); | ||
| - delete the `if ov.year_start == ov.year_end` / `else` block in | ||
| `apply_overrides`; | ||
| - delete the R1-specific tests in | ||
| [test_overrides.py](tests/unit/test_overrides.py). | ||
|
|
||
| R2's `skip_input_year_filter` plumbing stays — it serves the centennial- | ||
| forcing rules independent of this migration. | ||
|
|
||
| --- | ||
|
|
||
| ## Scope of changes | ||
|
|
||
| | Component | Change | | ||
| |---|---| | ||
| | Yamls in `awi-esm3-veg-hr-variables/` | ~10 rule entries across 2 tiers (lrcs_seaice + core_ocean) | | ||
| | `examples/custom_steps.py` | ~10 custom step functions edited to call `_load_secondary_mf` | | ||
| | `src/pycmor/core/overrides.py` | net deletion — `_expand_year_in_file_keys`, `_FESOM_FILE_RE`, multi-year `OverrideError`, the entire R1 block in `apply_overrides` | | ||
| | `tests/unit/test_overrides.py` | drop R1 tests; R2 tests stay | | ||
| | `PLAN_cli_override_regressions.md` | mark R1 superseded | | ||
|
|
||
| --- | ||
|
|
||
| ## Trade-offs vs the workaround | ||
|
|
||
| | | R1 workaround | Proposed migration | | ||
| |---|---|---| | ||
| | Multi-year support | **impossible (raises OverrideError)** — blocks the 1700-year chunked run | works via `open_mfdataset` | | ||
| | Year filter is range-aware | no (single year only) | yes | | ||
| | Centennial-forcing opt-out | not applicable | works via `skip_input_year_filter` | | ||
| | Year/path coupling lives in | apply_overrides regex | rule yaml + helper | | ||
| | Net code in CLI override layer | grew by ~40 lines | shrinks by ~40 lines | | ||
|
|
||
| --- | ||
|
|
||
| ## Recommendation | ||
|
|
||
| Do the migration. R1 was the right call as a hot-fix to unblock the y1587 | ||
| single-year run, but the upcoming 1700-year chunked workload makes it a | ||
| blocker. The migration: | ||
|
|
||
| - unifies all secondary-input handling on one helper (`_load_secondary_mf`), | ||
| - removes a code path from the CLI override layer that had to know about | ||
| FESOM filename conventions, | ||
| - shrinks `apply_overrides` by ~40 lines, | ||
| - enables multi-year ranges (the 1700-year chunked case), | ||
| - preserves R2's `skip_input_year_filter` semantics for centennial inputs. | ||
|
|
||
| There's no value in deferring. R1 stays only as long as nothing needs | ||
| multi-year secondary inputs. | ||
|
|
||
| --- | ||
|
|
||
| ## Open questions — resolved | ||
|
|
||
| ### Q1: Per-rule key audit | ||
|
|
||
| To be done during the migration with | ||
| `grep -rnE '_file:.*fesom\.\*\.nc' awi-esm3-veg-hr-variables/` (recursive, | ||
| because the argument is a directory). No upfront input needed. | ||
|
|
||
| ### Q2: `open_mfdataset` smoke test — passed | ||
|
|
||
| Tested against | ||
| `/work/bb1469/a270092/runtime/awiesm3-develop/after_lpjg_spinup_work_01/outdata/fesom/a_ice.fesom.{1900..1903}.nc`: | ||
|
|
||
| | Call | Result | Time | | ||
| |---|---|---| | ||
| | `xr.open_dataset(one_file)` | `time=12, nod2=126858` | 176 ms | | ||
| | `xr.open_mfdataset([one_file])` | `time=12, nod2=126858` | 22 ms | | ||
| | `xr.open_mfdataset(four_files)` | `time=48, nod2=126858` (concatenated correctly) | 260 ms | | ||
|
|
||
| Single-file `open_mfdataset` was in fact **faster** than `open_dataset` | ||
| in this test (22 ms vs 176 ms, attributed to lazy loading); multi-file | ||
| concatenates correctly along `time`. No behavior regression for the | ||
| migration. The 1700-year chunked workload, the original concern, gets | ||
| the right shape automatically. | ||
|
|
||
| ### Q3: Promote `_load_secondary_mf` to `pycmor.std_lib`? | ||
|
|
||
| **No.** Audit of all callers | ||
| (`grep -rn '_load_secondary_mf' --include='*.py'`) shows every caller | ||
| is inside `examples/custom_steps.py` itself. No external consumers, no | ||
| yaml indirection that imports it from a stable path. User confirmed | ||
| `custom_steps.py` is user-owned and free to modify. | ||
|
|
||
| Keep it private. If a future project outside this codebase wants the | ||
| helper, that's the trigger to promote — premature now. |
Delete before merge