Sub-issue for #39: portable training input workflow for wcgpu and sgpu
Issue
The training_inputs/Snakefile from git runs on wcgpu, but the sgpu machine needs different site-local resources:
- sgpu has no /nfs file system, so the Apptainer bind list in the Snakefile must account for this (see the comment in the Snakefile);
- the Apptainer bind list must expose GPFS paths inside the container;
- the default DUNE cosmic-generation FHiCL reads CORSIKA shower database files from CVMFS, while sgpu needs the local GPFS copy at /gpfs01/lbne/users/spng/stash/Cosmics/CERN/CORSIKA/standard.
Changes made
- Updated the top-level Snakefile to use one configurable DUNE software container path:
  - defaults to the original wcgpu container, /nfs/data/1/calcuttj/dunesw.sif;
  - auto-selects the sgpu GPFS container when /gpfs/mnt/gpfs01/lbne/users/spng/abashyal/wct-env/wire-cell-dev/container/dunesw.sif exists;
  - can be overridden explicitly with the Snakemake config key dunesw_container or the environment variable DUNESW_CONTAINER.
- Updated test_container, gen_cosmics, g4, and save_depos to use the shared config['dunesw_container'] value.
- Updated the Apptainer bind list in the Snakefile shebang to include both the original /nfs bind and the GPFS paths needed by sgpu.
- Added prod_cosmics_protodunehd_local.fcl, a small wrapper around prod_cosmics_protodunehd.fcl that overrides only physics.producers.cosmicgenerator.ShowerInputFiles to use the sgpu GPFS CORSIKA shower database.
- Updated gen_cosmics to take its FHiCL through params.fcl instead of hard-coding prod_cosmics_protodunehd.fcl. This preserves the original lar FHiCL lookup behavior on wcgpu, where prod_cosmics_protodunehd.fcl does not need to be a local Snakemake input.
- Added automatic FHiCL selection:
  - use prod_cosmics_protodunehd_local.fcl when the sgpu CORSIKA directory exists;
  - otherwise keep the original prod_cosmics_protodunehd.fcl behavior for wcgpu;
  - allow explicit override with the Snakemake config key cosmics_fcl or the environment variable COSMICS_FCL.
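The container and FHiCL selection rules listed above follow the same pattern: explicit override, then sgpu auto-detection, then the wcgpu default. The following is a minimal sketch of that pattern in plain Python, as it might sit near the top of the Snakefile. It is not the code from the change itself. The helper names resolve_site_setting and resolve_all are hypothetical, and letting the config key take precedence over the environment variable is an assumption, since the description only says that either one can override.

```python
import os

# Paths quoted from the description above.
WCGPU_CONTAINER = "/nfs/data/1/calcuttj/dunesw.sif"
SGPU_CONTAINER = ("/gpfs/mnt/gpfs01/lbne/users/spng/abashyal/wct-env/"
                  "wire-cell-dev/container/dunesw.sif")
SGPU_CORSIKA_DIR = "/gpfs01/lbne/users/spng/stash/Cosmics/CERN/CORSIKA/standard"


def resolve_site_setting(config, key, env_var, sgpu_marker, sgpu_value,
                         default, environ=None, exists=os.path.exists):
    """Hypothetical helper: override first, then sgpu detection, then default."""
    environ = os.environ if environ is None else environ
    # Assumption: the Snakemake config key wins over the environment variable.
    explicit = config.get(key) or environ.get(env_var)
    if explicit:
        return explicit
    # Auto-select the sgpu value when its marker path is present.
    if exists(sgpu_marker):
        return sgpu_value
    return default


def resolve_all(config, environ=None, exists=os.path.exists):
    """Return (container path, cosmics FHiCL name) for the current machine."""
    container = resolve_site_setting(
        config, "dunesw_container", "DUNESW_CONTAINER",
        SGPU_CONTAINER, SGPU_CONTAINER, WCGPU_CONTAINER, environ, exists)
    fcl = resolve_site_setting(
        config, "cosmics_fcl", "COSMICS_FCL",
        SGPU_CORSIKA_DIR, "prod_cosmics_protodunehd_local.fcl",
        "prod_cosmics_protodunehd.fcl", environ, exists)
    return container, fcl
```

In a Snakefile, config would be Snakemake's own config dictionary, and the resolved container path could be written back to config['dunesw_container'] so that every rule reads the same value.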
This keeps the git Snakefile behavior for wcgpu while folding in the sgpu fixes without maintaining a separate machine-specific Snakefile.
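The params.fcl change to gen_cosmics can be illustrated with a small sketch. It assumes the rule ultimately runs lar with its standard -c (configuration), -n (event count), and -o (output file) options. The function names, the event count, and the output file name are hypothetical, and the real rule's command line may differ.

```python
import os
import shlex


def gen_cosmics_command(fcl, nevents, output):
    """Build a lar command line for the chosen FHiCL (hypothetical helper)."""
    # The FHiCL is passed by name, so lar can resolve it through its own
    # lookup; it does not have to be a local file tracked by Snakemake.
    return shlex.join(["lar", "-c", fcl, "-n", str(nevents), "-o", output])


def local_fcl_inputs(fcl, exists=os.path.exists):
    """List the FHiCL as a workflow input only when it is a local file."""
    return [fcl] if exists(fcl) else []


if __name__ == "__main__":
    print(gen_cosmics_command("prod_cosmics_protodunehd.fcl", 10, "cosmics.root"))
```

Passing the FHiCL as a parameter rather than as an input is what lets the wcgpu case keep working when prod_cosmics_protodunehd.fcl exists only inside the container's FHiCL search path.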