labrador: A domain-optimized machine-learning tool for gravitational wave inference
labrador combines simulation-based inference with gravitational-wave-specific tricks such as relative binning, folding, and coordinate transformations, to get the best of both worlds.
git clone git@github.com:jroulet/labrador.git

conda create -n ENVIRONMENT_NAME pip cogwheel-pe sbi -c conda-forge
conda activate ENVIRONMENT_NAME

(Replace ENVIRONMENT_NAME with a name of your choice, e.g. labrador.)
Note: it is better to install those packages with conda rather than pip, at least on the LIGO Data Grid computers.

cd labrador
pip install -e .

See notebooks/workflow.ipynb or use the cheatsheet below.
lab-setup-rundir-and-priordir PARENTDIR
# Edit config files...
lab-generate-data-htcondor PRIORDIR --submit-arg accounting_group=ACCOUNTING_GROUP --submit

Note: this submits a .dag file that in turn orchestrates several .sub files. If you get a crash due to insufficient resources, you may adjust the requests in the corresponding .sub, delete from the .dag those jobs that have already succeeded, delete the .rescue file, and resubmit the .dag with condor_submit_dag DAGMAN_PATH.

lab-setup-rescalerdir PRIORDIR
python -m labrador.rescaling RESCALERDIR
lab-setup-sbidir RESCALERDIR
python -m labrador.training SBIDIR
lab-setup-unfolderdir RESCALERDIR
python -m labrador.unfolding UNFOLDERDIR

See https://zenodo.org/records/19393278 for a demonstration with already-trained models.
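
The note above about insufficient resources mentions adjusting the requests in a .sub file. As a minimal sketch, the hypothetical helper below (not part of labrador) rewrites the request_memory line of an HTCondor submit file and keeps a .bak copy. It assumes the .sub file already contains a lowercase `request_memory = ...` line and that the new value contains no `/` character; check the actual .sub files generated in your run directory before relying on it.

```shell
# Hypothetical helper (not part of labrador): set request_memory in a submit file.
# Usage: set_request_memory SUBFILE VALUE    e.g. set_request_memory job.sub 8GB
set_request_memory() {
    srm_subfile="$1"
    srm_value="$2"
    srm_tmp="$(mktemp)" || return 1
    # Keep a backup of the original submit file next to it.
    cp "$srm_subfile" "${srm_subfile}.bak" || return 1
    # Replace an existing request_memory line; all other lines pass through unchanged.
    sed -E "s/^[[:space:]]*request_memory[[:space:]]*=.*/request_memory = ${srm_value}/" \
        "${srm_subfile}.bak" > "$srm_tmp" || return 1
    # Write back in place so the file keeps its original permissions.
    cat "$srm_tmp" > "$srm_subfile"
    rm -f "$srm_tmp"
}
```

If the file has no request_memory line, it is left unchanged, so it is worth confirming the edit with grep request_memory SUBFILE before resubmitting.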
Diagnose with
condor_q
Find which jobs were held:
cd {rundir}/submission_scripts
grep -R held
This will point to the relevant log files. Example output:
simulation-7190000_7200000_train.log:012 (527015058.719.000) 2026-03-18 14:57:48 Job was held.
simulation-4890000_4900000_train.log:012 (527015058.489.000) 2026-03-18 14:51:48 Job was held.
simulation-5080000_5090000_train.log:012 (527015058.508.000) 2026-03-18 14:52:48 Job was held.
Check the logs. Two causes for eviction are:
- Insufficient resources requested (e.g. memory). In that case, edit the relevant .sub file and resubmit.
- Bad nodes. In that case the NumShadowStarts variable will be high:
grep -R NumShadowStarts
Example output:
simulation-7190000_7200000_train.log: NumShadowStarts 148 > 100.
simulation-4890000_4900000_train.log: NumShadowStarts 118 > 100.
simulation-5080000_5090000_train.log: NumShadowStarts 127 > 100.
Reset like so:
condor_qedit 527015058.719 NumShadowStarts 0
condor_qedit 527015058.489 NumShadowStarts 0
condor_qedit 527015058.508 NumShadowStarts 0
(get the correct job numbers for your case from the grep -R held output). Then resubmit, replacing albert.einstein with your own username:
condor_release albert.einstein
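
When many jobs are held, typing the job numbers by hand gets tedious. The sketch below is one possible way to turn the grep -R held output into condor_qedit commands. The function names are hypothetical (not part of labrador or HTCondor), and the parsing assumes log lines shaped like the example output above, with the job number written as (CLUSTER.PROC.SUBPROC). It only prints the commands so they can be reviewed first, since jobs held for insufficient resources need a different fix.

```shell
# Hypothetical helpers (not part of labrador or HTCondor).

# Read `grep -R held` output on stdin and print unique CLUSTER.PROC job numbers,
# stripping the zero padding used in the log's (CLUSTER.PROC.SUBPROC) field.
held_job_ids() {
    sed -nE 's/.*\(0*([0-9]+)\.0*([0-9]+)\.[0-9]+\).*Job was held.*/\1.\2/p' | sort -u
}

# Print, without running, one condor_qedit command per held job.
print_reset_commands() {
    held_job_ids | while read -r job_id; do
        echo "condor_qedit ${job_id} NumShadowStarts 0"
    done
}
```

A possible usage, from the submission_scripts directory, is grep -R held | print_reset_commands, followed by running only the printed lines that correspond to jobs with a high NumShadowStarts.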
We are grateful to Eliot Finch for designing the labrador logo.
