Run on central AnaTuples/AnaCaches via version flags + fix remote log location reporting#269
Merged
kandrosov merged 2 commits intoJun 15, 2026
…ion/--anaCache-version/--ana-version and prune upstream production deps
…or/bundle jobs; FLAF_PATH/CORRECTIONS_PATH-based dev overlay (no AFS access in bundle mode) and deterministic hello-world test
378adc1 to
6da2bf0
pipeline#15033697 started
pipeline#15033697 passed
Summary
Two related changes for working with centrally produced AnaTuples/AnaCaches and for
correct failed-job log reporting with the bundle system.
1. Run on existing central AnaTuples/AnaCaches (consume side)
Adds convenience version parameters so a run can read centrally produced upstream
outputs while producing its own downstream outputs at `--version`:

- `--anaTuple-version`: forces the version of all upstream AnaTuple/AnaProd tasks (`InputFileTask`, `AnaTuple*List*`, `AnaTupleMergeTask`).
- `--anaCache-version`: forces the version of `AnalysisCacheTask`/`AnalysisCacheAggregationTask` (central BtagShape etc.).
- `--ana-version`: single flag combining both.

Upstream production dependencies (and their bundles) are pruned via `self.complete()` early-returns when the central outputs already exist, so `InputFileTask` and friends no longer appear in the graph. `bundle_flavours` carries a `(flavour, version)` tuple so the `AnaTupleFileList` bundle is pulled at the central version.

Verified by constructing the task graph: with `--anaTuple-version v2605a`, `AnaTupleMergeTask`/`AnaTupleFileListTask` resolve to `v2605a` while `AnalysisCacheTask.version` stays at the dev `--version`; `--anaCache-version` resolves the cache tasks to the central version.
2. fix #265: failed-task log locations now point at EOS
With bundles, job logs are staged to EOS by `stageout_logs.sh`, but failure reports still advertised the old AFS path. PR #267 added `_BundleAwareHTCondorWorkflowProxy` to remap the `log` entry, but it never took effect:

- […] `workflow_proxy_cls` was set in the class body (`_defined_workflow_proxy`); since #267 ("Fixed reported log file locations for failed tasks with new remote log location") rebound it afterwards, the flag stayed `False` and the proxy was never selected.
- […] `self`, which carries no `fs_default`/`version`/`period`.

This PR completes the fix:

- `HTCondorWorkflow._defined_workflow_proxy = True` so the proxy is actually used;
- […] `_submit_group` from `self.task` (`remote_dir_target(version, "logs", <TaskClass>, period)`), matching what `stageout_logs.sh` uploads;
- […] and computation as the primary path);
- `stageout_logs.sh` pre-creates the remote parent dir (`gfal-mkdir -p`) so the log lands at the exact single path instead of a nested `stdall_0To1.txt/stdall_0To1.txt`.

Also adds a worker-side dev overlay (`bootstrap.sh` sources `flaf_dev.sh`; bundle/stageout scripts resolved via `FLAF_PATH`) so jobs run edited FLAF on the worker, and a deterministic `test/test_hello_world.py` + `test/test_hello_world_all.py` harness.

Testing
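The proxy-selection problem from section 2 can be reproduced in isolation. The sketch below is a simplified stand-in, not law's actual implementation: `WorkflowMeta`, `select_proxy`, `DefaultProxy` and `BundleAwareProxy` are hypothetical names, and the only behaviour assumed is the one the description states, namely that a flag is recorded at class creation depending on whether `workflow_proxy_cls` appears in the class body.

```python
class WorkflowMeta(type):
    """Toy metaclass: records whether the class body defined workflow_proxy_cls."""

    def __init__(cls, name, bases, classdict):
        super().__init__(name, bases, classdict)
        # Evaluated once, at class creation; later assignments do not update it.
        cls._defined_workflow_proxy = "workflow_proxy_cls" in classdict


class DefaultProxy:
    pass


class BundleAwareProxy(DefaultProxy):
    pass


class BaseWorkflow(metaclass=WorkflowMeta):
    workflow_proxy_cls = DefaultProxy  # defined in the class body: flag is True


class HTCondorWorkflow(BaseWorkflow):
    pass  # no workflow_proxy_cls in the body: flag is False


def select_proxy(cls):
    """Pick the proxy from the first class in the MRO whose flag is set."""
    for klass in cls.__mro__:
        if klass.__dict__.get("_defined_workflow_proxy", False):
            return klass.__dict__["workflow_proxy_cls"]
    raise LookupError("no class defines a workflow proxy")


# Rebinding after class creation, as described for #267: the flag stays False.
HTCondorWorkflow.workflow_proxy_cls = BundleAwareProxy
selected_before_fix = select_proxy(HTCondorWorkflow)

# Setting the flag explicitly makes the rebound proxy visible to the lookup.
HTCondorWorkflow._defined_workflow_proxy = True
selected_after_fix = select_proxy(HTCondorWorkflow)
```

In this toy model the lookup skips `HTCondorWorkflow` until its flag is set, so the rebound proxy is ignored before the fix and picked up afterwards.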
`test/test_hello_world_all.py` runs 6 variants (local/htcondor × success/force-fail, ± bundle). The force-fail htcondor variants assert the reported `log:` path equals the remote EOS URL and that no AFS `/data/.../stdall` path is mentioned. All 6 pass. A dedicated re-run after the #267 cleanup confirmed the failure report shows `log: davs://eoshome-k.cern.ch:8444/.../logs/HelloWorldTask/Run3_2022EE/stdall_0To1.txt`.

Notes
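As a note on the force-fail assertion from the Testing section, a check of that kind might look like the sketch below. It is not the code in `test/test_hello_world_all.py`: the helper names, the report text, and the `example.invalid` URL and `/data/hypothetical/...` path are made up for illustration, and the report format is assumed to contain lines of the form `log: <location>`.

```python
import re
from typing import List


def extract_log_locations(report: str) -> List[str]:
    """Collect the value of every 'log: <location>' line in a failure report."""
    pattern = re.compile(r"^\s*log:\s*(\S+)\s*$", flags=re.MULTILINE)
    return [m.group(1) for m in pattern.finditer(report)]


def report_points_at_remote_log(report: str, expected_url: str) -> bool:
    """True if the only reported log is the remote URL and no local stdall path appears."""
    logs = extract_log_locations(report)
    # A whitespace-separated token that starts with '/' is treated as a local path.
    mentions_local = any(
        tok.startswith("/") and "stdall" in tok for tok in report.split()
    )
    return logs == [expected_url] and not mentions_local


# Hypothetical URL and reports, for illustration only.
EXPECTED = (
    "davs://example.invalid:8444/remote/logs/"
    "HelloWorldTask/Run3_2022EE/stdall_0To1.txt"
)
good_report = "job 1 failed\n    code: 1\n    log: " + EXPECTED + "\n"
stale_report = "job 1 failed\n    log: /data/hypothetical/stdall_0To1.txt\n"
```

Here `report_points_at_remote_log(good_report, EXPECTED)` holds, while the stale report fails both conditions because its `log:` entry is a local path.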