fix: Stop node execution from deleting or moving pre-existing datasets in workspace #41
floo-dck wants to merge 1 commit into
…rkspace Snapshots the workspace file state before running code in `exec_node`. This ensures that pre-existing assets like datasets (e.g., MovieLens) are not mistakenly identified as newly generated artifacts, which previously caused them to be moved or permanently deleted by the file whitelist cleaner.
eisenbahnhero left a comment
I’m not entirely sure whether this error can actually occur. In my opinion, Omnirec should simply reload the datasets. After all, each node is effectively a self-contained process.
Do you maybe have an AutoRecLab run where this error occurred?
We’ve already carried out a few experiments ourselves with keep_only_relevant_files enabled and haven’t actually encountered this problem so far. But that doesn’t mean, of course, that it can’t happen.
Thanks for the feedback! To show exactly how this failure loop triggers, I have attached my log file and referenced the screenshots from my environment below. To clarify: in my setup,
Here is the step-by-step breakdown of the failure loop documented in the logs:
Semantic Issue
Furthermore, moving the dataset into
My Log: debug.log
Summary
This PR introduces a workspace snapshotting mechanism in `exec_node` right before the interpreter runs. By capturing the existing file state, we ensure that only newly generated artifacts are moved to the node's checkpoint directory, protecting static assets like datasets from being deleted.
Problem
Previously, the code gathered all files present in the workspace directory after execution to clean them up or move them to checkpoints.
If a large dataset (like `movielens-100k`) was placed inside the workspace for the recommender experiments to use, the tree search runner classified it as a "generated file". Because dataset files do not match the image whitelist (`.png`, `.jpg`, etc.), they were automatically wiped out by `shutil.rmtree` or `os.remove` when `keep_only_relevant_files` was active. This completely broke subsequent experiment iterations that relied on the data.
Solution
Take a snapshot with `files_before_exec = set(workspace_dir.rglob("*"))` immediately prior to `self._interpreter.run(current_code)`.
Restrict the handling of the `working/` directory to strictly process items where `item not in files_before_exec`.
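The snapshot-and-filter idea described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this PR: `RELEVANT_SUFFIXES`, `remove_irrelevant_new_items`, and `demo` are hypothetical names, the interpreter run is replaced by two plain file writes, and moving artifacts into the node's checkpoint directory is left out.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Hypothetical whitelist; the PR only mentions image suffixes such as .png and .jpg.
RELEVANT_SUFFIXES = {".png", ".jpg"}


def remove_irrelevant_new_items(workspace_dir: Path, files_before_exec: set[Path]) -> None:
    """Delete non-whitelisted items that were created after the snapshot."""
    # Deepest paths first, so new files are handled before their parent directories.
    items = sorted(workspace_dir.rglob("*"), key=lambda p: len(p.parts), reverse=True)
    for item in items:
        if item in files_before_exec:
            continue  # pre-existing asset (e.g. a dataset): never touched
        if item.is_file() and item.suffix.lower() not in RELEVANT_SUFFIXES:
            item.unlink()
        elif item.is_dir() and not any(item.iterdir()):
            item.rmdir()  # new directory left empty after cleanup


def demo() -> dict[str, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        workspace_dir = Path(tmp)
        dataset = workspace_dir / "movielens-100k" / "u.data"
        dataset.parent.mkdir()
        dataset.write_text("1\t1\t5\n")

        # Snapshot taken before the node's code runs.
        files_before_exec = set(workspace_dir.rglob("*"))

        # Stand-in for self._interpreter.run(current_code): creates two new files.
        plot = workspace_dir / "plot.png"
        scratch = workspace_dir / "scratch.tmp"
        plot.write_bytes(b"")
        scratch.write_text("temporary output")

        remove_irrelevant_new_items(workspace_dir, files_before_exec)
        return {
            "dataset": dataset.exists(),
            "plot": plot.exists(),
            "scratch": scratch.exists(),
        }
```

Calling `demo()` exercises the case from the Problem section: the dataset exists before execution and is skipped because it is in the snapshot, while only files created afterwards are checked against the whitelist.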