FileNotFoundError when re-reading just-written generation .mhl on SMB share — kills PyInstaller-frozen binary
Summary
When running ascmhl against a destination on an SMB-mounted share, ascmhl writes a new generation .mhl file, then immediately re-opens that same file to compute a hash for the parent chain. On SMB this re-open intermittently fails with FileNotFoundError: [Errno 2] and the uncaught exception terminates the (PyInstaller-frozen) ascmhl process with exit code 1, leaving the destination in a half-written state (new generation .mhl exists, but chain.xml is not updated).
Version
ascmhl 1.0.5.dev287+g8eab444.d20250304
Distributed as a PyInstaller-frozen binary (pyi-runtime-tmpdir reporter present: [PYI-32107:ERROR]).
Environment
- macOS host (Darwin)
- Destination volume: SMB share (smbfs), mounted from //user@<server-ip>/volume to /Volumes/Volume/
- Source volume: local/external storage at /Volumes/Volume
What ascmhl was doing
For a copy job from ascmhl:
- Copied existing ASC MHL histories from sub-packages on the source to the destination.
- Hashed/verified all files and reported verified ... xxh128: OK for each.
- Wrote new generation .mhl files, e.g.:
Created new generation Camera_Media/B_0057_1D0O_hde/ascmhl/0002_B_0057_1D0O_hde_2026-05-24_053403Z.mhl
Created new generation Camera_Media/A_0059_1C89_hde/ascmhl/0002_A_0059_1C89_hde_2026-05-24_053403Z.mhl
- ~5 ms later, while writing the parent chain, attempted to re-open and hash one of those just-written files in order to record its reference hash in chain.xml, and crashed.
Traceback
Traceback (most recent call last):
FileNotFoundError: [Errno 2] No such file or directory: '/Volumes/ff1/FILMFLOW_PROJECTS/BELLEZA/OCF/SEMANA_03/20260523_Day018/Camera_Media/A_0059_1C89_hde/ascmhl/0002_A_0059_1C89_hde_2026-05-24_053403Z.mhl'
[PYI-32107:ERROR] Failed to execute script 'ascmhl' due to unhandled exception!
Exit code: 1.
Why this is a bug
The file in the traceback is the very file ascmhl just wrote, milliseconds earlier, in the same run (and the same path — no rename, no case change). On a local volume the open succeeds; on SMB it intermittently fails because the file is not yet visible to a stat()/open() by name through the SMB client cache (write-close-then-immediate-read race / directory-listing freshness gap).
This is a known characteristic of SMB and not something ascmhl should be expected to eliminate, but ascmhl is fragile against it in two ways:
- No retry around the immediate post-write re-open in _hashlist_xml_element_from_hashlist → generate_reference_hash → hash_file. A short retry loop on FileNotFoundError (and/or an os.fsync of the parent directory after writing the generation .mhl) would absorb the SMB visibility gap that is the typical cause here.
- The exception is uncaught at the CLI boundary, so it propagates all the way out and terminates the PyInstaller-frozen binary with [PYI-32107:ERROR]. The caller only sees exit code 1 and has no structured information about which file failed or which sub-package’s generation is now orphaned. A caught exception with a clear Error: failed to read just-written generation file <path>: <errno> would make this far easier to triage in the field, and would also allow ascmhl to clean up / mark the half-written state instead of leaving an orphan generation .mhl without a corresponding chain entry.
Suggested fix
In ascmhl/hasher.py (or in generate_reference_hash in ascmhl/hashlist.py), when hashing a generation file that ascmhl itself just wrote in the same session:
- After writing the generation .mhl, perform an os.fsync on the file and (on POSIX) on the containing directory before continuing.
- Wrap the subsequent open() of that file in a short bounded retry loop (e.g. up to ~1–2 s, with small sleeps) that catches FileNotFoundError / OSError(ENOENT) specifically. This is a tiny, well-scoped change that costs nothing on local filesystems and reliably absorbs SMB visibility gaps.
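The two steps above could look roughly like the sketch below. It is not taken from the ascmhl code base: write_and_sync and open_with_retry are hypothetical helper names, the 2 s timeout and 50 ms sleep are arbitrary defaults, and whether the directory fsync actually closes the gap on smbfs is an assumption I have not verified.

```python
import errno
import os
import time


def write_and_sync(path, data):
    """Write data to path and fsync the file and, on POSIX, its directory.

    Hypothetical helper, not part of ascmhl.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    if os.name == "posix":
        # Syncing the directory asks the OS to persist the new directory entry.
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        except OSError:
            # Some network filesystems may reject fsync on a directory;
            # the retry loop below is the fallback in that case.
            pass
        finally:
            os.close(dir_fd)


def open_with_retry(path, mode="rb", timeout=2.0, interval=0.05):
    """Open path, retrying only on ENOENT until timeout seconds have passed.

    Hypothetical helper; any other OSError is re-raised immediately.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return open(path, mode)
        except OSError as exc:
            if exc.errno != errno.ENOENT or time.monotonic() >= deadline:
                raise
            time.sleep(interval)
```

On a local filesystem the first open() succeeds, so the loop adds no sleeps. A genuinely missing file still raises FileNotFoundError once the timeout expires, so real errors are delayed but not hidden.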
At the CLI boundary (ascmhl/cli/ascmhl.py), catch FileNotFoundError / OSError arising from chain writing and exit with a clear error message that names the offending path, rather than letting Python’s default handler print a raw traceback.
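A minimal sketch of that boundary handling follows. run_cli and its command argument are hypothetical stand-ins for the real entry point, which I have not inspected, and the message wording is only a suggestion.

```python
import sys


def run_cli(command, *args):
    """Run command(*args); turn an OSError into a one-line message and exit code.

    Hypothetical wrapper: `command` stands in for the real ascmhl entry point.
    """
    try:
        command(*args)
    except OSError as exc:
        # OSError raised by open() carries the offending path in .filename.
        path = exc.filename or "<unknown path>"
        print(
            f"Error: chain writing failed for {path}: "
            f"[Errno {exc.errno}] {exc.strerror}",
            file=sys.stderr,
        )
        return 1
    return 0
```

The exit code stays 1, so existing callers see no behaviour change, but stderr now names the file instead of showing a raw traceback followed by [PYI-32107:ERROR].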
Reproduction
Not 100 % deterministic — it is a race. To reproduce reliably-ish:
- Mount an SMB share with default macOS settings (no dir_cache_off) as the destination.
- Run a copy job that produces a new generation in a sub-package whose ascmhl/ folder did not exist before this run (so the directory listing is new on the server side).
- Repeat. On enough runs, the immediate post-write read fails.
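To check a given mount for the visibility gap without running a full copy job, a small probe along these lines may help. It is a sketch under assumptions: probe is a hypothetical function, the file and directory names are made up, and I have not confirmed that it triggers the same race ascmhl hits.

```python
import os
import shutil


def probe(directory, rounds=200):
    """Write, close, and immediately re-open files under directory.

    Returns the number of re-opens that failed with FileNotFoundError.
    Hypothetical helper for testing a mount; not part of ascmhl.
    """
    failures = 0
    for i in range(rounds):
        # A fresh sub-directory per round mimics a newly created ascmhl/ folder.
        sub = os.path.join(directory, f"probe_{os.getpid()}_{i}")
        os.makedirs(sub)
        path = os.path.join(sub, "0001_probe.mhl")
        with open(path, "wb") as f:
            f.write(b"<hashlist/>")
        try:
            with open(path, "rb") as f:
                f.read()
        except FileNotFoundError:
            failures += 1
        finally:
            # Best-effort cleanup; errors are ignored so the probe keeps going.
            shutil.rmtree(sub, ignore_errors=True)
    return failures
```

Calling probe() with a directory on the SMB mount and getting a non-zero count would point at the share rather than at ascmhl. A local directory should return 0.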