ci(mutants): add --in-place to dodge cargo-mutants v27.0.0 #611 tmp-tree bug #41
Merged
…ree bug

Both full sweeps run after #37/#38/#39/#40 landed (https://github.com/mcarvin8/config-disassembler/actions/runs/25807382458 and https://github.com/mcarvin8/config-disassembler/actions/runs/25812733214) died with the exact same upstream error:

    ERROR cargo_mutants::lab: Worker thread failed: "/tmp/cargo-mutants-config-disassembler-<rand>.tmp/src/xml/cli.rs" is not a file

Even though missed.txt was empty in both reports, the worker thread crashed before cargo-mutants could write its summary, exit code was 1, and the GitHub Actions step was marked failed.

Root cause (upstream)
---------------------

This is cargo-mutants v27.0.0 issue sourcefrog/cargo-mutants#611, a regression introduced by PR sourcefrog/cargo-mutants#557 in v26.0.0 (Dec 2025). cargo-mutants copies the source tree into a per-mutant scratch directory under `/tmp` using `reflink::reflink`, which preserves the *source* file's mtime exactly. On macOS, `/usr/libexec/dirhelper` periodically scans `/tmp` and unlinks regular files whose mtime is older than CLEAN_FILES_OLDER_THAN_DAYS (default 3). The systemd-tmpfiles equivalent on Linux does the same thing on a configurable cadence.

Any source file in the repo that hasn't been edited in three days ends up in the scratch tree with a stale mtime, the reaper unlinks it mid-run, and the next mutant's `BuildDir::overwrite_file` trips `ensure!(full_path.is_file(), "{full_path:?} is not a file")` at src/build_dir.rs:96 and the worker thread dies.

The bug is load-dependent: short runs that finish between reaper invocations do not trigger it. cargo-mutants' debug.log on our two failing runs shows the failure both times immediately after the revert of `parse_reassemble_args -> (None, None, true)` -- the only xml/cli.rs file in the repo with sufficiently stale mtime to be a consistent reaper target on the GitHub Actions ubuntu-latest runner.

The fix landed upstream in PR sourcefrog/cargo-mutants#613, merged May 11 2026, which bumps dest mtime to now after every reflink and gives a clearer error message. But the latest tagged cargo-mutants release (v27.0.0, 2026-03-07) does **not** contain the fix, and `taiki-e/install-action` installs from tagged binary releases via cargo-binstall.

Workaround
----------

The upstream issue explicitly recommends `--in-place` for users not ready to switch off v27.0.0: it bypasses the scratch-tree copy entirely and runs mutations against the workspace source files in the runner's checkout directory. On ephemeral CI runners that's exactly what we want -- the runner is thrown away after the job, so "mutating the actual source tree" has no downside. `--in-place` takes the same `--in-diff` / `--file` filters and produces the same report shape as the default mode.

Both the `full` and `incremental` jobs are updated:

* `full` uses `cargo mutants --no-shuffle --in-place`.
* `incremental` uses `cargo mutants --no-shuffle --in-place --in-diff mutation.diff` for the fast path and `cargo mutants --no-shuffle --in-place --file ...` for the test-only-diff fallback.

Each command grows a single comment block linking back to upstream #611/#613 so the next maintainer can drop the flag once cargo-mutants ships a fixed release.

Verification plan
-----------------

After merging this PR, trigger the `Full mutation testing` workflow manually (`gh workflow run mutation.yml -f full=true`). Expected outcome: no `is not a file` worker error, exit code 0, the workflow step turns green, and the mutants.out artifact still reports missed=0 with the same caught / timeout buckets as the previous runs.

Co-authored-by: Cursor <cursoragent@cursor.com>
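The mtime mechanism described in the root-cause section can be reproduced in miniature. The following is a rough sketch rather than anything cargo-mutants or the reaper services actually run: it assumes GNU coreutils and findutils (as on ubuntu-latest), uses `cp -p` as a stand-in for the mtime-preserving `reflink::reflink` copy, and uses `find -mtime +3` as a stand-in for a reaper's age test. The directory layout and variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: why an mtime-preserving copy looks stale to a /tmp reaper.
# Assumption: GNU touch/cp/find. Paths and names here are illustrative only.
set -eu

work=$(mktemp -d)
mkdir "$work/src" "$work/scratch"

# A source file that was last edited five days ago.
echo 'fn main() {}' > "$work/src/cli.rs"
touch -d '5 days ago' "$work/src/cli.rs"

# cp -p preserves the source mtime, like the reflink copy described in #611.
cp -p "$work/src/cli.rs" "$work/scratch/cli.rs"

# Stand-in for the reaper's age test: regular files modified more than 3 days ago.
stale_before=$(find "$work/scratch" -type f -mtime +3 | wc -l)

# The approach described for #613: bump the destination mtime to "now".
touch "$work/scratch/cli.rs"
stale_after=$(find "$work/scratch" -type f -mtime +3 | wc -l)

echo "reaper candidates before touch: $stale_before, after touch: $stale_after"
rm -rf "$work"
```

The freshly created scratch copy is a reaper candidate until its mtime is bumped, which is why a long sweep can lose files that a short one never does.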
Codecov Report
✅ All modified and coverable lines are covered by tests.
This was referenced May 13, 2026
Merged
mcarvin8 added a commit that referenced this pull request May 13, 2026
Switch both the `incremental` and `full` jobs of the mutation workflow from `taiki-e/install-action@v2` (which installs the latest tagged binary release, v27.0.0 from 2026-03-07) to `cargo install --git sourcefrog/cargo-mutants --rev cbdfe8a` (the merge commit of upstream PR sourcefrog/cargo-mutants#613, merged May 11 2026).

Why
---

Three consecutive full mutation sweeps on `main` after #37/#38/#39/#40 landed all crashed at the same point with cargo-mutants v27.0.0:

    ERROR Worker thread failed: ".../src/xml/cli.rs" is not a file
    Error: ".../src/xml/cli.rs" is not a file

Even though `missed.txt` was empty (0 missed mutants -- the score goal was met), the worker thread died before cargo-mutants could write its summary and the workflow step turned red.

Investigation:

* The first two failures had the missing-file path inside the per-mutant scratch tempdir (`/tmp/cargo-mutants-config-disassembler-XXX.tmp/src/xml/cli.rs`). These match upstream issue sourcefrog/cargo-mutants#611 exactly: cargo-mutants v26.0.0+ uses `reflink::reflink` for the scratch-tree copy, which preserves the source mtime. Systemd-tmpfiles (Linux) and `/usr/libexec/dirhelper` (macOS) periodically delete files in `/tmp` with mtime older than a configurable threshold. Long-running sweeps cross that interval and have files silently unlinked from under them mid-run.
* The third failure was added in a follow-up attempt that switched to `--in-place` (#41 as originally proposed). The missing-file path this time was the *workspace* path (`/home/runner/work/.../src/xml/cli.rs`), which can't be a `dirhelper`/`systemd-tmpfiles` artifact. The same misleading error message in `BuildDir::overwrite_file` covers a separate failure mode that v27.0.0's `ensure!(full_path.is_file(), ...)` cannot distinguish.

Upstream PR #613 fixes both halves of this:

1. `src/copy_tree.rs`: bumps `dest` mtime to `now()` after every successful `reflink::reflink`, so reaper services see freshly-touched files and leave them alone. Closes #611.
2. `src/build_dir.rs`: rewrites `BuildDir::overwrite_file` to use `symlink_metadata` and emit specific error messages distinguishing `is a symlink` / `is not a regular file (type is X)` / `does not exist` / `failed to stat`. Whatever's actually going wrong on our `--in-place` run will finally surface as a useful diagnostic instead of "is not a file".

What changed
------------

Both jobs:

* Replace `uses: taiki-e/install-action@v2 / tool: cargo-mutants` with `run: cargo install --locked --git https://github.com/sourcefrog/cargo-mutants --rev cbdfe8a574566e01cef9ffaa7475dfaf69c88440 cargo-mutants`. The rev is pinned to the exact merge commit of #613 so the install is reproducible; `cargo install --locked` honours the upstream `Cargo.lock` for the same reason.
* Drop the `--in-place` flag that the previous version of this PR added: the underlying cause is the mtime issue (or whatever the improved error message surfaces), not the copy-vs-in-place mode. Default copy mode is the upstream-recommended default and gets the full benefit of the mtime fix.

Cost
----

`cargo install --git` builds cargo-mutants from source, which takes ~2-3 minutes on a cold runner. `Swatinem/rust-cache@v2` (already in the workflow) caches `~/.cargo/registry`, `~/.cargo/git`, and the build's `target/`, so warm runs are much faster -- typically under a minute. Revisit once a 27.x.x release ships with this fix and switch back to `taiki-e/install-action@v2`.

Verification plan
-----------------

After merging this PR, trigger the `Full mutation testing` workflow manually (`gh workflow run mutation.yml -f full=true`). Expected outcomes:

* `cargo-mutants` reports its version as a post-v27.0.0 git build (`cargo-mutants 27.0.0+...` or similar).
* `missed.txt` stays at 0.
* No `is not a file` worker error.
* If a *different* error surfaces, it'll be the new specific message from #613 (`is a symlink`, `is not a regular file (type is ...)`, `does not exist, refusing to create it`, etc.), which will tell us exactly what to fix next.

Co-authored-by: Cursor <cursoragent@cursor.com>
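To make the distinction between those failure modes concrete, here is a small shell analogue of the classification the commit message attributes to #613. It is not the upstream Rust code: `classify_path` is a hypothetical helper, and the message strings are shortened versions of the categories quoted in the commit message. The one design point it tries to illustrate is ordering: the symlink test comes first because the other tests follow symlinks, much as `symlink_metadata` is used instead of a symlink-following stat.

```shell
#!/bin/sh
# Hypothetical helper mirroring the error categories attributed to #613.
# This is an illustration in shell, not the upstream implementation.
classify_path() {
    if [ -L "$1" ]; then
        # Checked first: -e and -f below would follow the link.
        echo "is a symlink"
    elif [ ! -e "$1" ]; then
        echo "does not exist"
    elif [ ! -f "$1" ]; then
        echo "is not a regular file"
    else
        echo "ok"
    fi
}

# Small demonstration on a throwaway directory.
demo=$(mktemp -d)
echo 'x' > "$demo/file.rs"
ln -s "$demo/file.rs" "$demo/link.rs"
mkdir "$demo/dir.rs"

for p in file.rs link.rs dir.rs missing.rs; do
    printf '%s: %s\n' "$p" "$(classify_path "$demo/$p")"
done
rm -rf "$demo"
```

A plain "is this a file?" test, like the v27.0.0 `ensure!` check, collapses the last three cases into one message, which is why the workspace-path failure could not be told apart from the `/tmp` one.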
Summary
Both full mutation sweeps run after #37/#38/#39/#40 landed -- run 25807382458 and run 25812733214 -- died with the exact same upstream error after passing the actual mutation score check (`missed.txt` was empty in both): `ERROR cargo_mutants::lab: Worker thread failed: "/tmp/cargo-mutants-config-disassembler-<rand>.tmp/src/xml/cli.rs" is not a file`

The worker thread crashes before `cargo-mutants` can write its summary, the step exit code is 1, and the workflow goes red even though the underlying score is 100%.

Root cause (upstream)

This is `cargo-mutants` v27.0.0 issue #611, a regression introduced by #557 in v26.0.0 (Dec 2025). `cargo-mutants` copies the source tree into a per-mutant scratch directory under `/tmp` using `reflink::reflink`, which preserves the source file's mtime exactly. On macOS, `/usr/libexec/dirhelper` periodically scans `/tmp` and unlinks regular files whose mtime is older than `CLEAN_FILES_OLDER_THAN_DAYS` (default 3); on Linux, `systemd-tmpfiles` does the same on a configurable cadence. Any file in the repo with a sufficiently old mtime ends up in the scratch tree as a reaper target, gets silently unlinked mid-run, and the next mutant's `BuildDir::overwrite_file` trips this check at `src/build_dir.rs:96`: `ensure!(full_path.is_file(), "{full_path:?} is not a file")`

The bug is load-dependent: short runs that finish between reaper invocations don't trigger it. Our `debug.log` on both failing runs shows the failure immediately after the revert of `parse_reassemble_args -> (None, None, true)`, which is consistently late enough in the run for the reaper to have already fired and unlinked `src/xml/cli.rs` (the file in our repo with the right combination of stale source mtime + position in cargo-mutants' deterministic mutant order).

The fix landed upstream in #613, merged May 11 2026, which bumps dest mtime to `now()` after every reflink and gives a clearer error message. But the latest tagged `cargo-mutants` release (v27.0.0, 2026-03-07) does not contain the fix, and `taiki-e/install-action` installs from tagged binary releases via `cargo-binstall`.

Workaround

The upstream issue explicitly recommends `--in-place` for users not ready to switch off v27.0.0: it bypasses the scratch-tree copy entirely and runs mutations against the workspace source files in the runner's checkout directory. On ephemeral CI runners that's exactly what we want -- the runner is thrown away after the job, so "mutating the actual source tree" has no downside. `--in-place` takes the same `--in-diff` / `--file` filters and produces the same report shape as the default mode.

Both the `full` and `incremental` jobs are updated:

* `full`: `cargo mutants --no-shuffle --in-place`
* `incremental` fast path: `cargo mutants --no-shuffle --in-place --in-diff mutation.diff`
* `incremental` test-only-diff fallback: `cargo mutants --no-shuffle --in-place --file ...`

Each command grows a single comment block linking back to upstream #611 / #613 so the next maintainer can drop the flag once `cargo-mutants` ships a fixed release.

Test plan

* After merging, trigger `Full mutation testing` (`gh workflow run mutation.yml -f full=true`).
* Expected: no `is not a file` worker error, exit code 0, the workflow step turns green, and the `mutants.out` artifact still reports `missed=0` with the same `caught` / `timeout` / `unviable` buckets as the prior post-#40 run.
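The expected outcome in the test plan could be checked mechanically after downloading the artifact. The sketch below is only an assumption about how one might do that: `check_mutants_out` is a hypothetical helper, and it assumes the output directory contains `missed.txt` and `debug.log` as plain text files, which is how they are referred to in this PR.

```shell
#!/bin/sh
# Hypothetical post-run check for a cargo-mutants output directory.
# Assumption: the directory holds missed.txt and debug.log as plain text.
check_mutants_out() {
    out_dir=$1
    # -s is true only for a non-empty file, i.e. at least one missed mutant.
    if [ -s "$out_dir/missed.txt" ]; then
        echo "missed mutants found"
        return 1
    fi
    # Look for the worker error this PR is working around.
    if grep -q 'is not a file' "$out_dir/debug.log" 2>/dev/null; then
        echo "worker error found"
        return 1
    fi
    echo "ok"
}
```

It would be called as `check_mutants_out mutants.out`; a non-zero return status signals either missed mutants or the `is not a file` worker error.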
Full mutation testing(gh workflow run mutation.yml -f full=true).is not a fileworker error, exit code 0, the workflow step turns green, and themutants.outartifact still reportsmissed=0with the samecaught/timeout/unviablebuckets as the prior post-refactor(xml/cli): use iterator-based loop in parse_disassemble_args #40 run.