unix: per-file desk auto-sync (%wath/%wend) #1031
Draft
mopfel-winrux wants to merge 7 commits into
Implements the runtime side of clay's %wath task. A mount point marked for auto-sync is watched with libuv fs-event watchers (one per directory); changes are debounced (~100ms), only the affected subtrees are rescanned, and a %into event is injected containing only the files that actually changed.

Also fixes three event-log amplification bugs that affect plain |commit and |autocommit:

- empty %into events are no longer injected when a scan finds no changes (previously every autocommit tick logged one)
- per-file sync state (gum_w) is persisted per mount point in .urb/syn/<mon>.mug and reloaded at startup, so the first commit after a restart no longer writes the entire desk into the event log; mugs are recorded when the %into event commits (via ovum callbacks) and when %ergo is applied
- file-content mugs are now always computed over the octet-stream at its declared length; previously a noun mug was cached on one side of the comparison, so files with trailing zero bytes were re-sent on every scan, and %ergo echoes of just-synced files are no longer redundantly rewritten to disk

The dry flags, previously written but never read back usefully, now implement real dirty-tracking: scans mark nodes dry and only wet subtrees are re-examined. %dirk commits wet the whole mount first, preserving existing full-scan behavior.
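The trailing-zero bug in the third item can be illustrated with a small sketch. It assumes FNV-1a as a stand-in hash (the real mug function is different), and every name below is hypothetical rather than taken from the driver. The point is only that a hash taken after trailing zero bytes are dropped never matches a hash taken over the declared length, so such a file looks changed on every scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a: a stand-in hash for illustration only, not the runtime's mug. */
static uint32_t hash_bytes(const uint8_t* buf, size_t len)
{
  uint32_t h = 2166136261u;
  for ( size_t i = 0; i < len; i++ ) {
    h ^= buf[i];
    h *= 16777619u;
  }
  return h;
}

/* Hash over the declared length: trailing zero bytes are included. */
static uint32_t hash_declared(const uint8_t* buf, size_t len)
{
  return hash_bytes(buf, len);
}

/* Hash as if the bytes had first been turned into a number, which
** cannot represent trailing (most-significant) zero bytes. */
static uint32_t hash_trimmed(const uint8_t* buf, size_t len)
{
  while ( len > 0 && 0 == buf[len - 1] ) {
    len--;
  }
  return hash_bytes(buf, len);
}

/* A file counts as changed when the cached hash differs from a fresh
** declared-length hash of its current content. */
static bool is_changed(uint32_t cached, const uint8_t* buf, size_t len)
{
  return cached != hash_declared(buf, len);
}
```

Caching `hash_trimmed` for a file that ends in zero bytes makes `is_changed` report a change even though the content is identical; caching `hash_declared` does not.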
The node tree is only built during scans, so applying the persisted mug cache to nodes at %hill time (when only top-level nodes exist) seeded nothing, and the post-boot reconciliation re-sent the whole desk anyway. The loaded cache is now kept on the mount point and consulted as file nodes are created. Saving had the inverse problem: rewriting the cache from a sparse tree (e.g. at exit from a session that never scanned) truncated it. The cache is now maintained as a merge: tree mugs are folded in on save, and entries are dropped when files are deleted. Verified on a fake ship: a no-op restart with an auto-synced desk adds ~0 bytes to the event log (previously ~4.2MB, the full desk).
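As a rough sketch of the lazy seeding described above: the loaded cache stays on the mount point as a path-sorted array, and each file node looks itself up at creation time. The struct layouts and the names `cache_entry`, `file_node`, and `node_new` are assumptions made for illustration, not the driver's actual types.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache entry: one path and its last-synced mug. */
typedef struct {
  const char* path;
  uint32_t    mug;
} cache_entry;

/* Hypothetical file node; has_mug is 0 until a mug is known. */
typedef struct {
  const char* path;
  uint32_t    mug;
  int         has_mug;
} file_node;

static int cmp_entry(const void* a, const void* b)
{
  return strcmp(((const cache_entry*)a)->path,
                ((const cache_entry*)b)->path);
}

/* Create a node and seed its mug from the loaded cache if the path is
** present. The cache must be sorted by path for bsearch to work. */
static file_node node_new(const char* path,
                          const cache_entry* cache, size_t n)
{
  file_node   nod = { path, 0, 0 };
  cache_entry key = { path, 0 };
  const cache_entry* hit = NULL;

  if ( n > 0 ) {
    hit = bsearch(&key, cache, n, sizeof(*cache), cmp_entry);
  }
  if ( hit ) {
    nod.mug     = hit->mug;
    nod.has_mug = 1;
  }
  return nod;
}
```

A node created during a later scan then starts with the persisted mug, so an unchanged file compares equal instead of being re-sent.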
Editors rarely save atomically: common patterns are write-temp-then-rename (safe: content lands atomically under the final name), delete-then-write, and truncate-then-write. The latter two expose windows where the file is missing or partial, and a fixed-delay debounce could fire inside one, committing a transient deletion or truncated content to clay. Two heuristics close this:

- debounce until quiescence: each fs event extends the coalescing window (100ms of silence), capped at 1s from the first change so a continuously-writing process can't starve sync
- deletion grace: a missing file is not synced as deleted until it has stayed missing through a recheck 300ms later; a file deleted and rewritten in between syncs as a single modification

Verified on a fake ship: rm + rewrite with a 50ms gap produces exactly one modification commit (revision +1, no transient delete), and a real rm syncs as a deletion ~0.5s later.
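The two heuristics can be sketched as pure timing logic, independent of libuv. The constants come from the description above; the types and function names are hypothetical, and timestamps are plain millisecond counters supplied by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

enum { QUIET_MS = 100, CAP_MS = 1000, GRACE_MS = 300 };

/* Debounce state: when the burst started and when the scan should fire. */
typedef struct {
  bool     armed;
  uint64_t first_ms;
  uint64_t fire_ms;
} debounce;

/* Each fs event pushes the deadline to now + QUIET_MS, but never past
** CAP_MS after the first event of the burst. */
static void debounce_event(debounce* d, uint64_t now)
{
  if ( !d->armed ) {
    d->armed    = true;
    d->first_ms = now;
  }
  uint64_t quiet = now + QUIET_MS;
  uint64_t cap   = d->first_ms + CAP_MS;
  d->fire_ms = (quiet < cap) ? quiet : cap;
}

/* True once when the deadline has passed; disarms for the next burst. */
static bool debounce_due(debounce* d, uint64_t now)
{
  if ( d->armed && now >= d->fire_ms ) {
    d->armed = false;
    return true;
  }
  return false;
}

typedef enum { VERDICT_KEEP, VERDICT_HOLD, VERDICT_DELETE } verdict;

/* Per-file deletion grace state. */
typedef struct {
  bool     missing;
  uint64_t missing_since;
} grace;

/* A missing file is held until it has stayed missing for GRACE_MS;
** reappearing in between clears the pending deletion. */
static verdict grace_check(grace* g, bool exists, uint64_t now)
{
  if ( exists ) {
    g->missing = false;
    return VERDICT_KEEP;
  }
  if ( !g->missing ) {
    g->missing       = true;
    g->missing_since = now;
    return VERDICT_HOLD;
  }
  return (now - g->missing_since >= GRACE_MS) ? VERDICT_DELETE
                                              : VERDICT_HOLD;
}
```

In this sketch a process that writes every 50ms still gets scanned 1s after its first write, and a file removed and rewritten 50ms later never reaches `VERDICT_DELETE`.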
mingw's rename() fails if the destination exists, so the sidecar save would fail on every write after the first. Use MoveFileEx with MOVEFILE_REPLACE_EXISTING on windows. The rest of the auto-sync machinery cross-compiles clean for x86_64-windows-gnu; libuv backs uv_fs_event with ReadDirectoryChangesW there.
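A minimal sketch of the platform split, assuming the sidecar is saved by writing a temporary file and renaming it over the destination. The helper names `replace_file` and `save_atomic` and the `.tmp` suffix are hypothetical; `MoveFileExA` with `MOVEFILE_REPLACE_EXISTING` and POSIX `rename()` are the real calls.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Move tmp over dst, replacing dst if it already exists. */
static int replace_file(const char* tmp, const char* dst)
{
#ifdef _WIN32
  /* rename() on mingw fails when dst exists, so ask for replacement. */
  return MoveFileExA(tmp, dst, MOVEFILE_REPLACE_EXISTING) ? 0 : -1;
#else
  /* POSIX rename() replaces an existing destination atomically. */
  return rename(tmp, dst);
#endif
}

/* Write text to "<dst>.tmp", then move it into place. Returns 0 on
** success and -1 on any failure, leaving dst untouched on failure. */
static int save_atomic(const char* dst, const char* text)
{
  char tmp[4096];
  int  n = snprintf(tmp, sizeof(tmp), "%s.tmp", dst);
  if ( n < 0 || (size_t)n >= sizeof(tmp) ) {
    return -1;
  }

  FILE* f = fopen(tmp, "wb");
  if ( !f ) {
    return -1;
  }
  int ok = (fputs(text, f) >= 0);
  ok = (0 == fclose(f)) && ok;
  if ( !ok ) {
    remove(tmp);
    return -1;
  }
  return replace_file(tmp, dst);
}
```

Calling `save_atomic` twice on the same path exercises the case that failed on Windows: the second call has to replace an existing destination.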
Includes the unix driver source into the test (with its exported symbols renamed) to reach the internals, and covers:

- mug-cache (lod) insert/update/delete and sort invariants
- lazy mug seeding of new file nodes from the loaded cache
- node-tree path lookup
- sidecar save/load round-trip, atomic re-save, and wholesale discard on unknown header, malformed line, or missing file
- octet-stream mugs at declared length (trailing-zero regression)
- fold the node tree into the mug cache by collect/qsort/merge instead of per-entry sorted insertion (O(n log n) vs O(n^2) per save; desks run to thousands of files), retiring _unix_lod_put
- scope %into completion to the originating mount: record the mount name in the pending-sync record, update nodes and rewrite the sidecar for that mount only, instead of searching and saving all mounts per event
- extract _unix_doom_hold() for the deletion-grace check previously duplicated across the two scan sites
- tighten comments that restated their code
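The collect/qsort/merge fold in the first item might look roughly like the following. This is a sketch under assumptions: `mug_entry` and `fold_mugs` are hypothetical names, the existing cache is already sorted by path, and the entries collected from the tree contain no duplicate paths.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache entry: one path and its mug. */
typedef struct {
  const char* path;
  uint32_t    mug;
} mug_entry;

static int cmp_mug(const void* a, const void* b)
{
  return strcmp(((const mug_entry*)a)->path,
                ((const mug_entry*)b)->path);
}

/* Fold entries collected from the node tree (fresh, any order) into the
** sorted cache (old). On equal paths the tree's mug wins. out must have
** room for n_old + n_fresh entries; returns the merged count.
** Cost: one qsort of the fresh entries plus a linear merge. */
static size_t fold_mugs(const mug_entry* old, size_t n_old,
                        mug_entry* fresh, size_t n_fresh,
                        mug_entry* out)
{
  size_t i = 0, j = 0, k = 0;

  qsort(fresh, n_fresh, sizeof(*fresh), cmp_mug);

  while ( i < n_old && j < n_fresh ) {
    int c = strcmp(old[i].path, fresh[j].path);
    if ( c < 0 ) {
      out[k++] = old[i++];
    }
    else if ( c > 0 ) {
      out[k++] = fresh[j++];
    }
    else {
      out[k++] = fresh[j++];  /* tree entry replaces the cached one */
      i++;
    }
  }
  while ( i < n_old )   { out[k++] = old[i++]; }
  while ( j < n_fresh ) { out[k++] = fresh[j++]; }
  return k;
}
```

Cached entries with no counterpart in the tree survive the merge, which matches the requirement that a sparse tree must not truncate the cache.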
This was referenced Jun 10, 2026
fs notification mechanisms can drop events (inotify queue overflow, exhausted watch descriptors, platform edge cases). add a repeating timer that marks every auto-synced mount wet every 30s and schedules a scan through the normal debounce path. an unchanged tree produces an empty change list, which never injects an event, so idle cost is one mug pass per sweep.
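The sweep's idle behavior can be modeled without libuv. In this sketch the `mount` struct, the `pending` counter (standing in for what a real scan would find on disk), and both function names are hypothetical; in the driver the sweep would be driven by a repeating 30s timer and the scan would go through the debounce path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mount state; pending stands in for the number of files
** a real scan would find changed on disk. */
typedef struct {
  bool auto_sync;
  bool wet;
  int  pending;
} mount;

/* Periodic sweep: mark every auto-synced mount wet so the next scan
** re-examines it even if fs events were dropped. */
static void sweep(mount* mon, size_t n)
{
  for ( size_t i = 0; i < n; i++ ) {
    if ( mon[i].auto_sync ) {
      mon[i].wet = true;
    }
  }
}

/* Scan wet mounts, marking them dry again. An event is injected only
** when the change list is non-empty. Returns the number injected. */
static int scan_wet(mount* mon, size_t n)
{
  int injected = 0;
  for ( size_t i = 0; i < n; i++ ) {
    if ( !mon[i].wet ) {
      continue;
    }
    mon[i].wet = false;
    if ( mon[i].pending > 0 ) {
      mon[i].pending = 0;
      injected++;
    }
  }
  return injected;
}
```

With nothing pending, a sweep followed by a scan injects zero events, which is the idle case described above.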
Implements the runtime side of per-file desk auto-sync — see the accompanying UIP draft (urbit/UIPs PR to follow) and the Arvo side in urbit/urbit#7362.
A mount point marked for auto-sync is watched with libuv fs-event watchers (one per directory). Changes are debounced until quiescence (100ms quiet, 1s cap), only affected subtrees are rescanned, and a %into event is injected containing only the files that actually changed. Deletions are confirmed through a 300ms grace period so editor delete-then-write save patterns don't propagate transient deletions.

Also fixes three event-log amplification bugs that affect plain |commit/|autocommit:

- %into events are no longer injected when a scan finds no changes (previously every autocommit tick logged one)
- per-file sync state is persisted per mount point (.urb/syn/<mon>.mug) and reloaded at startup, so the first commit after a restart no longer writes the entire desk into the event log
- file-content mugs are always computed over the octet-stream at its declared length, so files with trailing zero bytes are no longer re-sent on every scan

Measured on a fake ship: 60s idle with auto-sync active adds 0 bytes to the event log (vs ~120 events/min under |autocommit), and a no-op restart adds ~0 bytes (previously the full desk, ~4.2MB). Unit tests cover the mug cache, lazy seeding, tree lookup, and sidecar round-trip/corruption handling; cross-compiles clean for x86_64-windows-gnu.