fix(cache): handle file/directory path collisions reactively #20
Merged
… time
A URL namespace lets a node be both a leaf (the `/a` page) and an internal node (parent of `/a/b`), but a filesystem name can't be both a file and a directory. The recursive crawl hit this constantly (EEXIST/ENOTDIR/EISDIR), failing those writes.
Handle both directions where they occur, without changing where non-colliding URLs are stored (no cache migration, no content-type change):
- writeStream EISDIR (slot already a directory because a child was cached first): store the page at `<dest>/index.html` (the directory-index form).
- writeStream ENOTDIR (a legacy bare-file ancestor blocks creating the parent directory): skip the write gracefully and destroy the body (served live on demand); re-caching the ancestor or clearing the host removes the blocker.
- lookup: factor a statHit() helper that treats a directory slot as a miss (a child was cached at this name) and falls through to the existing `<path>/index.html` directory-index probe, so the relocated page resolves.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
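The two write-side cases above can be sketched roughly as follows. This is a simplified, synchronous illustration and not the PR's code: the real change works on a stream (`writeStream`) and destroys the body when it skips, whereas the hypothetical `writeCached` below takes a string and uses `fs.writeFileSync`. The function name, the `WriteOutcome` type, and the choice to treat `EEXIST` from `mkdirSync` like `ENOTDIR` are assumptions made for the example, and the error codes are the ones POSIX systems report.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical result type for this sketch (not from the PR).
type WriteOutcome =
  | { kind: "stored"; path: string }
  | { kind: "skipped"; reason: string };

// Hypothetical helper: try the normal slot first and react only on collision.
function writeCached(dest: string, body: string): WriteOutcome {
  try {
    fs.mkdirSync(path.dirname(dest), { recursive: true });
    fs.writeFileSync(dest, body);
    return { kind: "stored", path: dest };
  } catch (err) {
    const code = (err as { code?: string }).code;
    if (code === "EISDIR") {
      // The slot is already a directory (a child was cached first):
      // relocate the page to the directory-index form.
      const indexPath = path.join(dest, "index.html");
      fs.writeFileSync(indexPath, body);
      return { kind: "stored", path: indexPath };
    }
    if (code === "ENOTDIR" || code === "EEXIST") {
      // A bare-file ancestor blocks creating the parent directory.
      // Assumption: mkdirSync reports EEXIST when the direct parent is that
      // file, so both codes are treated as the same blocker here.
      return { kind: "skipped", reason: code };
    }
    throw err; // Anything else is a real failure.
  }
}

// Demo: the child `/a/b` was cached before the page `/a`.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "cache-demo-"));
fs.mkdirSync(path.join(root, "a"));
fs.writeFileSync(path.join(root, "a", "b"), "child page");
const relocated = writeCached(path.join(root, "a"), "parent page");

// Demo: a legacy bare file `/x` blocks caching `/x/y` and `/x/y/z`.
fs.writeFileSync(path.join(root, "x"), "legacy page");
const blockedDirect = writeCached(path.join(root, "x", "y"), "page");
const blockedDeep = writeCached(path.join(root, "x", "y", "z"), "page");
console.log(relocated, blockedDirect, blockedDeep);
```

Non-colliding writes never reach the `catch` branch, which matches the stated goal of leaving their storage location unchanged.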
Problem
A URL namespace lets a node be both a leaf (the `/a` page) and an internal node (parent of `/a/b`), but a filesystem name can't be both a file and a directory. The recursive crawl hit this constantly (EEXIST/ENOTDIR/EISDIR), failing those cache writes. (Before the crash fix in #19 these also took the process down; now they just failed the job.)
Fix (reactive, at write time)
No change to where non-colliding URLs are stored — no cache migration, no content-type change:
- `writeStream` EISDIR (slot already a directory because a child was cached first): store the page at `<dest>/index.html` (the directory-index form).
- `writeStream` ENOTDIR (a legacy bare-file ancestor blocks creating the parent dir): skip gracefully and destroy the body (served live); re-caching the ancestor or clearing the host removes the blocker.
- `lookup`: new `statHit()` helper treats a directory slot as a miss and falls through to the existing `<path>/index.html` probe, so the relocated page resolves.
Tests
792 → 795 (added: directory-slot lookup, EISDIR write fallback, ENOTDIR graceful skip). Full suite green, typecheck clean.
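The lookup side of the fix can be pictured with a small sketch like the one below. Only the name `statHit()` and the `<path>/index.html` probe come from the description; the signatures, the `string | null` return convention, the `lookup` wrapper, and the handling of `ENOENT`/`ENOTDIR` as misses are assumptions for illustration and may differ from the actual implementation.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed shape: return the path on a hit, null on a miss.
function statHit(candidate: string): string | null {
  try {
    // A directory at this name means a child was cached here, so it is a miss.
    return fs.statSync(candidate).isFile() ? candidate : null;
  } catch (err) {
    const code = (err as { code?: string }).code;
    // Nothing cached, or a bare-file ancestor sits in the way: also a miss.
    if (code === "ENOENT" || code === "ENOTDIR") return null;
    throw err;
  }
}

// Hypothetical wrapper: probe the bare slot, then the directory-index form.
function lookup(slot: string): string | null {
  return statHit(slot) ?? statHit(path.join(slot, "index.html"));
}

// Demo: `/a` was relocated to `a/index.html`, while `/c` is a plain leaf.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "lookup-demo-"));
fs.mkdirSync(path.join(demoRoot, "a"));
fs.writeFileSync(path.join(demoRoot, "a", "index.html"), "page a");
fs.writeFileSync(path.join(demoRoot, "c"), "page c");

const relocatedHit = lookup(path.join(demoRoot, "a"));
const leafHit = lookup(path.join(demoRoot, "c"));
const miss = lookup(path.join(demoRoot, "missing"));
console.log({ relocatedHit, leafHit, miss });
```

Because the directory-index probe already existed, treating the directory slot as a miss is enough for the relocated page to resolve without a second lookup scheme.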
Note
Considered full canonicalization (extensionless → `index.html`) but it mistypes extensionless assets and forces a broad cache-layout migration; reactive handling reaches the same end state without those downsides.
🤖 Generated with Claude Code