fix(cache): handle file/directory path collisions reactively #20

Merged
robbiebyrd merged 1 commit into main from fix/cache-path-collisions
Jun 15, 2026
@robbiebyrd

Collaborator

Problem

A URL namespace lets a node be both a leaf (the /a page) and an internal node (parent of /a/b), but a filesystem name can't be both a file and a directory. The recursive crawl hit this constantly (EEXIST/ENOTDIR/EISDIR), failing those cache writes. (Before the crash fix in #19 these also took the process down; now they just failed the job.)
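The collision can be reproduced outside the crawler. The following is a minimal sketch, assuming a POSIX filesystem and Node's `fs` module; the temp directory and the `errnoOf` helper are hypothetical names used only for this illustration, not part of the cache code.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Run fn and return the errno code it throws (e.g. "EISDIR"), or null on success.
function errnoOf(fn: () => void): string | null {
  try {
    fn();
    return null;
  } catch (err) {
    return (err as { code?: string }).code ?? "UNKNOWN";
  }
}

// Throwaway directory standing in for a cache root (hypothetical layout).
const root = fs.mkdtempSync(path.join(os.tmpdir(), "cache-collision-"));

// Parent first: /a is cached as a bare file, then /a/b needs "a" to be a directory.
fs.writeFileSync(path.join(root, "a"), "page a");
const mkdirOverFile = errnoOf(() => fs.mkdirSync(path.join(root, "a")));
const childUnderFile = errnoOf(() =>
  fs.writeFileSync(path.join(root, "a", "b"), "page b"),
);

// Child first: /x/y makes "x" a directory, then /x wants "x" to be a file.
fs.mkdirSync(path.join(root, "x"));
fs.writeFileSync(path.join(root, "x", "y"), "page y");
const fileOverDir = errnoOf(() =>
  fs.writeFileSync(path.join(root, "x"), "page x"),
);

console.log({ mkdirOverFile, childUnderFile, fileOverDir });
fs.rmSync(root, { recursive: true, force: true });
```

On Linux the three codes are expected to be `EEXIST`, `ENOTDIR`, and `EISDIR` respectively; other platforms may report different codes for the same situations.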

Fix (reactive, at write time)

No change to where non-colliding URLs are stored — no cache migration, no content-type change:

  • writeStream EISDIR (slot already a directory because a child was cached first): store the page at <dest>/index.html (the directory-index form).
  • writeStream ENOTDIR (a legacy bare-file ancestor blocks creating the parent dir): skip the write gracefully and destroy the body, so the page is served live on demand; re-caching the ancestor or clearing the host removes the blocker.
  • lookup: new statHit() helper treats a directory slot as a miss and falls through to the existing <path>/index.html probe, so the relocated page resolves.
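The three behaviours above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `writeCached`, `openSlot`, `lookup`, `demo`, and the `WriteResult` type are hypothetical names, only `statHit` is named in this PR and its body here is a guess, and the sketch opens the file before piping so the body is still intact when the fallback runs. It also treats `EEXIST` from the recursive `mkdir` as a blocked ancestor, which is an assumption about how Node reports that case.

```typescript
import { mkdir, mkdtemp, open, stat } from "node:fs/promises";
import { tmpdir } from "node:os";
import * as path from "node:path";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";

type WriteResult = "stored" | "stored-as-index" | "skipped";

function codeOf(err: unknown): string | undefined {
  return (err as { code?: string } | null)?.code;
}

// Open the slot for writing; if it is already a directory (EISDIR),
// fall back to the directory-index form <dest>/index.html.
async function openSlot(dest: string) {
  try {
    return { handle: await open(dest, "w"), relocated: false };
  } catch (err) {
    if (codeOf(err) !== "EISDIR") throw err;
    return { handle: await open(path.join(dest, "index.html"), "w"), relocated: true };
  }
}

async function writeCached(dest: string, body: Readable): Promise<WriteResult> {
  let slot: Awaited<ReturnType<typeof openSlot>>;
  try {
    await mkdir(path.dirname(dest), { recursive: true });
    slot = await openSlot(dest);
  } catch (err) {
    const code = codeOf(err);
    // A bare-file ancestor blocks the parent directory: skip, drop the body.
    if (code === "ENOTDIR" || code === "EEXIST") {
      body.destroy();
      return "skipped";
    }
    throw err;
  }
  await pipeline(body, slot.handle.createWriteStream());
  return slot.relocated ? "stored-as-index" : "stored";
}

// A slot is a hit only if it is a regular file; a directory reads as a miss.
async function statHit(p: string): Promise<boolean> {
  try {
    return (await stat(p)).isFile();
  } catch {
    return false; // any stat failure is treated as a miss in this sketch
  }
}

// Probe the slot itself, then the directory-index form.
async function lookup(slot: string): Promise<string | null> {
  if (await statHit(slot)) return slot;
  const index = path.join(slot, "index.html");
  return (await statHit(index)) ? index : null;
}

// Exercises both collision directions in a throwaway temp directory.
async function demo(): Promise<string[]> {
  const root = await mkdtemp(path.join(tmpdir(), "cache-demo-"));
  const out: string[] = [];
  // Child first: /a/b makes "a" a directory, so the /a write relocates.
  out.push(await writeCached(path.join(root, "a", "b"), Readable.from(["child"])));
  out.push(await writeCached(path.join(root, "a"), Readable.from(["parent"])));
  out.push(path.relative(root, (await lookup(path.join(root, "a"))) ?? "miss"));
  // Parent first: bare file "x" blocks the directory needed for /x/y.
  out.push(await writeCached(path.join(root, "x"), Readable.from(["parent"])));
  out.push(await writeCached(path.join(root, "x", "y"), Readable.from(["child"])));
  return out;
}
```

Calling `demo()` covers the child-first case (the parent page lands at `a/index.html` and the lookup finds it there) and the parent-first case (the child write is skipped rather than failing).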

Tests

792 → 795 (added: directory-slot lookup, EISDIR write fallback, ENOTDIR graceful skip). Full suite green, typecheck clean.

Note

Considered full canonicalization (extensionless → index.html) but it mistypes extensionless assets and forces a broad cache-layout migration; reactive handling reaches the same end state without those downsides.
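To illustrate the mistyping concern, here is a small sketch. It assumes the content type is inferred from the stored file's extension, which this PR does not state explicitly; `canonicalize`, `guessType`, and the type table are hypothetical.

```typescript
import * as path from "node:path";

// Hypothetical extension-to-type table; the real cache's typing rules are an assumption.
const TYPES: Record<string, string> = {
  ".html": "text/html",
  ".css": "text/css",
  ".js": "text/javascript",
};

function guessType(storedPath: string): string {
  return TYPES[path.posix.extname(storedPath)] ?? "application/octet-stream";
}

// Full canonicalization: every extensionless URL path is stored as <path>/index.html.
function canonicalize(urlPath: string): string {
  return path.posix.extname(urlPath) === ""
    ? path.posix.join(urlPath, "index.html")
    : urlPath;
}

// An extensionless asset such as a LICENSE file would be stored under
// index.html and therefore typed as HTML.
const asset = "/fonts/LICENSE";
console.log(canonicalize(asset), guessType(canonicalize(asset)));
```

Under that assumption, canonicalizing every extensionless path would label non-HTML assets as `text/html`, while the reactive approach only relocates a page when a collision actually occurs.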

🤖 Generated with Claude Code

… time

A URL namespace lets a node be both a leaf (the `/a` page) and an internal node
(parent of `/a/b`), but a filesystem name can't be both a file and a directory.
The recursive crawl hit this constantly (EEXIST/ENOTDIR/EISDIR), failing those
writes. Handle both directions where they occur, without changing where
non-colliding URLs are stored (no cache migration, no content-type change):

- writeStream EISDIR (slot already a directory because a child was cached
  first): store the page at `<dest>/index.html` — the directory-index form.
- writeStream ENOTDIR (a legacy bare-file ancestor blocks creating the parent
  directory): skip the write gracefully and destroy the body (served live on
  demand); re-caching the ancestor or clearing the host removes the blocker.
- lookup: factor a statHit() helper that treats a directory slot as a miss (a
  child was cached at this name) and falls through to the existing
  `<path>/index.html` directory-index probe, so the relocated page resolves.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@robbiebyrd robbiebyrd merged commit 1126856 into main Jun 15, 2026
1 check passed