Add incremental build support and watch mode to leadtype generate#119
Add incremental build support and watch mode to leadtype generate#119BurnedChris wants to merge 1 commit into
leadtype generate#119Conversation
- Introduced `--watch` option for `leadtype generate`, enabling automatic rebuilds on changes to documentation sources and configuration files. - Implemented incremental builds by caching file inputs and skipping unchanged files, improving performance during repeated runs. - Updated the library API to support a new `cache` option in `convertAllMdx`, allowing for dependency tracking and cache invalidation. - Enhanced documentation to reflect these new features, including detailed explanations of the caching mechanism and watch mode behavior. These changes significantly enhance the efficiency and usability of the `leadtype` CLI for documentation generation.
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis PR adds ChangesWatch mode and incremental builds
Estimated code review effort: 4 (Complex) | ~60 minutes Sequence Diagram(s)sequenceDiagram
participant User
participant CLI as generate CLI
participant Watcher as watchInputs
participant Convert as convertAllMdx
participant Cache as Cache Manifest
User->>CLI: leadtype generate --watch
CLI->>Convert: initial conversion run
Convert->>Cache: load manifest, hash sources
Cache-->>Convert: cached/stale entries
Convert->>Convert: skip cached, rebuild changed, prune stale
Convert->>Cache: save updated manifest
CLI->>Watcher: arm watcher(watchPaths)
Watcher-->>CLI: debounced change event
CLI->>Convert: rerun conversion
Convert->>Cache: revalidate & update manifest
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
✅ No new issues found.
Reviewed changes — adds --watch/-w and default incremental builds to leadtype generate, with a content-hash manifest cache, dependency tracking for <include> targets and type-table TypeScript sources, and stale-output pruning.
- CLI flags & docs —
parseGenerateArgsgainswatchandforce; help text anddocs/reference/cli.mdxdocument incremental builds and watch mode;.changesetmarks aminorbump. - Incremental cache (
convert/incremental.ts) — per-file manifest undernode_modules/.cache/leadtype/, recordingsourceHash, per-dependency hashes, serialized gitenrichment, and the output path. Every manifest miss (missing, corrupt, version or fingerprint mismatch) degrades to a full rebuild rather than erroring; the save path writes-then-renames to stay crash-safe. convertAllMdxcaching — skips a file only when its source hash, git enrichment, and all dependency hashes match and the prior output still exists;pruneStaleOutputsdeletes outputs of deleted sources, scoped strictly insideoutDir.- Dependency protocol — reuses the
_compiler.addDependencyfile-data channel (already emitted by the include plugin) and extendsTypeTableOptionswithonDependencyso type-table TypeScript reads invalidate the cache too. - Cache fingerprint —
resolveConvertCachekeys the cache on leadtype version, docs-config content hash, and conversion-relevant flags; caching is skipped entirely when there is nonode_modules/. - Watch loop (
cli/watch.ts,runGenerateWatch) — debounced recursivefs.watch(fine givenengines >= 22), ignoresoutDir/.git/node_modules/.DS_Store, serializes reruns, re-arms when a config edit changes the docs-source set, and stops cleanly onSIGINT/SIGTERM.
I verified the load-bearing seam: downstream site artifacts (search index, llms.txt, agents) read the converted markdown from outDir on disk rather than from convertAllMdx's return value, so a cached skip that leaves the output in place stays consistent — no stale-artifact bug. Dependency paths are recorded as real source paths (mapped back through gitSourcePath), so they survive the temp-mirror churn between runs. Test coverage is strong. The points below are informational only; nothing blocks merge.
ℹ️ --force is sticky across every watch rebuild
In watch mode each rebuild re-invokes generation with the original parsed args, so leadtype generate --watch --force reconverts every file on every change for the life of the session rather than forcing once and then rebuilding incrementally. That is defensible (force means force), but a user reaching for "clean start, then fast iteration" gets a full rebuild on each keystroke-triggered change instead — worth a deliberate decision.
Technical details
# `--force` is sticky across every watch rebuild
## Affected sites
- `packages/leadtype/src/cli/generate.ts` — `runGenerateWatch` → `rerun` calls `executeGenerate(args, io)` with the same `args.force` on every change; `resolveConvertCache` sets `force: true` each run, so `previousEntries` is always `{}`.
## Required outcome
- Decide whether `--force` should apply only to the initial watch build (then fall back to incremental) or intentionally to every rebuild. Current behavior is the latter.
## Suggested approach (optional)
- If one-shot force is desired: clear `args.force` (or pass a per-run flag) after the first `executeGenerate` in `runGenerateWatch`, leaving subsequent reruns incremental.
## Open questions for the human
- Is "force every rebuild in watch" intended, or should force be first-build-only? The CLI docs currently say "restart with `--force`", which reads as one-shot.ℹ️ No incremental test for type-table source invalidation
The changeset calls out type-table TypeScript sources as a newly tracked dependency, and the plugin now reports them through onDependency. The incremental test suite covers source edits, include-target edits, pruning, force, fingerprint, and corrupt-manifest recovery — but not the type-table path, so a regression in getFileDependencyReporter or the extractTypeFromFile dependency reporting would pass CI silently.
Technical details
# No incremental test for type-table source invalidation
## Affected sites
- `packages/leadtype/src/convert/incremental.test.ts` — has "rebuilds a file when its include target changes" but no equivalent for a type-table `.ts` source change.
- `packages/leadtype/src/markdown/plugins/type-table.ts:236-254`, `798-826` — the new dependency-reporting code paths (fresh extraction and cached-extraction branch) are untested for cache invalidation.
## Required outcome
- Add an incremental test asserting that editing a `.ts` file consumed by a type table rebuilds the page that renders it (and leaves an unrelated page cached), mirroring the existing include-target test.ℹ️ Nitpicks
packages/leadtype/src/convert/incremental.ts:125—saveConvertCacheManifestuses a fixed temp path${filePath}.tmp; two concurrentgenerateprocesses on one project (e.g. a--watchsession plus a manual run) could race on it. Impact is bounded (a corrupt manifest just triggers a rebuild), but a pid- ormkdtemp-suffixed temp name would remove the race.
Claude Opus | 𝕏
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/leadtype/src/cli/generate.ts (1)
2629-2722: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick winWatch session can't detect a docs directory created after startup.
When
executeGeneratefails because a configured docs directory doesn't exist yet, the returnedwatchPathsstill includes that nonexistent path.watchInputs(seewatch.ts) silently skips watching paths that don't exist at arm-time, so if the user subsequently creates the missing docs directory, no change event fires for it — only edits to the config file (if it resolved) would trigger a re-arm. The watch session is then stuck requiring a manual restart, with no in-app hint.🩹 Suggested mitigation: fall back to watching the nearest existing ancestor
+function nearestExistingAncestor(target: string): string { + let dir = target; + while (!existsSync(dir)) { + const parent = path.dirname(dir); + if (parent === dir) { + return dir; + } + dir = parent; + } + return dir; +}Use
nearestExistingAncestor(candidate)when buildingwatchPathsfor any docs directory that doesn't yet exist, so its parent's creation events are observed and can trigger a re-check.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/leadtype/src/cli/generate.ts` around lines 2629 - 2722, The watch path selection in executeGenerate returns a nonexistent docs directory directly, so watchInputs never arms a watcher for it and later creation is missed. Update executeGenerate to detect missing docsDirs and replace each missing path in watchPaths with nearestExistingAncestor(candidate) so the parent directory creation can retrigger the generate flow. Keep the change localized around the docs directory existence check and watchPaths assembly, using nearestExistingAncestor and executeGenerate as the key symbols.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/leadtype/src/cli/generate.ts`:
- Around line 2432-2483: The incremental cache is only enabled when
resolveConvertCache finds a node_modules directly under srcDir, which disables
caching for hoisted workspace layouts. Update resolveConvertCache in generate.ts
to locate the nearest usable node_modules by walking up from opts.srcDir
(Node-style resolution) instead of checking only the immediate child, while
keeping the existing cacheId/fingerprint logic and the ConvertCacheOptions file
path generation intact.
- Around line 2547-2607: The watch rerun path in runGenerateWatch keeps reusing
the same args object, so --force remains enabled for every rebuild and bypasses
the cache indefinitely. Update the watch flow so the force flag is only honored
for the initial executeGenerate call, then clear or override args.force before
subsequent reruns triggered by scheduleRerun/rerun. Make the change in
runGenerateWatch and keep the behavior aligned with resolveConvertCache so
incremental watch rebuilds can use the cache after the first pass.
In `@packages/leadtype/src/convert/convert.ts`:
- Around line 1357-1372: The dependency hashing in convert() is still done
sequentially inside the cacheState.nextEntries block, which unnecessarily
serializes I/O for dependencyPaths. Update the hashing logic near deps and
hashFileCached to mirror depsUnchanged by collecting all dependency hash
promises and awaiting them together with Promise.all, then build the deps record
from the resolved results while preserving the empty-string fallback for
unreadable or missing files.
- Around line 1270-1284: The cache manifest handling in convert.ts is dropping
prune history whenever config.cache.force is set or loadConvertCacheManifest
returns null on a fingerprint mismatch, which causes stale outputs to survive
forever. Update the cacheState setup so prune candidates are loaded
independently of force/fingerprint invalidation, and keep a separate history
source for pruneStaleOutputs instead of reusing previousEntries for both reuse
checks and pruning. Add a regression test in incremental.test.ts covering a
deleted source whose output is pruned across a force run or
fingerprint-invalidated run.
In `@packages/leadtype/src/convert/incremental.test.ts`:
- Around line 1-246: Add a test in incremental.test.ts that covers pruning after
a force run or fingerprint change using createCacheProject and runConvert. The
issue is that the current prune test only exercises the normal cache path, so it
misses the case where cacheState.previousEntries is cleared by force/fingerprint
invalidation and deleted outputs may not be removed. Reuse the existing helpers
(runConvert, tamper, existsSync, rm) to delete a source after an initial
conversion, rerun with force or a different fingerprint, and assert the deleted
file’s output is pruned while the remaining output stays intact.
In `@packages/leadtype/src/convert/incremental.ts`:
- Around line 79-116: The manifest validator in isManifestShape only checks that
entries exists as an object, so malformed ConvertCacheEntry data can slip
through and later crash pruneStaleOutputs when it reads entry.output. Tighten
isManifestShape in incremental.ts to validate every entry’s shape, including
required string/object fields such as sourceHash, deps, enrichment, and output,
and only return true when all entries match ConvertCacheEntry. Keep
loadConvertCacheManifest’s behavior unchanged so invalid manifests still fall
back to null and a full rebuild.
---
Outside diff comments:
In `@packages/leadtype/src/cli/generate.ts`:
- Around line 2629-2722: The watch path selection in executeGenerate returns a
nonexistent docs directory directly, so watchInputs never arms a watcher for it
and later creation is missed. Update executeGenerate to detect missing docsDirs
and replace each missing path in watchPaths with
nearestExistingAncestor(candidate) so the parent directory creation can
retrigger the generate flow. Keep the change localized around the docs directory
existence check and watchPaths assembly, using nearestExistingAncestor and
executeGenerate as the key symbols.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 487a745a-9c1b-47a1-b20d-6a660a72251e
📒 Files selected for processing (9)
.changeset/watch-mode-incremental-builds.mddocs/reference/cli.mdxpackages/leadtype/src/cli/generate.tspackages/leadtype/src/cli/watch.test.tspackages/leadtype/src/cli/watch.tspackages/leadtype/src/convert/convert.tspackages/leadtype/src/convert/incremental.test.tspackages/leadtype/src/convert/incremental.tspackages/leadtype/src/markdown/plugins/type-table.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{ts,tsx}: Use explicit types for function parameters and return values when they enhance clarity
Preferunknownoveranywhen the type is genuinely unknown
Use const assertions (as const) for immutable values and literal types
Leverage TypeScript's type narrowing instead of type assertions
Files:
packages/leadtype/src/cli/watch.test.tspackages/leadtype/src/cli/watch.tspackages/leadtype/src/convert/incremental.test.tspackages/leadtype/src/convert/incremental.tspackages/leadtype/src/markdown/plugins/type-table.tspackages/leadtype/src/convert/convert.tspackages/leadtype/src/cli/generate.ts
**/*.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{js,ts,jsx,tsx}: Use meaningful variable names instead of magic numbers - extract constants with descriptive names
Use arrow functions for callbacks and short functions
Preferfor...ofloops over.forEach()and indexedforloops
Use optional chaining (?.) and nullish coalescing (??) for safer property access
Prefer template literals over string concatenation
Use destructuring for object and array assignments
Useconstby default,letonly when reassignment is needed, nevervar
Alwaysawaitpromises in async functions - don't forget to use the return value
Useasync/awaitsyntax instead of promise chains for better readability
Handle errors appropriately in async code with try-catch blocks
Don't use async functions as Promise executors
Removeconsole.log,debugger, andalertstatements from production code
ThrowErrorobjects with descriptive messages, not strings or other values
Usetry-catchblocks meaningfully - don't catch errors just to rethrow them
Prefer early returns over nested conditionals for error cases
Extract complex conditions into well-named boolean variables
Use early returns to reduce nesting
Prefer simple conditionals over nested ternary operators
Don't useeval()or assign directly todocument.cookie
Avoid spread syntax in accumulators within loops
Use top-level regex literals instead of creating them in loops
Prefer specific imports over namespace imports
Use descriptive names for functions, variables, and types for meaningful naming
Add comments for complex logic, but prefer self-documenting code
Files:
packages/leadtype/src/cli/watch.test.tspackages/leadtype/src/cli/watch.tspackages/leadtype/src/convert/incremental.test.tspackages/leadtype/src/convert/incremental.tspackages/leadtype/src/markdown/plugins/type-table.tspackages/leadtype/src/convert/convert.tspackages/leadtype/src/cli/generate.ts
**/*.{test,spec}.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{test,spec}.{js,ts,jsx,tsx}: Write assertions insideit()ortest()blocks
Avoid done callbacks in async tests - use async/await instead
Don't use.onlyor.skipin committed code
Keep test suites reasonably flat - avoid excessivedescribenesting
Files:
packages/leadtype/src/cli/watch.test.tspackages/leadtype/src/convert/incremental.test.ts
🧠 Learnings (1)
📚 Learning: 2026-06-09T18:30:08.038Z
Learnt from: KayleeWilliams
Repo: inthhq/leadtype PR: 97
File: .changeset/search-prototype-safety-and-scaling.md:5-5
Timestamp: 2026-06-09T18:30:08.038Z
Learning: In this repo, `.changeset/*.md` files must not start the body with an H1/first-line heading (`#`) immediately after the YAML frontmatter. The changesets tool inlines the body as bullet entries into `CHANGELOG.md` during release, and a leading `#` heading would break the generated changelog format. As a result, MD041 (`first-line-heading`) warnings for files under `.changeset/` are expected false positives and should be ignored.
Applied to files:
.changeset/watch-mode-incremental-builds.md
🪛 ast-grep (0.44.0)
packages/leadtype/src/convert/convert.ts
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
🪛 markdownlint-cli2 (0.22.1)
.changeset/watch-mode-incremental-builds.md
[warning] 5-5: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
🔇 Additional comments (13)
.changeset/watch-mode-incremental-builds.md (1)
1-12: 📐 Maintainability & Code QualityMD041 warning is an expected false positive for this repo's changeset files.
Based on learnings,
.changeset/*.mdfiles intentionally must not start with an H1 heading immediately after frontmatter, since the changesets tool inlines the body as bullet entries intoCHANGELOG.md; a leading#would break that format. The static-analysis MD041 hint here should be ignored.Source: Learnings
packages/leadtype/src/cli/watch.ts (1)
1-133: LGTM!packages/leadtype/src/cli/watch.test.ts (1)
1-158: LGTM!packages/leadtype/src/cli/generate.ts (5)
1-11: LGTM!Also applies to: 77-77
183-186: LGTM!Also applies to: 354-366, 396-397, 445-448
2401-2413: LGTM!
2536-2540: LGTM!Also applies to: 2609-2634, 2694-2699, 2721-2721, 2779-2789, 3161-3165
2414-2430: 🗄️ Data Integrity & IntegrationDrop the self-reference warning for
leadtype/package.json.packages/leadtype/package.jsonalready exports"./package.json", socreateRequire(import.meta.url)can resolve it and this fallback won’t be triggered by package exports alone.> Likely an incorrect or invalid review comment.docs/reference/cli.mdx (1)
60-79: LGTM!packages/leadtype/src/convert/convert.ts (2)
1458-1497: 🎯 Functional CorrectnessCovered by the pruning-history comment above.
The
entry.outputtraversal-guard here is solid, but this function's inputs need the fix proposed on Lines 1270-1284 (using history-preserving prune candidates and per-entry validation) to avoid silently losing prune opportunities and to avoid crashing on malformed manifest entries.
3-3: LGTM!Also applies to: 37-47, 347-352, 374-379, 842-852, 1286-1356, 1401-1452
packages/leadtype/src/convert/incremental.ts (1)
1-77: LGTM!Also applies to: 118-147
packages/leadtype/src/markdown/plugins/type-table.ts (1)
192-212: LGTM!Also applies to: 236-255, 910-932, 1091-1113
| /** | ||
| * Build the incremental-cache options for this run, or undefined when there | ||
| * is no conventional cache home (`node_modules/`) to write into. | ||
| * | ||
| * The fingerprint invalidates the whole cache when anything the per-file | ||
| * checks can't see changes: the leadtype version, the docs config file | ||
| * (transformers/flatteners are functions, so the config's content hash | ||
| * stands in for them), or flags that alter conversion output. | ||
| */ | ||
| async function resolveConvertCache(opts: { | ||
| srcDir: string; | ||
| outDir: string; | ||
| args: GenerateArgs; | ||
| configPath?: string; | ||
| }): Promise<ConvertCacheOptions | undefined> { | ||
| const nodeModulesDir = path.join(opts.srcDir, "node_modules"); | ||
| if (!existsSync(nodeModulesDir)) { | ||
| return; | ||
| } | ||
| const cacheId = hashText( | ||
| JSON.stringify({ | ||
| outDir: opts.outDir, | ||
| docsDirs: opts.args.docsDirs, | ||
| include: opts.args.include, | ||
| exclude: opts.args.exclude, | ||
| bundle: opts.args.bundle, | ||
| }) | ||
| ).slice(0, 16); | ||
| let configHash = "no-config"; | ||
| if (opts.configPath) { | ||
| try { | ||
| configHash = hashText(await readFile(opts.configPath, "utf8")); | ||
| } catch { | ||
| configHash = "unreadable-config"; | ||
| } | ||
| } | ||
| const fingerprint = [ | ||
| readLeadtypeVersion(), | ||
| configHash, | ||
| JSON.stringify({ enrichGit: opts.args.enrichGit }), | ||
| ].join("|"); | ||
| return { | ||
| file: path.join( | ||
| nodeModulesDir, | ||
| ".cache", | ||
| "leadtype", | ||
| `generate-${cacheId}.json` | ||
| ), | ||
| fingerprint, | ||
| ...(opts.args.force ? { force: true } : {}), | ||
| }; | ||
| } |
There was a problem hiding this comment.
🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win
Cache silently disabled for hoisted monorepo node_modules layouts.
resolveConvertCache only checks existsSync(path.join(opts.srcDir, "node_modules")). In workspace setups where dependencies are hoisted and --src points at a subpackage without its own node_modules, incremental caching is disabled entirely, even though a usable node_modules exists at an ancestor directory. This is documented as expected behavior in cli.mdx, but a short upward walk (similar to Node's own module resolution) would extend the feature's coverage to more repo layouts.
♻️ Suggested upward-walk helper
+function findNearestNodeModules(startDir: string): string | undefined {
+ let dir = startDir;
+ for (;;) {
+ const candidate = path.join(dir, "node_modules");
+ if (existsSync(candidate)) {
+ return candidate;
+ }
+ const parent = path.dirname(dir);
+ if (parent === dir) {
+ return undefined;
+ }
+ dir = parent;
+ }
+}
+
async function resolveConvertCache(opts: {
srcDir: string;
outDir: string;
args: GenerateArgs;
configPath?: string;
}): Promise<ConvertCacheOptions | undefined> {
- const nodeModulesDir = path.join(opts.srcDir, "node_modules");
- if (!existsSync(nodeModulesDir)) {
+ const nodeModulesDir = findNearestNodeModules(opts.srcDir);
+ if (!nodeModulesDir) {
return;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /** | |
| * Build the incremental-cache options for this run, or undefined when there | |
| * is no conventional cache home (`node_modules/`) to write into. | |
| * | |
| * The fingerprint invalidates the whole cache when anything the per-file | |
| * checks can't see changes: the leadtype version, the docs config file | |
| * (transformers/flatteners are functions, so the config's content hash | |
| * stands in for them), or flags that alter conversion output. | |
| */ | |
| async function resolveConvertCache(opts: { | |
| srcDir: string; | |
| outDir: string; | |
| args: GenerateArgs; | |
| configPath?: string; | |
| }): Promise<ConvertCacheOptions | undefined> { | |
| const nodeModulesDir = path.join(opts.srcDir, "node_modules"); | |
| if (!existsSync(nodeModulesDir)) { | |
| return; | |
| } | |
| const cacheId = hashText( | |
| JSON.stringify({ | |
| outDir: opts.outDir, | |
| docsDirs: opts.args.docsDirs, | |
| include: opts.args.include, | |
| exclude: opts.args.exclude, | |
| bundle: opts.args.bundle, | |
| }) | |
| ).slice(0, 16); | |
| let configHash = "no-config"; | |
| if (opts.configPath) { | |
| try { | |
| configHash = hashText(await readFile(opts.configPath, "utf8")); | |
| } catch { | |
| configHash = "unreadable-config"; | |
| } | |
| } | |
| const fingerprint = [ | |
| readLeadtypeVersion(), | |
| configHash, | |
| JSON.stringify({ enrichGit: opts.args.enrichGit }), | |
| ].join("|"); | |
| return { | |
| file: path.join( | |
| nodeModulesDir, | |
| ".cache", | |
| "leadtype", | |
| `generate-${cacheId}.json` | |
| ), | |
| fingerprint, | |
| ...(opts.args.force ? { force: true } : {}), | |
| }; | |
| } | |
| /** | |
| * Build the incremental-cache options for this run, or undefined when there | |
| * is no conventional cache home (`node_modules/`) to write into. | |
| * | |
| * The fingerprint invalidates the whole cache when anything the per-file | |
| * checks can't see changes: the leadtype version, the docs config file | |
| * (transformers/flatteners are functions, so the config's content hash | |
| * stands in for them), or flags that alter conversion output. | |
| */ | |
| function findNearestNodeModules(startDir: string): string | undefined { | |
| let dir = startDir; | |
| for (;;) { | |
| const candidate = path.join(dir, "node_modules"); | |
| if (existsSync(candidate)) { | |
| return candidate; | |
| } | |
| const parent = path.dirname(dir); | |
| if (parent === dir) { | |
| return undefined; | |
| } | |
| dir = parent; | |
| } | |
| } | |
| async function resolveConvertCache(opts: { | |
| srcDir: string; | |
| outDir: string; | |
| args: GenerateArgs; | |
| configPath?: string; | |
| }): Promise<ConvertCacheOptions | undefined> { | |
| const nodeModulesDir = findNearestNodeModules(opts.srcDir); | |
| if (!nodeModulesDir) { | |
| return; | |
| } | |
| const cacheId = hashText( | |
| JSON.stringify({ | |
| outDir: opts.outDir, | |
| docsDirs: opts.args.docsDirs, | |
| include: opts.args.include, | |
| exclude: opts.args.exclude, | |
| bundle: opts.args.bundle, | |
| }) | |
| ).slice(0, 16); | |
| let configHash = "no-config"; | |
| if (opts.configPath) { | |
| try { | |
| configHash = hashText(await readFile(opts.configPath, "utf8")); | |
| } catch { | |
| configHash = "unreadable-config"; | |
| } | |
| } | |
| const fingerprint = [ | |
| readLeadtypeVersion(), | |
| configHash, | |
| JSON.stringify({ enrichGit: opts.args.enrichGit }), | |
| ].join("|"); | |
| return { | |
| file: path.join( | |
| nodeModulesDir, | |
| ".cache", | |
| "leadtype", | |
| `generate-${cacheId}.json` | |
| ), | |
| fingerprint, | |
| ...(opts.args.force ? { force: true } : {}), | |
| }; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/leadtype/src/cli/generate.ts` around lines 2432 - 2483, The
incremental cache is only enabled when resolveConvertCache finds a node_modules
directly under srcDir, which disables caching for hoisted workspace layouts.
Update resolveConvertCache in generate.ts to locate the nearest usable
node_modules by walking up from opts.srcDir (Node-style resolution) instead of
checking only the immediate child, while keeping the existing
cacheId/fingerprint logic and the ConvertCacheOptions file path generation
intact.
| async function runGenerateWatch( | ||
| args: GenerateArgs, | ||
| io: GenerateIo | ||
| ): Promise<number> { | ||
| const outDir = path.resolve(args.outDir); | ||
| let outcome = await executeGenerate(args, io); | ||
| let watcher: ReturnType<typeof watchInputs> | undefined; | ||
| let running = false; | ||
| let queued = false; | ||
|
|
||
| const armWatcher = (): ReturnType<typeof watchInputs> => | ||
| watchInputs({ | ||
| paths: outcome.watchPaths, | ||
| ignorePaths: [outDir], | ||
| onChange: (changedPaths) => { | ||
| logger.info({ | ||
| human: { | ||
| message: `change detected (${changedPaths.length} path${changedPaths.length === 1 ? "" : "s"}) — regenerating`, | ||
| }, | ||
| json: { | ||
| event: "generate.watch_rebuild", | ||
| fields: { changed: changedPaths.slice(0, 20) }, | ||
| }, | ||
| }); | ||
| scheduleRerun(); | ||
| }, | ||
| }); | ||
|
|
||
| const scheduleRerun = (): void => { | ||
| rerun().catch((error: unknown) => { | ||
| const message = error instanceof Error ? error.message : String(error); | ||
| logger.error({ | ||
| human: { message: `watch rebuild failed: ${message}` }, | ||
| json: { event: "generate.watch_error", fields: { message } }, | ||
| }); | ||
| }); | ||
| }; | ||
|
|
||
| const rerun = async (): Promise<void> => { | ||
| if (running) { | ||
| queued = true; | ||
| return; | ||
| } | ||
| running = true; | ||
| try { | ||
| const previousWatchPaths = outcome.watchPaths.join("\0"); | ||
| outcome = await executeGenerate(args, io); | ||
| // A config edit can add or remove docs sources — re-arm so the new | ||
| // set is observed. | ||
| if (outcome.watchPaths.join("\0") !== previousWatchPaths) { | ||
| watcher?.close(); | ||
| watcher = armWatcher(); | ||
| } | ||
| } finally { | ||
| running = false; | ||
| if (queued) { | ||
| queued = false; | ||
| scheduleRerun(); | ||
| } | ||
| } | ||
| }; |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
--force bypasses the cache on every watch-triggered rebuild, not just the initial run.
args is never mutated between reruns, so combining --watch --force causes resolveConvertCache to set force: true on every single rerun triggered by scheduleRerun, defeating the fast incremental-rebuild benefit that watch mode is meant to provide after the first build. Consider clearing args.force after the first successful run, or documenting this combination's behavior explicitly in GENERATE_USAGE/cli.mdx.
🩹 Suggested one-line fix
const outDir = path.resolve(args.outDir);
let outcome = await executeGenerate(args, io);
+ // Only force the very first build; watch-triggered rebuilds should stay
+ // incremental even when the session started with --force.
+ args = { ...args, force: false };
let watcher: ReturnType<typeof watchInputs> | undefined;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| async function runGenerateWatch( | |
| args: GenerateArgs, | |
| io: GenerateIo | |
| ): Promise<number> { | |
| const outDir = path.resolve(args.outDir); | |
| let outcome = await executeGenerate(args, io); | |
| let watcher: ReturnType<typeof watchInputs> | undefined; | |
| let running = false; | |
| let queued = false; | |
| const armWatcher = (): ReturnType<typeof watchInputs> => | |
| watchInputs({ | |
| paths: outcome.watchPaths, | |
| ignorePaths: [outDir], | |
| onChange: (changedPaths) => { | |
| logger.info({ | |
| human: { | |
| message: `change detected (${changedPaths.length} path${changedPaths.length === 1 ? "" : "s"}) — regenerating`, | |
| }, | |
| json: { | |
| event: "generate.watch_rebuild", | |
| fields: { changed: changedPaths.slice(0, 20) }, | |
| }, | |
| }); | |
| scheduleRerun(); | |
| }, | |
| }); | |
| const scheduleRerun = (): void => { | |
| rerun().catch((error: unknown) => { | |
| const message = error instanceof Error ? error.message : String(error); | |
| logger.error({ | |
| human: { message: `watch rebuild failed: ${message}` }, | |
| json: { event: "generate.watch_error", fields: { message } }, | |
| }); | |
| }); | |
| }; | |
| const rerun = async (): Promise<void> => { | |
| if (running) { | |
| queued = true; | |
| return; | |
| } | |
| running = true; | |
| try { | |
| const previousWatchPaths = outcome.watchPaths.join("\0"); | |
| outcome = await executeGenerate(args, io); | |
| // A config edit can add or remove docs sources — re-arm so the new | |
| // set is observed. | |
| if (outcome.watchPaths.join("\0") !== previousWatchPaths) { | |
| watcher?.close(); | |
| watcher = armWatcher(); | |
| } | |
| } finally { | |
| running = false; | |
| if (queued) { | |
| queued = false; | |
| scheduleRerun(); | |
| } | |
| } | |
| }; | |
| async function runGenerateWatch( | |
| args: GenerateArgs, | |
| io: GenerateIo | |
| ): Promise<number> { | |
| const outDir = path.resolve(args.outDir); | |
| let outcome = await executeGenerate(args, io); | |
| // Only force the very first build; watch-triggered rebuilds should stay | |
| // incremental even when the session started with --force. | |
| args = { ...args, force: false }; | |
| let watcher: ReturnType<typeof watchInputs> | undefined; | |
| let running = false; | |
| let queued = false; | |
| const armWatcher = (): ReturnType<typeof watchInputs> => | |
| watchInputs({ | |
| paths: outcome.watchPaths, | |
| ignorePaths: [outDir], | |
| onChange: (changedPaths) => { | |
| logger.info({ | |
| human: { | |
| message: `change detected (${changedPaths.length} path${changedPaths.length === 1 ? "" : "s"}) — regenerating`, | |
| }, | |
| json: { | |
| event: "generate.watch_rebuild", | |
| fields: { changed: changedPaths.slice(0, 20) }, | |
| }, | |
| }); | |
| scheduleRerun(); | |
| }, | |
| }); | |
| const scheduleRerun = (): void => { | |
| rerun().catch((error: unknown) => { | |
| const message = error instanceof Error ? error.message : String(error); | |
| logger.error({ | |
| human: { message: `watch rebuild failed: ${message}` }, | |
| json: { event: "generate.watch_error", fields: { message } }, | |
| }); | |
| }); | |
| }; | |
| const rerun = async (): Promise<void> => { | |
| if (running) { | |
| queued = true; | |
| return; | |
| } | |
| running = true; | |
| try { | |
| const previousWatchPaths = outcome.watchPaths.join("\0"); | |
| outcome = await executeGenerate(args, io); | |
| // A config edit can add or remove docs sources — re-arm so the new | |
| // set is observed. | |
| if (outcome.watchPaths.join("\0") !== previousWatchPaths) { | |
| watcher?.close(); | |
| watcher = armWatcher(); | |
| } | |
| } finally { | |
| running = false; | |
| if (queued) { | |
| queued = false; | |
| scheduleRerun(); | |
| } | |
| } | |
| }; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/leadtype/src/cli/generate.ts` around lines 2547 - 2607, The watch
rerun path in runGenerateWatch keeps reusing the same args object, so --force
remains enabled for every rebuild and bypasses the cache indefinitely. Update
the watch flow so the force flag is only honored for the initial executeGenerate
call, then clear or override args.force before subsequent reruns triggered by
scheduleRerun/rerun. Make the change in runGenerateWatch and keep the behavior
aligned with resolveConvertCache so incremental watch rebuilds can use the cache
after the first pass.
| const cacheState = config.cache | ||
| ? { | ||
| options: config.cache, | ||
| previousEntries: config.cache.force | ||
| ? {} | ||
| : (( | ||
| await loadConvertCacheManifest( | ||
| config.cache.file, | ||
| config.cache.fingerprint | ||
| ) | ||
| )?.entries ?? {}), | ||
| nextEntries: {} as Record<string, ConvertCacheEntry>, | ||
| hashCache: createFileHashCache(), | ||
| } | ||
| : undefined; |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift
Pruning history is permanently lost whenever the cache is invalidated (--force or a fingerprint change).
previousEntries is used for two different purposes: (1) deciding whether a file can be reused, and (2) knowing which previously-recorded outputs to prune via pruneStaleOutputs (called later at line ~1408 with this same previousEntries). When config.cache.force is true, or whenever loadConvertCacheManifest returns null due to a fingerprint mismatch (e.g. a leadtype version bump — see incremental.ts:109), previousEntries becomes {}.
Since nextEntries (built from only the current source files) is what gets written back to the manifest at the end of the run, any entry for a file that was deleted before this force/fingerprint-invalidated run is dropped from the manifest with no chance to prune its stale output — and that history is now gone forever, since the next manifest write no longer contains it. In practice: any leadtype version bump combined with a previously-removed doc leaves an orphaned .md file in outDir permanently.
♻️ Proposed fix: load prune candidates independently of `force`/fingerprint
+// in incremental.ts
+export async function loadConvertCacheEntriesForPrune(
+ filePath: string
+): Promise<Record<string, ConvertCacheEntry>> {
+ let raw: string;
+ try {
+ raw = await readFile(filePath, "utf8");
+ } catch {
+ return {};
+ }
+ try {
+ const parsed: unknown = JSON.parse(raw);
+ if (isManifestShape(parsed)) {
+ return parsed.entries;
+ }
+ } catch {
+ // Corrupt manifest — nothing to prune from.
+ }
+ return {};
+} const cacheState = config.cache
? {
options: config.cache,
previousEntries: config.cache.force
? {}
: ((
await loadConvertCacheManifest(
config.cache.file,
config.cache.fingerprint
)
)?.entries ?? {}),
+ pruneEntries: await loadConvertCacheEntriesForPrune(config.cache.file),
nextEntries: {} as Record<string, ConvertCacheEntry>,
hashCache: createFileHashCache(),
}
: undefined; pruned = await pruneStaleOutputs(
- cacheState.previousEntries,
+ cacheState.pruneEntries,
cacheState.nextEntries,
currentKeys,
outDir
);Also worth adding a regression test in incremental.test.ts for "prunes a deleted source's output across a force/fingerprint-invalidated run" — none of the current tests combine deletion with force/fingerprint changes.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const cacheState = config.cache | |
| ? { | |
| options: config.cache, | |
| previousEntries: config.cache.force | |
| ? {} | |
| : (( | |
| await loadConvertCacheManifest( | |
| config.cache.file, | |
| config.cache.fingerprint | |
| ) | |
| )?.entries ?? {}), | |
| nextEntries: {} as Record<string, ConvertCacheEntry>, | |
| hashCache: createFileHashCache(), | |
| } | |
| : undefined; | |
| const cacheState = config.cache | |
| ? { | |
| options: config.cache, | |
| previousEntries: config.cache.force | |
| ? {} | |
| : (( | |
| await loadConvertCacheManifest( | |
| config.cache.file, | |
| config.cache.fingerprint | |
| ) | |
| )?.entries ?? {}), | |
| pruneEntries: await loadConvertCacheEntriesForPrune(config.cache.file), | |
| nextEntries: {} as Record<string, ConvertCacheEntry>, | |
| hashCache: createFileHashCache(), | |
| } | |
| : undefined; |
🧰 Tools
🪛 ast-grep (0.44.0)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/leadtype/src/convert/convert.ts` around lines 1270 - 1284, The cache
manifest handling in convert.ts is dropping prune history whenever
config.cache.force is set or loadConvertCacheManifest returns null on a
fingerprint mismatch, which causes stale outputs to survive forever. Update the
cacheState setup so prune candidates are loaded independently of
force/fingerprint invalidation, and keep a separate history source for
pruneStaleOutputs instead of reusing previousEntries for both reuse checks and
pruning. Add a regression test in incremental.test.ts covering a deleted source
whose output is pruned across a force run or fingerprint-invalidated run.
| if (cacheState && sourceHash) { | ||
| const deps: Record<string, string> = {}; | ||
| for (const dependencyPath of dependencyPaths ?? []) { | ||
| // Unreadable/missing deps record an empty hash: the entry stays | ||
| // invalid until the file appears, so a broken reference that gets | ||
| // fixed later is picked up without --force. | ||
| deps[dependencyPath] = | ||
| (await hashFileCached(dependencyPath, cacheState.hashCache)) ?? ""; | ||
| } | ||
| cacheState.nextEntries[cacheKey] = { | ||
| sourceHash, | ||
| deps, | ||
| enrichment: enrichmentKey, | ||
| output: normalizeRelativePath(relative(outDir, outputPath)), | ||
| }; | ||
| } |
There was a problem hiding this comment.
🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win
Sequential dependency hashing — consider parallelizing like depsUnchanged.
Each dependency path is hashed one-at-a-time via a for...of + await loop, unlike depsUnchanged which hashes all deps in parallel via Promise.all. For files with many dependencies (e.g. type-table pages pulling in several TypeScript sources), this serializes I/O unnecessarily.
⚡ Proposed fix
if (cacheState && sourceHash) {
- const deps: Record<string, string> = {};
- for (const dependencyPath of dependencyPaths ?? []) {
- // Unreadable/missing deps record an empty hash: the entry stays
- // invalid until the file appears, so a broken reference that gets
- // fixed later is picked up without --force.
- deps[dependencyPath] =
- (await hashFileCached(dependencyPath, cacheState.hashCache)) ?? "";
- }
+ const deps = Object.fromEntries(
+ await Promise.all(
+ Array.from(dependencyPaths ?? [], async (dependencyPath) => [
+ dependencyPath,
+ // Unreadable/missing deps record an empty hash: the entry
+ // stays invalid until the file appears, so a broken reference
+ // that gets fixed later is picked up without --force.
+ (await hashFileCached(dependencyPath, cacheState.hashCache)) ??
+ "",
+ ])
+ )
+ );
cacheState.nextEntries[cacheKey] = {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (cacheState && sourceHash) { | |
| const deps: Record<string, string> = {}; | |
| for (const dependencyPath of dependencyPaths ?? []) { | |
| // Unreadable/missing deps record an empty hash: the entry stays | |
| // invalid until the file appears, so a broken reference that gets | |
| // fixed later is picked up without --force. | |
| deps[dependencyPath] = | |
| (await hashFileCached(dependencyPath, cacheState.hashCache)) ?? ""; | |
| } | |
| cacheState.nextEntries[cacheKey] = { | |
| sourceHash, | |
| deps, | |
| enrichment: enrichmentKey, | |
| output: normalizeRelativePath(relative(outDir, outputPath)), | |
| }; | |
| } | |
| if (cacheState && sourceHash) { | |
| const deps = Object.fromEntries( | |
| await Promise.all( | |
| Array.from(dependencyPaths ?? [], async (dependencyPath) => [ | |
| dependencyPath, | |
| // Unreadable/missing deps record an empty hash: the entry | |
| // stays invalid until the file appears, so a broken reference | |
| // that gets fixed later is picked up without --force. | |
| (await hashFileCached(dependencyPath, cacheState.hashCache)) ?? | |
| "", | |
| ]) | |
| ) | |
| ); | |
| cacheState.nextEntries[cacheKey] = { | |
| sourceHash, | |
| deps, | |
| enrichment: enrichmentKey, | |
| output: normalizeRelativePath(relative(outDir, outputPath)), | |
| }; | |
| } |
🧰 Tools
🪛 ast-grep (0.44.0)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { execFile } from "node:child_process";
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/leadtype/src/convert/convert.ts` around lines 1357 - 1372, The
dependency hashing in convert() is still done sequentially inside the
cacheState.nextEntries block, which unnecessarily serializes I/O for
dependencyPaths. Update the hashing logic near deps and hashFileCached to mirror
depsUnchanged by collecting all dependency hash promises and awaiting them
together with Promise.all, then build the deps record from the resolved results
while preserving the empty-string fallback for unreadable or missing files.
| import { existsSync } from "node:fs"; | ||
| import { mkdir, mkdtemp, readFile, rm, writeFile } from "node:fs/promises"; | ||
| import { tmpdir } from "node:os"; | ||
| import path from "node:path"; | ||
| import { afterEach, describe, expect, it } from "vitest"; | ||
| import { includeMarkdown } from "../markdown"; | ||
| import { convertAllMdx } from "./convert"; | ||
| import { loadConvertCacheManifest } from "./incremental"; | ||
|
|
||
| const tempDirs: string[] = []; | ||
| const TAMPER_MARKER = "<!-- tampered by test -->"; | ||
|
|
||
| afterEach(async () => { | ||
| await Promise.all( | ||
| tempDirs.splice(0).map(async (dir) => { | ||
| await rm(dir, { recursive: true, force: true }); | ||
| }) | ||
| ); | ||
| }); | ||
|
|
||
| type CacheProject = { | ||
| root: string; | ||
| srcDir: string; | ||
| outDir: string; | ||
| cacheFile: string; | ||
| }; | ||
|
|
||
| async function createCacheProject(): Promise<CacheProject> { | ||
| const root = await mkdtemp(path.join(tmpdir(), "leadtype-incremental-")); | ||
| tempDirs.push(root); | ||
| const srcDir = path.join(root, "docs"); | ||
| const outDir = path.join(root, "out"); | ||
| await mkdir(srcDir, { recursive: true }); | ||
| return { | ||
| root, | ||
| srcDir, | ||
| outDir, | ||
| cacheFile: path.join(root, "cache", "convert.json"), | ||
| }; | ||
| } | ||
|
|
||
| function runConvert( | ||
| project: CacheProject, | ||
| overrides: { fingerprint?: string; force?: boolean } = {} | ||
| ): Promise<void> { | ||
| return convertAllMdx({ | ||
| srcDir: project.srcDir, | ||
| outDir: project.outDir, | ||
| markdownTransforms: [includeMarkdown], | ||
| cache: { | ||
| file: project.cacheFile, | ||
| fingerprint: overrides.fingerprint ?? "test-fingerprint", | ||
| ...(overrides.force ? { force: true } : {}), | ||
| }, | ||
| }); | ||
| } | ||
|
|
||
| async function tamper(outputPath: string): Promise<void> { | ||
| const current = await readFile(outputPath, "utf8"); | ||
| await writeFile(outputPath, `${current}\n${TAMPER_MARKER}\n`); | ||
| } | ||
|
|
||
| async function isTampered(outputPath: string): Promise<boolean> { | ||
| const current = await readFile(outputPath, "utf8"); | ||
| return current.includes(TAMPER_MARKER); | ||
| } | ||
|
|
||
| describe("convertAllMdx incremental cache", () => { | ||
| it("skips unchanged files on the second run", async () => { | ||
| const project = await createCacheProject(); | ||
| await writeFile( | ||
| path.join(project.srcDir, "alpha.mdx"), | ||
| "---\ntitle: Alpha\n---\n\n# Alpha\n" | ||
| ); | ||
| await writeFile( | ||
| path.join(project.srcDir, "beta.mdx"), | ||
| "---\ntitle: Beta\n---\n\n# Beta\n" | ||
| ); | ||
|
|
||
| await runConvert(project); | ||
| const alphaOut = path.join(project.outDir, "alpha.md"); | ||
| const betaOut = path.join(project.outDir, "beta.md"); | ||
| expect(existsSync(alphaOut)).toBe(true); | ||
| expect(existsSync(betaOut)).toBe(true); | ||
|
|
||
| // A cached skip must leave the existing output untouched — tampering the | ||
| // outputs makes a rewrite observable. | ||
| await tamper(alphaOut); | ||
| await tamper(betaOut); | ||
| await runConvert(project); | ||
| expect(await isTampered(alphaOut)).toBe(true); | ||
| expect(await isTampered(betaOut)).toBe(true); | ||
| }); | ||
|
|
||
| it("rebuilds only the edited file", async () => { | ||
| const project = await createCacheProject(); | ||
| const alphaSrc = path.join(project.srcDir, "alpha.mdx"); | ||
| await writeFile(alphaSrc, "---\ntitle: Alpha\n---\n\n# Alpha\n"); | ||
| await writeFile( | ||
| path.join(project.srcDir, "beta.mdx"), | ||
| "---\ntitle: Beta\n---\n\n# Beta\n" | ||
| ); | ||
|
|
||
| await runConvert(project); | ||
| const alphaOut = path.join(project.outDir, "alpha.md"); | ||
| const betaOut = path.join(project.outDir, "beta.md"); | ||
| await tamper(alphaOut); | ||
| await tamper(betaOut); | ||
|
|
||
| await writeFile(alphaSrc, "---\ntitle: Alpha\n---\n\n# Alpha edited\n"); | ||
| await runConvert(project); | ||
|
|
||
| expect(await isTampered(alphaOut)).toBe(false); | ||
| expect(await readFile(alphaOut, "utf8")).toContain("Alpha edited"); | ||
| expect(await isTampered(betaOut)).toBe(true); | ||
| }); | ||
|
|
||
| it("rebuilds a file when its include target changes", async () => { | ||
| const project = await createCacheProject(); | ||
| const snippetPath = path.join(project.srcDir, "snippet.mdx"); | ||
| await writeFile(snippetPath, "Shared snippet v1.\n"); | ||
| await writeFile( | ||
| path.join(project.srcDir, "page.mdx"), | ||
| '---\ntitle: Page\n---\n\n<include src="./snippet.mdx" />\n' | ||
| ); | ||
| await writeFile( | ||
| path.join(project.srcDir, "other.mdx"), | ||
| "---\ntitle: Other\n---\n\n# Other\n" | ||
| ); | ||
|
|
||
| await runConvert(project); | ||
| const pageOut = path.join(project.outDir, "page.md"); | ||
| const otherOut = path.join(project.outDir, "other.md"); | ||
| expect(await readFile(pageOut, "utf8")).toContain("Shared snippet v1."); | ||
| await tamper(pageOut); | ||
| await tamper(otherOut); | ||
|
|
||
| await writeFile(snippetPath, "Shared snippet v2.\n"); | ||
| await runConvert(project); | ||
|
|
||
| expect(await readFile(pageOut, "utf8")).toContain("Shared snippet v2."); | ||
| expect(await isTampered(pageOut)).toBe(false); | ||
| expect(await isTampered(otherOut)).toBe(true); | ||
| }); | ||
|
|
||
| it("records include targets as dependencies in the manifest", async () => { | ||
| const project = await createCacheProject(); | ||
| const snippetPath = path.join(project.srcDir, "snippet.mdx"); | ||
| await writeFile(snippetPath, "Shared snippet.\n"); | ||
| await writeFile( | ||
| path.join(project.srcDir, "page.mdx"), | ||
| '---\ntitle: Page\n---\n\n<include src="./snippet.mdx" />\n' | ||
| ); | ||
|
|
||
| await runConvert(project); | ||
| const manifest = await loadConvertCacheManifest( | ||
| project.cacheFile, | ||
| "test-fingerprint" | ||
| ); | ||
| expect(manifest).not.toBeNull(); | ||
| const entry = manifest?.entries["page.mdx"]; | ||
| expect(entry).toBeDefined(); | ||
| expect(Object.keys(entry?.deps ?? {})).toContain(snippetPath); | ||
| }); | ||
|
|
||
| it("prunes the output of a deleted source file", async () => { | ||
| const project = await createCacheProject(); | ||
| const alphaSrc = path.join(project.srcDir, "alpha.mdx"); | ||
| await writeFile(alphaSrc, "---\ntitle: Alpha\n---\n\n# Alpha\n"); | ||
| await writeFile( | ||
| path.join(project.srcDir, "beta.mdx"), | ||
| "---\ntitle: Beta\n---\n\n# Beta\n" | ||
| ); | ||
|
|
||
| await runConvert(project); | ||
| const alphaOut = path.join(project.outDir, "alpha.md"); | ||
| expect(existsSync(alphaOut)).toBe(true); | ||
|
|
||
| await rm(alphaSrc); | ||
| await runConvert(project); | ||
|
|
||
| expect(existsSync(alphaOut)).toBe(false); | ||
| expect(existsSync(path.join(project.outDir, "beta.md"))).toBe(true); | ||
| }); | ||
|
|
||
| it("rebuilds everything with force", async () => { | ||
| const project = await createCacheProject(); | ||
| await writeFile( | ||
| path.join(project.srcDir, "alpha.mdx"), | ||
| "---\ntitle: Alpha\n---\n\n# Alpha\n" | ||
| ); | ||
|
|
||
| await runConvert(project); | ||
| const alphaOut = path.join(project.outDir, "alpha.md"); | ||
| await tamper(alphaOut); | ||
|
|
||
| await runConvert(project, { force: true }); | ||
| expect(await isTampered(alphaOut)).toBe(false); | ||
| }); | ||
|
|
||
| it("rebuilds everything when the fingerprint changes", async () => { | ||
| const project = await createCacheProject(); | ||
| await writeFile( | ||
| path.join(project.srcDir, "alpha.mdx"), | ||
| "---\ntitle: Alpha\n---\n\n# Alpha\n" | ||
| ); | ||
|
|
||
| await runConvert(project); | ||
| const alphaOut = path.join(project.outDir, "alpha.md"); | ||
| await tamper(alphaOut); | ||
|
|
||
| await runConvert(project, { fingerprint: "different-fingerprint" }); | ||
| expect(await isTampered(alphaOut)).toBe(false); | ||
| }); | ||
|
|
||
| it("regenerates a missing output even when the source is unchanged", async () => { | ||
| const project = await createCacheProject(); | ||
| await writeFile( | ||
| path.join(project.srcDir, "alpha.mdx"), | ||
| "---\ntitle: Alpha\n---\n\n# Alpha\n" | ||
| ); | ||
|
|
||
| await runConvert(project); | ||
| const alphaOut = path.join(project.outDir, "alpha.md"); | ||
| await rm(alphaOut); | ||
|
|
||
| await runConvert(project); | ||
| expect(existsSync(alphaOut)).toBe(true); | ||
| }); | ||
|
|
||
| it("survives a corrupt manifest by rebuilding", async () => { | ||
| const project = await createCacheProject(); | ||
| await writeFile( | ||
| path.join(project.srcDir, "alpha.mdx"), | ||
| "---\ntitle: Alpha\n---\n\n# Alpha\n" | ||
| ); | ||
|
|
||
| await runConvert(project); | ||
| await writeFile(project.cacheFile, "{not json"); | ||
| const alphaOut = path.join(project.outDir, "alpha.md"); | ||
| await tamper(alphaOut); | ||
|
|
||
| await runConvert(project); | ||
| expect(await isTampered(alphaOut)).toBe(false); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win
Good coverage of the core cache hit/miss/prune/force/fingerprint/corruption paths.
Missing test: pruning across a force/fingerprint-invalidated run. All prune-related assertions (Lines 166-184) run with the same fingerprint and without force, so they don't exercise the scenario where previousEntries is emptied by force/fingerprint change (see the major issue raised in convert.ts around cacheState initialization) — a deleted source's output would remain un-pruned and this suite wouldn't catch it.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/leadtype/src/convert/incremental.test.ts` around lines 1 - 246, Add
a test in incremental.test.ts that covers pruning after a force run or
fingerprint change using createCacheProject and runConvert. The issue is that
the current prune test only exercises the normal cache path, so it misses the
case where cacheState.previousEntries is cleared by force/fingerprint
invalidation and deleted outputs may not be removed. Reuse the existing helpers
(runConvert, tamper, existsSync, rm) to delete a source after an initial
conversion, rerun with force or a different fingerprint, and assert the deleted
file’s output is pruned while the remaining output stays intact.
| function isManifestShape(value: unknown): value is ConvertCacheManifest { | ||
| if (typeof value !== "object" || value === null) { | ||
| return false; | ||
| } | ||
| const manifest = value as Record<string, unknown>; | ||
| return ( | ||
| manifest.version === CONVERT_CACHE_VERSION && | ||
| typeof manifest.fingerprint === "string" && | ||
| typeof manifest.entries === "object" && | ||
| manifest.entries !== null | ||
| ); | ||
| } | ||
|
|
||
| /** | ||
| * Read a manifest, returning null when it's missing, unreadable, from a | ||
| * different cache version, or built with a different fingerprint — every | ||
| * miss degrades to a full rebuild, never an error. | ||
| */ | ||
| export async function loadConvertCacheManifest( | ||
| filePath: string, | ||
| fingerprint: string | ||
| ): Promise<ConvertCacheManifest | null> { | ||
| let raw: string; | ||
| try { | ||
| raw = await readFile(filePath, "utf8"); | ||
| } catch { | ||
| return null; | ||
| } | ||
| try { | ||
| const parsed: unknown = JSON.parse(raw); | ||
| if (isManifestShape(parsed) && parsed.fingerprint === fingerprint) { | ||
| return parsed; | ||
| } | ||
| } catch { | ||
| // Corrupt manifest — treat as absent. | ||
| } | ||
| return null; | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
isManifestShape only validates top-level fields, not per-entry shape — malformed entries can crash pruning.
isManifestShape checks that entries is a non-null object but never validates that each ConvertCacheEntry actually has the expected sourceHash/deps/enrichment/output string/object fields. A manifest that's syntactically valid JSON but has a partially-written or hand-edited entry (e.g. missing output) will pass this check and be accepted as a real manifest.
Downstream, pruneStaleOutputs (convert.ts) calls resolve(outDir, entry.output) on every entry that isn't in the current key set — if entry.output is undefined/non-string, path.resolve throws TypeError, crashing the whole convertAllMdx() run. This directly contradicts the module's own documented guarantee that manifest issues should "degrade to a full rebuild, never an error."
🛡️ Proposed fix: validate entry shape in `isManifestShape`
function isManifestShape(value: unknown): value is ConvertCacheManifest {
if (typeof value !== "object" || value === null) {
return false;
}
const manifest = value as Record<string, unknown>;
- return (
+ if (
manifest.version === CONVERT_CACHE_VERSION &&
typeof manifest.fingerprint === "string" &&
typeof manifest.entries === "object" &&
manifest.entries !== null
- );
+ ) {
+ return Object.values(
+ manifest.entries as Record<string, unknown>
+ ).every(
+ (entry) =>
+ typeof entry === "object" &&
+ entry !== null &&
+ typeof (entry as ConvertCacheEntry).sourceHash === "string" &&
+ typeof (entry as ConvertCacheEntry).output === "string" &&
+ typeof (entry as ConvertCacheEntry).enrichment === "string" &&
+ typeof (entry as ConvertCacheEntry).deps === "object" &&
+ (entry as ConvertCacheEntry).deps !== null
+ );
+ }
+ return false;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| function isManifestShape(value: unknown): value is ConvertCacheManifest { | |
| if (typeof value !== "object" || value === null) { | |
| return false; | |
| } | |
| const manifest = value as Record<string, unknown>; | |
| return ( | |
| manifest.version === CONVERT_CACHE_VERSION && | |
| typeof manifest.fingerprint === "string" && | |
| typeof manifest.entries === "object" && | |
| manifest.entries !== null | |
| ); | |
| } | |
| /** | |
| * Read a manifest, returning null when it's missing, unreadable, from a | |
| * different cache version, or built with a different fingerprint — every | |
| * miss degrades to a full rebuild, never an error. | |
| */ | |
| export async function loadConvertCacheManifest( | |
| filePath: string, | |
| fingerprint: string | |
| ): Promise<ConvertCacheManifest | null> { | |
| let raw: string; | |
| try { | |
| raw = await readFile(filePath, "utf8"); | |
| } catch { | |
| return null; | |
| } | |
| try { | |
| const parsed: unknown = JSON.parse(raw); | |
| if (isManifestShape(parsed) && parsed.fingerprint === fingerprint) { | |
| return parsed; | |
| } | |
| } catch { | |
| // Corrupt manifest — treat as absent. | |
| } | |
| return null; | |
| } | |
| function isManifestShape(value: unknown): value is ConvertCacheManifest { | |
| if (typeof value !== "object" || value === null) { | |
| return false; | |
| } | |
| const manifest = value as Record<string, unknown>; | |
| if ( | |
| manifest.version === CONVERT_CACHE_VERSION && | |
| typeof manifest.fingerprint === "string" && | |
| typeof manifest.entries === "object" && | |
| manifest.entries !== null | |
| ) { | |
| return Object.values(manifest.entries as Record<string, unknown>).every( | |
| (entry) => | |
| typeof entry === "object" && | |
| entry !== null && | |
| typeof (entry as ConvertCacheEntry).sourceHash === "string" && | |
| typeof (entry as ConvertCacheEntry).output === "string" && | |
| typeof (entry as ConvertCacheEntry).enrichment === "string" && | |
| typeof (entry as ConvertCacheEntry).deps === "object" && | |
| (entry as ConvertCacheEntry).deps !== null | |
| ); | |
| } | |
| return false; | |
| } | |
| /** | |
| * Read a manifest, returning null when it's missing, unreadable, from a | |
| * different cache version, or built with a different fingerprint — every | |
| * miss degrades to a full rebuild, never an error. | |
| */ | |
| export async function loadConvertCacheManifest( | |
| filePath: string, | |
| fingerprint: string | |
| ): Promise<ConvertCacheManifest | null> { | |
| let raw: string; | |
| try { | |
| raw = await readFile(filePath, "utf8"); | |
| } catch { | |
| return null; | |
| } | |
| try { | |
| const parsed: unknown = JSON.parse(raw); | |
| if (isManifestShape(parsed) && parsed.fingerprint === fingerprint) { | |
| return parsed; | |
| } | |
| } catch { | |
| // Corrupt manifest — treat as absent. | |
| } | |
| return null; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/leadtype/src/convert/incremental.ts` around lines 79 - 116, The
manifest validator in isManifestShape only checks that entries exists as an
object, so malformed ConvertCacheEntry data can slip through and later crash
pruneStaleOutputs when it reads entry.output. Tighten isManifestShape in
incremental.ts to validate every entry’s shape, including required string/object
fields such as sourceHash, deps, enrichment, and output, and only return true
when all entries match ConvertCacheEntry. Keep loadConvertCacheManifest’s
behavior unchanged so invalid manifests still fall back to null and a full
rebuild.

--watchoption forleadtype generate, enabling automatic rebuilds on changes to documentation sources and configuration files.cacheoption inconvertAllMdx, allowing for dependency tracking and cache invalidation.These changes significantly enhance the efficiency and usability of the
leadtypeCLI for documentation generation.