Skip to content

perf(extensions): skip re-bundling pulled extensions on cold catalog#1389

Merged
stack72 merged 3 commits into
mainfrom
paul/trust-pulled-bundle-cache
May 15, 2026
Merged

perf(extensions): skip re-bundling pulled extensions on cold catalog#1389
stack72 merged 3 commits into
mainfrom
paul/trust-pulled-bundle-cache

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented May 15, 2026

Summary

  • When the extension catalog is stale, the reconcile service was spawning a deno bundle subprocess for every pulled extension file — even when cached .js bundles already existed on disk
  • For @swamp/aws/ec2 (107 model types), this added ~27s of pure waste on cold start
  • Fix: trust the cached bundle for pulled extensions during reconciliation, skip the subprocess

Benchmark (107 types, 20 models, 40 data items)

Scenario Before After
Cold catalog 33s 5.7s
Warm runs 0.85s 0.85s

Safety

  • Only applies to pulled extensions (from registry, not user-written)
  • Only in cold-catalog reconciliation and bootstrap paths
  • Warm-path stale detection (findStaleFiles) still re-bundles when fingerprints change
  • UAT regression test added in systeminit/swamp-uat

Test Plan

  • 5937 unit/integration tests pass
  • Extension-specific integration tests pass (pull, kind sync, freshness, rebundle, state)
  • UAT regression test validates types resolve after catalog wipe

🤖 Generated with Claude Code

stack72 and others added 3 commits May 14, 2026 22:18
…nd workflow run

Previously, OTEL traces only covered the outer CLI, method-run, and
workflow-run envelopes plus the inner driver execution — leaving
60-80% of wall-clock time invisible. This adds spans for the
phases that execute in those gaps:

CLI bootstrap:
- swamp.cli.bootstrap (telemetry init → command tree construction)
- swamp.cli.configure_extension_loaders (catalog setup)
- swamp.cli.teardown (datastore sync + telemetry flush)

Method run:
- swamp.model.method.setup (logging → driver resolution)
- swamp.model.method.evaluate_expressions (CEL evaluation)
- swamp.model.method.resolve_runtime (vault/env resolution)
- swamp.model.method.pre_execute (hash + output save + driver config)
- swamp.model.method.post_execute (data artifacts + reports)
- swamp.model.method.reports (report execution)

Workflow run:
- swamp.workflow.setup (marker load → run creation)
- swamp.workflow.build_context (model resolver buildContext)
- swamp.workflow.teardown (post-job completion)
- swamp.workflow.reports (workflow-scope report execution)
- swamp.workflow.save_run (run persistence)

All spans use try/finally to ensure they end on error paths.
Zero overhead when tracing is disabled (no-op tracer).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the extension catalog is wiped (after upgrade, corruption, or first
run in a new worktree), the reconcile service previously spawned a Deno
bundle subprocess for every pulled extension file — even when cached .js
bundles already existed on disk from the original `extension pull`.

For a repo with @swamp/aws/ec2 (107 model types), this added ~27s of
pure waste on cold start (107 subprocess spawns × ~250ms each).

Fix: add a `trustPulledCache` flag to `bundleWithCache` that skips the
`deno bundle` subprocess when a cached bundle exists for a pulled
extension. The flag is only passed from:
- `reconcileExtension()` when `originType === "pulled"`
- `load()` cold-catalog bootstrap path

The warm-path stale-file detection (`findStaleFiles` → `rebundleAndUpdateCatalog`)
does NOT pass the flag, so fingerprint-based change detection still
triggers re-bundling when source files are modified.

Benchmark (107 extension types, 20 models, 40 data items):
- Cold catalog before: 33s
- Cold catalog after:  5.7s
- Warm runs: unchanged (0.85s)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

None.

Verdict

PASS — This PR touches only internal bundling logic (extension_loader.ts, reconcile_from_disk_service.ts). No commands, renderers, help text, error messages, or output formatters are changed. Users will see the same CLI behavior, just faster (33 s → 5.7 s cold start for large extension catalogs). Nothing to flag from a UX perspective.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Clean, well-scoped performance fix. The change correctly threads trustPulledCache through bundleWithCachebundleAndIndexOneReconcileFromDiskService, short-circuiting the isExpectedBundleFailure filesystem walk and deno bundle subprocess for pulled extensions that already have cached bundles on disk.

Blocking Issues

None.

Suggestions

  1. Unit test for trustPulledCache pathextension_loader_test.ts has no coverage of the new trustPulledCache option. A test that sets up a pulled extension with an existing cached bundle and asserts bundleWithCache returns { fromCache: true } without invoking the subprocess would make this path regression-safe. The change is low-risk and covered by integration/UAT tests, so not blocking.

Notes

  • No any types, no fire-and-forget promises, no libswamp import boundary violations.
  • isExpectedBundleFailure is a heuristic about bare specifiers / missing deno.json — not a security gate — so bypassing it for pulled extensions with existing bundles is safe.
  • Fingerprint checks in reconcileExtension (lines 430-437) still run before the bundle path, so changed sources are re-bundled regardless of trustPulledCache.
  • The option stays internal to ExtensionLoader and ReconcileFromDiskServicebundleAndIndexOne isn't exported through src/libswamp/mod.ts, so no public API impact.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. src/domain/extensions/extension_loader.ts:156trustPulledCache: true is hardcoded in load(), which processes both pulled and user-written extensions.
    The load() method at line 100 is called from src/cli/mod.ts:579 with mixed directories (absoluteVaultsDir + user source dirs + pulled dirs). Hardcoding trustPulledCache: true looks overly broad at first glance. However, this is safe in practice because bundleWithCache at line 962 gates on the isPulled path check (boundaryDir must be under pulled-extensions/), so user-written extensions never hit the short-circuit. The flag is effectively a no-op for non-pulled files. Still, passing true unconditionally is misleading — a future reader might assume it affects all files. Consider threading the origin info through load() like the reconcile service does, or adding a comment explaining why the blanket true is safe.

Low

  1. src/domain/extensions/extension_loader.ts:940 — Optional parameter bag on internal method. Adding options?: { trustPulledCache?: boolean } to bundleWithCache means three of the four call sites now pass it and one (rebundleAndUpdateCatalog at line 795) doesn't, relying on the default undefined. This is fine and the default behavior (fall through to the subprocess) is correct for the warm-path rebundle case. Just noting the asymmetry — no action needed.

  2. Theoretical staleness window. If a pulled extension's .ts source were updated on disk without a corresponding .js bundle update, trustPulledCache would serve the stale bundle. I verified that the pull operation (src/libswamp/extensions/pull.ts:425-427) extracts pre-built bundles alongside sources, so they're always in sync. The only way to hit this would be manual filesystem tampering, which is outside the threat model.

Verdict

PASS — The change is minimal, well-scoped, and correct. The isPulled path check in bundleWithCache prevents the optimization from ever applying to user-written extensions. The pull pipeline guarantees source/bundle consistency for pulled extensions. The warm-path stale detection is untouched. The 6x cold-start improvement with no correctness trade-off is a clean win.

@stack72 stack72 merged commit ff430b1 into main May 15, 2026
11 checks passed
@stack72 stack72 deleted the paul/trust-pulled-bundle-cache branch May 15, 2026 01:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant