feat: Incremental Build #1267

Draft
RandomByte wants to merge 227 commits into main from feat/incremental-build-4

@RandomByte
Member

Implementation of RFC 0017 Incremental Build

This PR supersedes the previous PoC: #1238

JIRA: CPOUI5FOUNDATION-1174

@RandomByte RandomByte marked this pull request as draft January 7, 2026 12:28
@coveralls

coveralls commented Jan 7, 2026

Coverage Status

coverage: 99.566% (+4.9%) from 94.658%
when pulling b727ebc on feat/incremental-build-4
into cb29ec1 on main.

@RandomByte RandomByte force-pushed the feat/incremental-build-4 branch 3 times, most recently from 5224cd2 to 4904c84 Compare January 9, 2026 09:22
@RandomByte RandomByte force-pushed the feat/incremental-build-4 branch from 950fc6d to 41eed91 Compare January 14, 2026 15:28
@maxreichmann maxreichmann force-pushed the feat/incremental-build-4 branch from bb39565 to 2a21507 Compare January 20, 2026 10:01
@RandomByte RandomByte force-pushed the feat/incremental-build-4 branch 3 times, most recently from 6233816 to f858659 Compare January 20, 2026 16:58
@RandomByte RandomByte force-pushed the feat/incremental-build-4 branch from 71db1d0 to a2c371f Compare January 26, 2026 09:54
@RandomByte RandomByte force-pushed the feat/incremental-build-4 branch from 7364b4b to cf43f0c Compare February 9, 2026 13:32
@maxreichmann maxreichmann force-pushed the feat/incremental-build-4 branch 2 times, most recently from 252b966 to 874a943 Compare February 16, 2026 17:17
Comment thread packages/project/test/fixtures/application.a/custom-tasks/custom-task-2.js Outdated
Comment thread packages/project/test/fixtures/application.a/custom-tasks/custom-task-0.js Outdated
@maxreichmann maxreichmann force-pushed the feat/incremental-build-4 branch 5 times, most recently from df275e5 to 0345502 Compare February 27, 2026 10:34
Comment thread packages/project/test/fixtures/application.a/task.dependency-change.js Outdated
@RandomByte RandomByte force-pushed the feat/incremental-build-4 branch from 9fc2509 to 20ba653 Compare March 5, 2026 16:23
@maxreichmann maxreichmann force-pushed the feat/incremental-build-4 branch from 20ba653 to 9fc2509 Compare March 5, 2026 16:34
@RandomByte RandomByte force-pushed the feat/incremental-build-4 branch from 9fc2509 to 940376d Compare March 5, 2026 16:40
@RandomByte RandomByte force-pushed the feat/incremental-build-4 branch from 9197670 to 44d1107 Compare March 20, 2026 15:05
@maxreichmann maxreichmann force-pushed the feat/incremental-build-4 branch from 44d1107 to d7c402c Compare March 26, 2026 14:05
@maxreichmann
Member

maxreichmann commented Mar 26, 2026

Rebased onto origin/main

Comment thread internal/e2e-tests/test/build.js Fixed
Comment thread internal/e2e-tests/test/version.js Fixed
…ing cache writes

Track integrity hashes from restored stage metadata in a Set and skip
cacache.get.info() calls for resources already known to be in CAS.
Reduces cache write time from ~1,400ms to ~100ms for stale-cache builds
by eliminating ~15,000 redundant CAS existence checks.
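The idea above could be sketched roughly as follows. This is not the PR's code: `createCasWriter`, `hasContent` and `putContent` are hypothetical names, with `hasContent` standing in for the per-resource existence lookup (the `cacache.get.info()` call in the commit message).

```javascript
// Hypothetical sketch: remember integrities that are already known to be
// stored, so the expensive existence lookup only runs for unknown content.
function createCasWriter({hasContent, putContent, knownIntegrities = new Set()}) {
	let lookups = 0;
	return {
		// Resolves to true if an existence lookup was needed, false if skipped
		async write(integrity, buffer) {
			if (knownIntegrities.has(integrity)) {
				return false; // Known from restored stage metadata or an earlier write
			}
			lookups++;
			if (!(await hasContent(integrity))) {
				await putContent(integrity, buffer);
			}
			knownIntegrities.add(integrity);
			return true;
		},
		get lookups() {
			return lookups;
		},
	};
}
```

Seeding `knownIntegrities` from the restored metadata is what avoids the lookups for a stale cache, where most resources are unchanged.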
… import

On first CLI invocation, #importStages treated all restored stage
resources as "changed" because #currentStageSignatures was empty.
This caused ~3000+ resource paths to be propagated to dependents,
triggering expensive updateDependencyIndices calls (~108ms total).

The imported stages represent the already-cached state, not actual
changes. Skip writtenResourcePaths accumulation when this is the
initial import (empty #currentStageSignatures), since dependents'
dependency indices were restored from the same persistent cache.
…pagated

When restoring from cache, dependency indices are already populated
via BuildTaskCache.fromCache. If no dependency changes were propagated
from upstream projects, the cached indices are already correct and
_refreshDependencyIndices can be skipped entirely. This avoids
fetching ~2738 resources per dependent just to confirm nothing changed,
saving ~130ms on warm-cache builds.
…esult, and source freeze

Add perf-level timing to previously unlogged operations that dominate
stale-cache build time:

- allTasksCompleted: overall timing + sub-timings for
  revalidateSourceIndex and freezeUntransformedSources
- revalidateSourceIndex: byGlob timing for source file re-read
- freezeUntransformedSources: byPath reads and writeStageResources
- recordTaskResult: overall timing + delta merge and recordRequests

Investigation of a stale-cache sap.m build (1 file changed) showed
allTasksCompleted taking 11.3s of a 12s build, with
freezeUntransformedSources accounting for 10.2s due to per-resource
CAS existence checks.
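A timing wrapper of the kind described might look like the sketch below. The `timed` helper and its `report` callback are illustrative assumptions; the PR's actual logger calls are not shown in this conversation.

```javascript
const {performance} = require("node:perf_hooks");

// Hypothetical helper: measures how long an (async) operation takes and
// reports the duration in milliseconds, even if the operation throws.
async function timed(label, fn, report) {
	const start = performance.now();
	try {
		return await fn();
	} finally {
		report(label, performance.now() - start);
	}
}
```

Nesting such calls (an outer one around the whole operation and inner ones around each sub-step) yields the "overall timing + sub-timings" breakdown listed above.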
In #freezeUntransformedSources, retain the previous build's frozen source
resourceMetadata and reuse entries for untransformed paths that haven't
changed, avoiding byPath reads and metadata collection for ~12K resources.

For sap.m stale-cache builds (1 file changed), this reduces
#freezeUntransformedSources from ~1,212ms to ~31ms.

Cover the RESTORING_DEPENDENCY_INDICES state handling in
prepareProjectBuildAndValidateCache, verifying that:

- _refreshDependencyIndices is skipped when no dependency changes
  were propagated (warm cache scenario)
- dependencyResourcesChanged() moves state to REQUIRES_UPDATE,
  routing changes through #flushPendingChanges instead
- State correctly transitions from RESTORING_DEPENDENCY_INDICES
  to FRESH after the first prepareProjectBuildAndValidateCache call
- Subsequent dependency changes use the REQUIRES_UPDATE path
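The state handling covered by these tests can be pictured with a small model. The sketch below is a simplification under assumptions: only the three state names and `dependencyResourcesChanged()` come from the commit message; the class, the `prepare()` method and the `log` array are hypothetical, and the REQUIRES_UPDATE to FRESH transition after a flush is assumed.

```javascript
const State = Object.freeze({
	RESTORING_DEPENDENCY_INDICES: "RESTORING_DEPENDENCY_INDICES",
	REQUIRES_UPDATE: "REQUIRES_UPDATE",
	FRESH: "FRESH",
});

// Hypothetical model of the cache state transitions described above
class CacheStateSketch {
	state = State.RESTORING_DEPENDENCY_INDICES;
	pendingChanges = [];
	log = [];

	dependencyResourcesChanged(paths) {
		this.pendingChanges.push(...paths);
		this.state = State.REQUIRES_UPDATE;
	}

	// Stands in for prepareProjectBuildAndValidateCache
	prepare() {
		if (this.state === State.REQUIRES_UPDATE) {
			// Changes were propagated: process them (flush path)
			this.log.push(`flush:${this.pendingChanges.length}`);
			this.pendingChanges = [];
		} else if (this.state === State.RESTORING_DEPENDENCY_INDICES) {
			// Warm cache: restored indices are already correct, skip the refresh
			this.log.push("skip-refresh");
		}
		this.state = State.FRESH;
	}
}
```

A fresh instance that receives no dependency changes logs `skip-refresh` on its first `prepare()` call; any later change goes through the flush branch.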
Add skill with architecture reference covering class hierarchy,
Resource content model, adapter internals, collection patterns,
and resourceFactory API.

Replace cacache's content-addressable storage with a lightweight custom
implementation that computes content paths synchronously from integrity
hashes, eliminating ~5 seconds of cacache.get.info() index lookups per
build when writing ~14k resources to dist.

Key changes:
- New ContentAddressableStorage class with synchronous contentPath()
  computation, gzip-compressed storage, and atomic writes
- CacheManager now delegates to CAS module instead of cacache
- ProjectBuildCache resolves CAS paths synchronously (no PassThrough
  bridge needed in createStream factory)
- CAS-backed resources get sourceMetadata {adapter: "CAS"} to prepare
  for future write-path optimizations
- Bump CACHE_VERSION to v0_3 (breaking: old caches are invalidated)

BREAKING CHANGE: Build cache version bumped from v0_2 to v0_3.
Existing build caches will be rebuilt on first run.
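To illustrate why a synchronously computed content path removes the need for index lookups, here is a minimal sketch. It is not the PR's `ContentAddressableStorage` class: the class name, the directory layout (algorithm, then two levels of hex prefixes) and the temp-file naming are assumptions made for this example.

```javascript
const fs = require("node:fs");
const path = require("node:path");
const zlib = require("node:zlib");
const crypto = require("node:crypto");

// Hypothetical sketch of a file-based content-addressable store
class ContentStoreSketch {
	constructor(baseDir) {
		this.baseDir = baseDir;
	}

	// Pure string computation: no index lookup and no file system access.
	// Expects an SRI-style integrity string such as "sha256-<base64 digest>".
	contentPath(integrity) {
		const sep = integrity.indexOf("-");
		const algorithm = integrity.slice(0, sep);
		const hex = Buffer.from(integrity.slice(sep + 1), "base64").toString("hex");
		return path.join(this.baseDir, algorithm, hex.slice(0, 2), hex.slice(2, 4), hex.slice(4));
	}

	put(buffer) {
		const digest = crypto.createHash("sha256").update(buffer).digest("base64");
		const integrity = `sha256-${digest}`;
		const target = this.contentPath(integrity);
		if (!fs.existsSync(target)) {
			fs.mkdirSync(path.dirname(target), {recursive: true});
			// Atomic write: fill a temporary file, then rename it into place
			const tmp = `${target}.${process.pid}.${crypto.randomUUID()}.tmp`;
			fs.writeFileSync(tmp, zlib.gzipSync(buffer));
			fs.renameSync(tmp, target);
		}
		return integrity;
	}

	get(integrity) {
		return zlib.gunzipSync(fs.readFileSync(this.contentPath(integrity)));
	}
}
```

Because the path is derived from the hash alone, a caller that already holds an integrity value can open the file directly.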
Skip the PassThrough intermediary stream when writing CAS-backed resources
to disk. Instead, pipe the resource stream directly to the write stream,
eliminating one pipe hop and one stream object per resource while maintaining
stream backpressure for I/O throttling at scale.
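A direct pipe of this kind can be expressed with Node's `pipeline` helper, as in the sketch below. The function name `writeResourceStream` is hypothetical; the PR's actual write path is not shown here.

```javascript
const fs = require("node:fs");
const {pipeline} = require("node:stream/promises");

// Hypothetical sketch: connect the resource stream directly to the file
// write stream. pipeline() propagates backpressure and errors, so no
// PassThrough stream is needed in between.
async function writeResourceStream(sourceStream, targetPath) {
	await pipeline(sourceStream, fs.createWriteStream(targetPath));
}
```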
Cover the previously untested delta merge logic (lines 798-848) that
executes when recordTaskResult receives a cacheInfo object from a delta
cache hit. Five new tests verify resource merging, tag import, tag merge
precedence, signature passthrough, and getCachedWriter fallback.

Introduce MetadataStorage class backed by node:sqlite (DatabaseSync)
to replace individual JSON file reads/writes in CacheManager. All four
metadata types (index cache, stage metadata, task metadata, result
metadata) are stored as JSON blobs in a single SQLite database with
composite primary keys, WAL mode, and prepared statements.

CacheManager remains the public interface, delegating metadata
operations to MetadataStorage and binary content to CAS. Unused
readBuildManifest/writeBuildManifest methods removed. CACHE_VERSION
bumped to v0_4.

Store gzip-compressed resource content as BLOBs in a SQLite database
(content.db) instead of individual files in a directory tree. This
eliminates per-file overhead (access, mkdir, writeFile+rename) and
enables batch writes within a single transaction.

Benchmarks on sap.m (11,893 untransformed sources):
- Cold cache writeStageResources: 4605ms → 2915ms (-37%)
- Cold cache total build: 25s → 22s (-12%)
- Warm cache: unchanged (~35ms)

Bumps cache version to v0_5.

Wrap metadata writes in explicit SQLite transactions (BEGIN/COMMIT)
to reduce per-statement WAL sync overhead, mirroring the existing
batch pattern in ContentAddressableStorageSQLite. ProjectBuildCache
.writeCache() now wraps all metadata writes in a single transaction
with automatic rollback on error.
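The transaction-with-rollback pattern could look like this sketch. `runInTransaction` is a hypothetical helper; it assumes only that `db` exposes an `exec(sql)` method, as `DatabaseSync` from node:sqlite does.

```javascript
// Hypothetical helper: run all writes issued by fn() inside one explicit
// transaction. `db` is any object with exec(sql), e.g. a DatabaseSync.
function runInTransaction(db, fn) {
	db.exec("BEGIN");
	try {
		const result = fn();
		db.exec("COMMIT");
		return result;
	} catch (err) {
		// Undo every write made since BEGIN, then surface the original error
		db.exec("ROLLBACK");
		throw err;
	}
}
```

Grouping many statements into one transaction means the commit cost is paid once per batch rather than once per statement.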
Replace ContentAddressableStorageSQLite and MetadataStorage with a single
BuildCacheStorage class backed by one SQLite database (cache.db). Remove
the dead file-based ContentAddressableStorage.

Content batches use SAVEPOINTs when nested inside metadata batches,
preserving the existing independent rollback semantics with a single DB
connection. Bump CACHE_VERSION to v0_6.

- page_size=32768: Reduces overflow page chains for compressed blobs
  (120 KB source index: 15 pages → 4 pages)
- mmap_size=256MB: Eliminates pread() syscalls via memory-mapped I/O
- cache_size=64MB: Keeps more pages cached for sequential access
- Bumps cache version to v0_7 (page_size requires fresh database)
…rehashing

Replace recursive _computeHash with non-recursive _recomputeDirectoryHashShallow
in ancestor rehash loops. The recursive method recomputed all descendants
(~16,000 SHA-256 ops for root in sap.m), while the shallow method only combines
existing child hashes (4 ops for a single resource change).

Measured improvement: 28ms → 0.36ms (79x) for single-resource delta in sap.m
(12,677 resources).
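The difference between recursive and shallow rehashing can be shown on a toy hash tree. The sketch below is an assumption-laden illustration: `DirNode`, `recomputeShallow` and `upsertResource` are hypothetical names, and the way child names and hashes are combined is chosen for the example, not taken from the PR.

```javascript
const crypto = require("node:crypto");

const sha256 = (data) => crypto.createHash("sha256").update(data).digest("hex");

// Hypothetical tree node: directories hold children, files are {hash} objects
class DirNode {
	constructor(parent = null) {
		this.parent = parent;
		this.children = new Map(); // name -> DirNode | {hash}
		this.hash = null;
	}
}

// Shallow: combine the children's *existing* hashes, never descend
function recomputeShallow(dir) {
	const hash = crypto.createHash("sha256");
	for (const name of [...dir.children.keys()].sort()) {
		hash.update(name).update("\0").update(dir.children.get(name).hash).update("\0");
	}
	dir.hash = hash.digest("hex");
}

function upsertResource(root, resourcePath, content) {
	const segments = resourcePath.split("/").filter(Boolean);
	const fileName = segments.pop();
	let dir = root;
	for (const segment of segments) {
		if (!dir.children.has(segment)) {
			dir.children.set(segment, new DirNode(dir));
		}
		dir = dir.children.get(segment);
	}
	dir.children.set(fileName, {hash: sha256(content)});
	// Ancestor rehash loop: one shallow hash per directory on the path to root
	for (let current = dir; current; current = current.parent) {
		recomputeShallow(current);
	}
}
```

For a resource three directories deep, a change costs one hash per ancestor directory plus the leaf hash, independent of how many other resources the tree contains.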
… code

Remove unused _recomputeAncestorHashes method (zero callers). Add tests
verifying that _recomputeDirectoryHashShallow produces correct hashes at
every directory level for multi-depth upserts, sibling modifications,
batch removals, and deep leaf changes.
… existence check

Reduce gzip compression level from 6 to 1 in BuildCacheStorage.putContent(),
yielding ~3-4x faster compression at the cost of ~15-25% larger cache.db.

Add findExistingContentIntegrities() batch check to skip compression and
INSERT for content already present in the database. This benefits
multi-project builds where shared resources may already be stored.
…ia async gzip

Replace synchronous gzipSync inside the SQLite transaction with async
parallel compression using the libuv thread pool. The writeStageResources
method now:
1. Gathers integrity + buffer (existing Phase 1)
2. Batch-checks existing content (existing Phase 1.5)
3. Compresses all buffers in parallel via async gzip with bounded
   concurrency (new Phase 2)
4. Batch-inserts pre-compressed data in a short transaction (new Phase 3)

This separates CPU-bound compression from I/O, letting multiple cores
contribute to compression simultaneously while keeping the SQLite
transaction window minimal (only fast synchronous INSERTs).
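Phase 2 (parallel compression with bounded concurrency) might be sketched as below. `compressAll` and its option names are hypothetical, and the default concurrency of 4 is an assumption; `zlib.gzip` runs on the libuv thread pool, which is what allows several compressions to proceed at once.

```javascript
const zlib = require("node:zlib");
const {promisify} = require("node:util");

const gzip = promisify(zlib.gzip);

// Hypothetical sketch: compress all buffers with at most `concurrency`
// gzip operations in flight. Results keep the order of the input array.
async function compressAll(buffers, {concurrency = 4, level = 1} = {}) {
	const results = new Array(buffers.length);
	let next = 0;
	async function worker() {
		while (next < buffers.length) {
			const index = next++; // Claim the next unprocessed buffer
			results[index] = await gzip(buffers[index], {level});
		}
	}
	const workerCount = Math.min(concurrency, buffers.length);
	await Promise.all(Array.from({length: workerCount}, () => worker()));
	return results;
}
```

The compressed buffers can then be inserted in one short synchronous transaction, since no CPU-heavy work remains inside it.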
…e I/O

When getIntegrity() is called on a FACTORY resource that has a
createBufferFactory, materialize the buffer and compute integrity from
it. Subsequent getBuffer() calls return the cached buffer with zero
additional file I/O, eliminating redundant readFile calls in the
build cache writeStageResources flow.
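The effect can be illustrated with a small stand-in class. `LazyResourceSketch` is hypothetical and much simpler than the real Resource class; it only shows how computing the integrity from a materialized buffer lets later `getBuffer()` calls reuse it.

```javascript
const crypto = require("node:crypto");

// Hypothetical sketch: the buffer factory (e.g. a readFile call) runs at
// most once; getIntegrity() and getBuffer() share the cached result.
class LazyResourceSketch {
	#createBuffer;
	#bufferPromise = null;

	constructor(createBuffer) {
		this.#createBuffer = createBuffer;
	}

	getBuffer() {
		if (!this.#bufferPromise) {
			this.#bufferPromise = Promise.resolve(this.#createBuffer());
		}
		return this.#bufferPromise;
	}

	async getIntegrity() {
		const buffer = await this.getBuffer();
		const digest = crypto.createHash("sha256").update(buffer).digest("base64");
		return `sha256-${digest}`;
	}
}
```

Caching the promise rather than the buffer also keeps concurrent callers from triggering the factory twice.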
Resources <= 128 bytes are stored uncompressed in the CAS, avoiding
gzip overhead that exceeds the compression benefit for tiny inputs.
The read path uses gzip magic byte detection for backward compatibility.
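A rough sketch of the size threshold and the magic byte check follows. The function names are hypothetical; the 128-byte threshold comes from the commit message, and the compression level of 1 is taken from the earlier commit in this PR.

```javascript
const zlib = require("node:zlib");

const MAX_UNCOMPRESSED_SIZE = 128; // Threshold named in the commit message

// Hypothetical write path: tiny inputs are stored as-is
function encodeContent(buffer) {
	if (buffer.length <= MAX_UNCOMPRESSED_SIZE) {
		return buffer;
	}
	return zlib.gzipSync(buffer, {level: 1});
}

// Hypothetical read path: gzip streams always start with bytes 0x1f 0x8b
function decodeContent(stored) {
	const isGzip = stored.length >= 2 && stored[0] === 0x1f && stored[1] === 0x8b;
	return isGzip ? zlib.gunzipSync(stored) : stored;
}
```

One caveat of this sketch: raw content that itself begins with the bytes 0x1f 0x8b would be mistaken for gzip data, so a real implementation would need to rule that case out.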
@RandomByte RandomByte force-pushed the feat/incremental-build-4 branch from e1d781f to ab65f27 Compare April 30, 2026 12:19
Build tests: add getRootPath stub and cacheDir to expected args to
match the new cache directory parameter in graph.build().

Serve tests: adapt to the non-returning handler pattern (pOnError)
by using fire-and-settle instead of awaiting handler completion,
and account for the new error callback passed to serverServe.

6 participants