feat(pipelines): add git-archive built-in for local monorepo sources #2549

Draft
astrojerms wants to merge 7 commits into chainguard-dev:main from astrojerms:feat/git-archive-local-source

@astrojerms

Member

Add a git-archive pipeline that populates the build workspace from a subtree of the local git repository at a pinned ref, host-side, without cloning over the network and without mounting the repository's .git history into the sandbox.

Intended for monorepos where the package definition and the source it packages live in the same repository. By default it archives the requested path at the commit melange is building, so the packaged source is exactly the source as it existed in that commit. An explicit ref/tag is still supported and is verified against expected-commit.

The archive runs before the sandbox starts: melange resolves the commit, verifies it, and extracts only the requested path into a temp dir that becomes the source dir. The resolved commit is recorded back into the step so SBOM provenance reflects the exact source even when the manifest omits a ref. Falls back to the git CLI to resolve HEAD when go-git cannot determine the build commit (e.g. in a linked worktree).
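
The verify-then-record step described above can be sketched in Go as follows. This is a minimal illustration, not the PR's code: `verifyAndRecord` is a hypothetical helper, the step inputs are modelled as a plain `map[string]string`, and writing the resolved commit back under the `expected-commit` key is an assumption about where provenance is read from.

```go
package main

import (
	"fmt"
	"strings"
)

// verifyAndRecord checks the resolved commit against with["expected-commit"]
// when that input is set, then records the resolved commit back into the
// step inputs. Hypothetical helper; the write-back key is an assumption.
func verifyAndRecord(with map[string]string, resolved string) error {
	if resolved == "" {
		return fmt.Errorf("git-archive: could not resolve a commit to archive")
	}
	if want := with["expected-commit"]; want != "" && !strings.EqualFold(want, resolved) {
		return fmt.Errorf("git-archive: expected commit %s, resolved %s", want, resolved)
	}
	// Record the exact commit even when the manifest named no ref.
	with["expected-commit"] = resolved
	return nil
}

func main() {
	// The bare co-located form: only a path, no ref and no expected-commit.
	with := map[string]string{"path": "services/api"}
	resolved := "0123456789abcdef0123456789abcdef01234567" // placeholder hash

	if err := verifyAndRecord(with, resolved); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("recorded commit:", with["expected-commit"])
}
```

Recording the commit only after verification succeeds means a mismatch never overwrites the value the manifest author pinned.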

Melange Pull Request Template

Functional Changes

  • This change can build all of Wolfi without errors (describe results in notes)

Notes:

SCA Changes

  • Examining several representative APKs shows no regression / the desired effect (details in notes)

Notes:

Linter

  • The new check is clean across Wolfi
  • The new check is opt-in or a warning

Notes:


The git and tar commands are constructed from melange build configuration
inputs (path/ref) and a melange-created temp dir, the same trust model as
the existing git-checkout and patch pipelines. Annotate with #nosec G204
to satisfy the linter, matching the convention used elsewhere in the tree.

Replace the single 'ref' input with 'tag' and 'branch' (mutually
exclusive), matching the established git-checkout pipeline vocabulary for
familiarity and to align with wolfictl's bump handling, which keys off
'tag' and 'expected-commit'. When neither tag nor branch is given, the
build commit is still used, with the same assurance that the archived
source matches that commit, preserving the bare co-located form.
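
The input rules in this commit message can be sketched as a small selection function. This is an illustration under stated assumptions: `selectRef` is a hypothetical name, and qualifying the inputs as `refs/tags/...` and `refs/heads/...` is a choice made for the sketch, not something the PR text specifies.

```go
package main

import (
	"errors"
	"fmt"
)

// selectRef picks what to archive: an explicit tag, an explicit branch, or
// the build commit when neither is given. Hypothetical helper for illustration.
func selectRef(tag, branch, buildCommit string) (string, error) {
	switch {
	case tag != "" && branch != "":
		return "", errors.New("git-archive: 'tag' and 'branch' are mutually exclusive")
	case tag != "":
		return "refs/tags/" + tag, nil // assumption: fully qualified ref names
	case branch != "":
		return "refs/heads/" + branch, nil
	case buildCommit != "":
		return buildCommit, nil // bare co-located form
	default:
		return "", errors.New("git-archive: no tag, branch, or build commit available")
	}
}

func main() {
	ref, err := selectRef("", "", "0123456789abcdef0123456789abcdef01234567")
	fmt.Println(ref, err)

	_, err = selectRef("v1.2.3", "main", "")
	fmt.Println(err)
}
```

Checking the conflicting case first keeps the error independent of which of the two inputs the manifest author intended.
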

Archives a committed fixture subtree from the local repository at the
build commit and verifies the files land in the workspace with their
path prefix preserved, and that files outside the archived subtree are
not present. Uses the build-commit default (no tag/branch) so it works
on the shallow CI checkout without relying on tags or history.

- gitArchive: always wait on tar and errors.Join both commands so the
  originating failure is surfaced and the tar child is never leaked (was
  returning on archive failure without reaping tar, and masking tar
  failures behind git's EPIPE).
- maybeGitArchiveSource: hard-error on more than one git-archive step or
  on placement in a nested/subpackage pipeline (was silently ignoring
  all but the first). Error when combined with an empty workspace.
- Preflight host git/tar with exec.LookPath for a clear error, since the
  archive runs host-side (the pipeline 'needs' only covers the sandbox).
- Honor and document .gitattributes (export-ignore/export-subst): soften
  the 'exactly as committed' wording and note the export-subst content
  consideration in multi-contributor repos.
- Share config.UnknownCommit instead of the bare "unknown" sentinel.
- Return a non-nil no-op cleanup func so callers can defer unconditionally.
- Comment the SBOM write-back's per-build-config safety and the
  intentional double-copy trade-off; drop duplicate log line.
- Tests: export-ignore omission, tag and branch refs, and the
  multiple/misplaced git-archive guards.
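
Two items from the list above, the `exec.LookPath` preflight and the "always wait on tar and `errors.Join` both commands" fix, can be sketched together. This is not the PR's `gitArchive` function: `preflight`, `pipeCommands`, and `archiveSubtree` are hypothetical names, the exact `git archive` and `tar` arguments are assumptions, and option-injection hardening is left out of the sketch. `errors.Join` needs Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// preflight returns a clear error for each host-side tool that is missing.
func preflight(tools ...string) error {
	var errs []error
	for _, tool := range tools {
		if _, err := exec.LookPath(tool); err != nil {
			errs = append(errs, fmt.Errorf("required host tool %q not found: %w", tool, err))
		}
	}
	return errors.Join(errs...)
}

// pipeCommands feeds producer's stdout into consumer's stdin, waits on every
// process it started, and joins both errors so neither hides the other.
func pipeCommands(producer, consumer *exec.Cmd) error {
	pr, pw, err := os.Pipe()
	if err != nil {
		return fmt.Errorf("creating pipe: %w", err)
	}
	producer.Stdout = pw
	consumer.Stdin = pr

	if err := consumer.Start(); err != nil {
		pr.Close()
		pw.Close()
		return fmt.Errorf("starting consumer: %w", err)
	}
	perr := producer.Start()
	started := perr == nil

	// Drop the parent's copies of both pipe ends so the consumer sees EOF once
	// the producer exits (or immediately, if the producer never started).
	pw.Close()
	pr.Close()

	if started {
		perr = producer.Wait()
	}
	cerr := consumer.Wait() // always reap the consumer
	return errors.Join(perr, cerr)
}

// archiveSubtree extracts path at commit from the repository in repoDir into
// destDir. Hypothetical wrapper; inputs are assumed to be trusted here.
func archiveSubtree(repoDir, commit, path, destDir string) error {
	if err := preflight("git", "tar"); err != nil {
		return err
	}
	git := exec.Command("git", "archive", "--format=tar", commit, path) // #nosec G204
	git.Dir = repoDir
	tar := exec.Command("tar", "-x", "-f", "-", "-C", destDir) // #nosec G204
	return pipeCommands(git, tar)
}

func main() {
	dest, err := os.MkdirTemp("", "git-archive-sketch-")
	if err != nil {
		fmt.Println("temp dir:", err)
		return
	}
	defer os.RemoveAll(dest)

	// Run from inside a git checkout; elsewhere this reports git's failure.
	if err := archiveSubtree(".", "HEAD", ".", dest); err != nil {
		fmt.Println("archive failed:", err)
		return
	}
	fmt.Println("extracted into", dest)
}
```

Closing the parent's pipe ends before waiting is what lets the sketch wait on both children without risking a hang.
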
…guard

- pass -- / --end-of-options to git so config-supplied path/ref cannot
  inject options (host-side arbitrary file write via --output)
- branch + expected-commit now matches git-checkout: expected-commit may
  be an older commit on the branch and is what gets archived
- reject git-archive in test pipelines (melange test never archives)
- warn when archiving HEAD with uncommitted changes under path
- drop redundant re-substitution; Compile already resolves step.With
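
The option-injection guard from the first bullet above can be illustrated by how the argument list is assembled. This is a sketch only: `archiveArgs` is a hypothetical helper, and the exact position of `--end-of-options` (and of `--`) in the PR's real invocation is not shown in this conversation, so the layout below is an assumption.

```go
package main

import "fmt"

// archiveArgs builds an argument list for `git archive` in which the
// config-supplied commit and path can only be read as positional arguments.
// Assumed layout; the PR's actual invocation may differ.
func archiveArgs(commit, path string) []string {
	return []string{
		"archive",
		"--format=tar",
		"--end-of-options", // nothing after this marker is parsed as an option
		commit,
		path,
	}
}

func main() {
	// A hostile path that could otherwise be read as the --output option.
	fmt.Println(archiveArgs("HEAD", "--output=/tmp/overwritten"))
}
```

Passing the values as separate `exec.Command` arguments, rather than through a shell string, is what makes the marker sufficient: there is no shell layer left to reinterpret them.
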

@astrojerms force-pushed the feat/git-archive-local-source branch from 436725c to 94e2476 on July 2, 2026 01:43