Skip to content

Add support for cfsctl oci apply-delta#305

Open
alexlarsson wants to merge 1 commit into
mainfrom
delta-apply
Open

Add support for cfsctl oci apply-delta#305
alexlarsson wants to merge 1 commit into
mainfrom
delta-apply

Conversation

@alexlarsson

Copy link
Copy Markdown
Contributor

This allows applying an oci-delta tarfile to create an oci image based on a delta and a previous oci image in the repo.

For details on oci-delta, see: https://github.com/containers/oci-delta

@alexlarsson alexlarsson force-pushed the delta-apply branch 2 times, most recently from f1c29de to d7d70f2 Compare June 2, 2026 15:01

@cgwalters cgwalters left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an initial style pass

Comment thread crates/composefs-oci/src/delta.rs Outdated
Comment thread crates/composefs-oci/src/delta.rs Outdated
Comment thread crates/composefs-oci/src/delta.rs Outdated
Comment thread crates/composefs-oci/src/delta.rs Outdated
Comment thread crates/composefs-oci/src/delta.rs
Comment thread crates/composefs-oci/src/delta.rs Outdated
Comment thread crates/composefs-oci/src/delta.rs Outdated
Comment thread crates/composefs-oci/src/delta.rs Outdated
Comment thread crates/composefs-oci/testdata/delta/generate.sh Outdated
Comment thread crates/composefs-oci/src/delta.rs Outdated
@alexlarsson alexlarsson force-pushed the delta-apply branch 2 times, most recently from d2ea945 to 0e43b30 Compare June 3, 2026 10:31
@alexlarsson

Copy link
Copy Markdown
Contributor Author

Ok, i think I got all the comments fixed.

@alexlarsson

Copy link
Copy Markdown
Contributor Author

Note: This doesn't actually run all the new tests in CI, as there is no oci-delta binary. However it passes locally here. We can build oci-delta ourselves, or we can just wait a bit until it gets packaged up and pick it up as an rpm. (This is in progress elsewhere).

Comment thread crates/composefs-oci/src/delta.rs Outdated
.get(tar_path)
.with_context(|| format!("{} not found in delta tar", tar_path.display()))?;
let mut buf = vec![0u8; entry.size as usize];
File::open(&self.path)?.read_exact_at(&mut buf, entry.offset)?;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems better to support DeltaFile holding an open fd.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that initially, but it caused problems in the parallel case, where fds were shared between threads, and then those shared the current offset, which broke things. Is there some pread based wrapper over File we could use for that?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

struct ReadAtReader<'a, F> {

Comment thread crates/composefs-oci/src/delta.rs Outdated
}

enum OciHasher {
Sha256(sha2::Sha256),

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have code elsewhere using openssl for this. We're probably not consistent on this.

Comment thread crates/composefs-ctl/src/lib.rs Outdated
This allows applying an oci-delta oci-archive or registry artifact to
create an oci image based on a delta and a previous oci image in the
repo.

For details on oci-delta, see: https://github.com/containers/oci-delta

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Assisted-by: Claude Code (Opus 4.6)
@alexlarsson

Copy link
Copy Markdown
Contributor Author

Ok, I redid this to be part of "oci pull", and it seems to work well. I can even pull a delta as an artifact:

$ target/debug/cfsctl --repo repo oci docker://quay.io/alexl42/test-delta:41
$ target/debug/cfsctl --repo repo oci pull docker://quay.io/alexl42/test-delta:41to42

@cgwalters cgwalters left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, this is looking nicer!

On tests, one thing we do in this project is copy over "fixtures" to ghcr.io/composefs. I think we could do the same with deltas. There's generic image mirroring logic.

That said it'd also help of course to have a static binary of oci-delta shipped from the upstream repo that we can use if installed (and install it in CI actions).

let mut immediate_results: HashMap<usize, (OciDigest, ObjectID)> = HashMap::new();
let mut stats = ImportStats::default();

let threads = available_parallelism()?;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we fetch OCI images today, we process the tar in a streaming fashion, writing objects into the composefs store as we go. Disk/memory usage is constant relative to the update size.

But one downside of deltas as designed/implemented today is that we have to download the entire delta layer to a file and regenerate the layer locally, so temporary disk space usage is O(delta size) plus O(layer).

I think for this reason I'd suggest we have at most 1 delta layer being downloaded, and 1 being processed at a time by default. WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how common download-and-apply will be, but in the local-only (i.e. oci dir or oci archive) case we definitely want to parallelize the apply, as it greatly speeds up applying the delta (i saw times going from 40 to 10 secs). So, lets be careful of exactly how we limit this.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in the oci: path we don't need to fetch anything at all, we should just parse from the existing files.

No harm to parallelizing the processing in that case. My concern is more about disk space usage spikes for nontrivial updates.

In the ociarchive: path...like ideally we can say that that file is not compressed (because the blobs will be, no point in double compressing them, and compressing the metadata brings almost no value).

Then we can do what this code is already doing I think, just walk the tar to find byte offsets.

I guess in the end I'm just arguing for the registry path to default to fetching one blob at a time, or maybe two.

Or maybe it's fine, and we can just make it configurable later if we need to.

(Some environment variable to control parallelism in composefs at a coarse grained level is probably enough, or perhaps just --jobserver-fd like https://make.mad-scientist.net/papers/jobserver-implementation/ )

@alexlarsson

Copy link
Copy Markdown
Contributor Author

I added binaries (build on cs10) to: https://github.com/containers/oci-delta/releases/tag/v0.1.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants