oci: Add varlink APIs using "splitdirfdstream"#309
Conversation
|
how will this work with podman-container-tools/container-libs#651 ? I don't think the container-tools are going to add again varlink as a dependency for this use case only |
|
One thing to bear in mind is there's two levels to this, the splitdirfdstream (replacing splitfdstream) and the higher level IPC mechanism by which that is passed around as representations of OCI layers. Only the latter involves varlink, and we could choose to keep using jsonrpc-fdpass for the latter in container-tools if you prefer. Sorry about the proposed strategy change, but as I dug in more it just feels right - and I somehow again just missed the fd passing support in the varlink ecosystem. I guess one thing that changed is zlink is relatively new, and is well maintained and good code. (The varlink/rust project went through a messy time) Would varlink be heavier than jsonrpc-fdpass? Hmm, let me see...I spent some tokens on this varlink/go vs jsonrpc-fdpass-go: binary size comparison🤖 Assisted-by: OpenCode (Claude Sonnet 4.6) Measured against the
With a realistic service stub, both libraries are similar in weight (~68–80 KiB). That said, varlink/go#43 needs doing. |
Or, it'd probably work to use gRPC for most metadata/controlplane but pass fds over a separate negotiated socket. I don't have a really strong opinion. One other thing I'd say here that I think is a big cleanup is that our parsing of Also the converse is now true - it should now be straightforward for external tooling to push content into composefs-rs in an efficient way. |
This is a blocker for the containers/storage integration. I have no preference whether it is varlink or jsonrpc, but should we hold this until we know we can use it in containers/storage too? |
ac7e9f6 to
5ad1961
Compare
| - **External chunk** — `length` bytes of a file, resolved via | ||
| `openat(dirfds[index], filename)` and read from offset 0. |
There was a problem hiding this comment.
will the producer or consumer take care of metacopy files with overlay?
There was a problem hiding this comment.
Good catch! It was a bit confusing to me, we don't default to metacopy=on in the overlay driver, but we do set it in containers-common in https://github.com/containers/common/blob/a5ccdae846b629b5ceaefa6ffd5c6511409c3487/rpm/update-config-files.sh#L31
We should now handle that on the producer side.
On the consumer side - we don't need to care about metacopy for writing to composefs(-rs) - we get natural object-content based deduplication.
But I think you're pointing out something interesting in that conceptually splitdirfdstream is "lossy" in that this will just appear there as a full copy up.
So a theoretical consumer side of this in the default overlay driver mode we'll end up with a full file copy, not a metacopy from a lower layer.
But on the other hand - that's also what must happen when pushing to a registry, and so far this protocol is trying to conceptually be "an isolated tar layer, but with ability to reflink content".
So...if we got to the point of using this protocol to fix podman-container-tools/container-libs#144 then we would probably want to add some sort of "hint" that a particular file is just a metacopy?
OTOH, the zstd:chunked code path in containers/storage will naturally handle this (of course this gets into the "convert existing images" issue) - but if we can agree that really the more medium-term direction is a composefs based object store, then we don't need to worry about this in that medium term.
I generated a PR in varlink/go#44 I think we don't need to block this strictly speaking, it seems in the end better for us to temporarily use a patched varlink/go than to use the custom jsonrpc-fdpass right? Also, if we can take this track to finally replace the current experimental-image-proxy protocol with one thing it'd be overall a large win. |
@mheon are you fine with that? |
|
Sorry for not following this one closely. Are we talking about doing a hard dependency on Varlink in order to use composefs with c/storage? Do we know what that's going to do to our binary size? |
yes, to add it as the RPC to communicate between Go and Rust. @cgwalters made a comparison here: |
5ad1961 to
b50f5dd
Compare
…timeout The integration CI jobs were hitting GitHub's 6-hour job limit, which caused silent cancellation rather than a real test failure. Two issues combined to cause this: 1. nextest's integration profile had terminate-after = 60, meaning a single hung test could run for 1200 × 60 = 72 000 s (20 h) before nextest force-killed it, far exceeding the 6-hour GHA limit. 2. The integration job had no explicit timeout-minutes, so a stalled job wasn't surfaced as a clear timeout failure. Fix both: drop terminate-after to 1 (each test gets exactly 20 min before nextest terminates it, which is generous for VM boot + test execution), and add timeout-minutes: 90 to the integration job so any build or runner hang fails cleanly rather than silently burning 6 hours. Assisted-by: OpenCode (Claude Sonnet 4.6) Signed-off-by: Colin Walters <walters@verbum.org>
First, I discovered that actually fd-passing with varlink generally works well, and I was misguided in thinking we needed jsonrpc-fdpass. Almost: one issue is that varlink doesn't have good support for passing *a lot* of file descriptors (which jsonrpc-fdpass was designed to handle). But upon some reflection, I realized we don't need to pass a file descriptor per file, all use cases here are fine with a directory fd plus filename. So here a new data stream format "splitdirfdstream" is implemented. We first now use that *internally* when we're doing a direct pull from containers-storage for reflinking/hardlinkling. But better: let's expose that data concept over varlink, where a varlink client can both pull or push container image layers that way. This paves the way to a very clear mechanism for us to integrate with containers-storage or other storage stacks (like containerd) in an agnostic way. We also now support `cfsctl oci copy` to copy across composefs repositories which is also implemented this way. Generated-by: OpenCode (Claude Opus 4.8) Signed-off-by: Colin Walters <walters@verbum.org>
b50f5dd to
a5d52a9
Compare
First, I discovered that actually fd-passing with varlink generally works well, and I was misguided in thinking we needed jsonrpc-fdpass.
Almost: one issue is that varlink doesn't have good support for passing a lot of file descriptors (which jsonrpc-fdpass was designed to handle).
But upon some reflection, I realized we don't need to pass a file descriptor per file, all use cases here are fine with a directory fd plus filename.
So here a new data stream format "splitdirfdstream" is implemented.
We first now use that internally when we're doing a direct pull from containers-storage for reflinking/hardlinkling.
But better: let's expose that data concept over varlink, where a varlink client can both pull or push container image layers that way.
This paves the way to a very clear mechanism for us to integrate with containers-storage or other storage stacks (like containerd) in an agnostic way.
We also now support
cfsctl oci copyto copy across composefs repositories which is also implemented this way.Generated-by: OpenCode (Claude Opus 4.8)