
WIP: Rough draft for updated generic OCI sealing #226

Draft
cgwalters wants to merge 13 commits into composefs:main from cgwalters:sealing-impl

Conversation

@cgwalters

Collaborator

This is just some rough draft raw material that builds on:

cgwalters force-pushed the sealing-impl branch 2 times, most recently from 1ce192a to 063ff54, on February 12, 2026 16:49
Comment thread crates/cfsctl/src/main.rs Outdated
composefs_oci::signing::FsVeritySigningKey::from_pem(&cert_pem, &key_pem)?;

// Build subject descriptor from the source image's manifest
let manifest_json = img.manifest().to_string()?;

Collaborator Author

Hmm, we actually need to operate on the raw original representation here; we can't rely on to_string() always giving us back the same bytes.
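
To illustrate the concern, here is a minimal sketch. `RawManifest` is a hypothetical wrapper (not a composefs-rs type), and the checksum is a stand-in FNV-1a rather than SHA-256 so the example needs no dependencies. Two encodings of the same JSON differ byte for byte, so a digest for the subject descriptor has to come from the bytes as stored, not from a re-serialized value.

```rust
/// Hypothetical wrapper that keeps the manifest exactly as it was fetched.
pub struct RawManifest {
    raw: Vec<u8>,
}

impl RawManifest {
    pub fn from_bytes(raw: impl Into<Vec<u8>>) -> Self {
        Self { raw: raw.into() }
    }

    /// The bytes to hash and whose length goes into the descriptor.
    pub fn as_bytes(&self) -> &[u8] {
        &self.raw
    }
}

/// Stand-in 64-bit FNV-1a checksum; real code would use SHA-256 here.
pub fn checksum(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for byte in data {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}
```

Parsing the manifest for inspection is still fine; the point is only that the digest and size are taken from `as_bytes()` and never from a round trip through a serializer.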

Comment thread crates/cfsctl/src/main.rs Outdated
/// Image reference (tag name)
image: String,
/// Path to the OCI layout directory (must already exist)
oci_layout_path: PathBuf,

Collaborator Author

I think we can use clap(value_parser) to parse this into an ocidir directly, or something like that.

Comment thread crates/composefs-oci/src/lib.rs Outdated
/// the container to be mounted with integrity protection.
///
/// Returns a tuple of (sha256 content hash, fs-verity hash value) for the updated configuration.
pub fn seal<ObjectID: FsVerityHashValue>(

Collaborator Author

Might be cleaner if we do a prep commit that removes the old sealing as we know we're not going to do it anymore.

/// # Returns
///
/// The number of referrer artifacts exported.
pub fn export_referrers_to_oci_layout<ObjectID: FsVerityHashValue>(

Collaborator Author

Something like this could land as a prep commit

Comment thread crates/composefs-oci/src/oci_image.rs Outdated
use std::fs;
use std::io::Write;

let blobs_dir = oci_layout_path.join("blobs").join("sha256");

Collaborator Author

Use ocidir

format!("{seed:02x}").repeat(32)
}

fn sample_subject() -> Descriptor {

Collaborator Author

Let's unify this stuff with shared infra to generate an ocidir with known content

Comment thread crates/composefs/src/fsverity/ioctl.rs Outdated
Comment thread crates/composefs/src/fsverity/ioctl.rs Outdated
Comment thread crates/composefs/src/fsverity/keyring.rs Outdated
Comment thread crates/composefs/src/fsverity/keyring.rs Outdated
cgwalters force-pushed the sealing-impl branch 3 times, most recently from 361eeb7 to 2f93e4a, on March 6, 2026 12:24
@cgwalters

Collaborator Author

This one will need to logically depend on #225 because that one has a lot of hardening for the EROFS parser

cgwalters force-pushed the sealing-impl branch 2 times, most recently from 6b676dd to d226f55, on March 14, 2026 21:30
cgwalters force-pushed the sealing-impl branch 2 times, most recently from 83ea13e to 13f1957, on April 4, 2026 12:53
cgwalters force-pushed the sealing-impl branch 3 times, most recently from 0f06a47 to e0c049a, on May 23, 2026 12:41
cgwalters added 13 commits June 7, 2026 08:09
The biggest goal here is support for Linux kernel-native fsverity
signatures to be attached to layers, which enables integration with
IPE.

Add support for a fully separate OCI "composefs signature" artifact
which can be attached to an image.

Now, in discussion it came up that we could re-do an incremental
fetch design on top of this, and digging into that I think
that really wants a "canonical tar" format.

Add RFC/plans for all of that.

Assisted-by: OpenCode (Claude Opus 4.5)
Signed-off-by: Colin Walters <walters@verbum.org>
Extract sub-second nanoseconds from PAX extension mtime headers.
The tar-core crate only keeps the integer seconds; we now read the
fractional part ourselves and populate st_mtim_nsec accordingly.
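
A rough sketch of the fractional-part extraction described above. The function name `parse_pax_mtime` is hypothetical and this is not the code from the commit; it assumes a non-negative decimal value, pads short fractions to nanosecond precision, and truncates anything beyond nine digits.

```rust
/// Split a PAX `mtime` value such as "1700000000.5" into (seconds, nanoseconds).
/// Returns None if the value is not a non-negative decimal number.
pub fn parse_pax_mtime(value: &str) -> Option<(u64, u32)> {
    let (secs, frac) = match value.split_once('.') {
        Some((secs, frac)) => (secs, frac),
        None => (value, ""),
    };
    let secs = secs.parse::<u64>().ok()?;
    if !frac.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // Read exactly nine fractional digits, treating missing ones as zero,
    // so "5" becomes 500_000_000 and extra digits are dropped.
    let mut nanos: u32 = 0;
    for i in 0..9 {
        let digit = frac.as_bytes().get(i).map_or(0, |b| u32::from(b - b'0'));
        nanos = nanos * 10 + digit;
    }
    Some((secs, nanos))
}
```

The second element of the tuple is what would end up in `st_mtim_nsec`; pre-1970 (negative) timestamps would need extra handling that this sketch leaves out.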

Prep for V1 EROFS compatibility where nanosecond timestamps matter
for bit-for-bit reproducibility with the C composefs implementation.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Add expected_v1_id and expected_v1_bootable_id fields to the
ContainerImage test struct and pin values for all four fixture
images. This extends the digest stability test to cover the V1
EROFS writer alongside the existing V2 coverage.

Prep for the V1 EROFS OCI integration.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Migrate OCI crate callers to the new RepositoryConfig API and add
dual-format (V1+V2) EROFS image generation during OCI pull.

The V1 kernel cmdline karg uses a new self-describing format:

  composefs.digest=v1-sha256-12:<hex>
  composefs.digest=v1-sha512-12:<hex>

The value encodes the EROFS format version, hash algorithm, and
block size before the digest, mirroring how meta.json uses
fsverity-sha256-12. The stable key name composefs.digest works
naturally with ConditionKernelCommandLine= and allows multiple
entries on the same cmdline for different algorithm/format
combinations.
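
For illustration, a minimal parser for the karg shape shown above. All names here (`DigestKarg`, `parse_digest_karg`, `digest_kargs`) are hypothetical and not the ComposefsCmdline API, and reading the trailing `12` as a log2 block size is an assumption based on the `fsverity-sha256-12` analogy. Kernel cmdline quoting is ignored.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum HashAlgorithm {
    Sha256,
    Sha512,
}

#[derive(Debug, PartialEq, Eq)]
pub struct DigestKarg {
    pub format_version: u32,
    pub algorithm: HashAlgorithm,
    /// Assumed to be log2 of the block size (12 would mean 4096 bytes).
    pub block_size_log2: u8,
    pub digest_hex: String,
}

/// Parse one `composefs.digest=v<N>-<alg>-<bits>:<hex>` argument.
pub fn parse_digest_karg(arg: &str) -> Option<DigestKarg> {
    let value = arg.strip_prefix("composefs.digest=")?;
    let (descriptor, digest_hex) = value.split_once(':')?;
    let mut parts = descriptor.split('-');
    let format_version = parts.next()?.strip_prefix('v')?.parse::<u32>().ok()?;
    let (algorithm, hex_len) = match parts.next()? {
        "sha256" => (HashAlgorithm::Sha256, 64),
        "sha512" => (HashAlgorithm::Sha512, 128),
        _ => return None,
    };
    let block_size_log2 = parts.next()?.parse::<u8>().ok()?;
    if parts.next().is_some() {
        return None;
    }
    // The digest must be hex of exactly the length the algorithm implies.
    if digest_hex.len() != hex_len || !digest_hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(DigestKarg {
        format_version,
        algorithm,
        block_size_log2,
        digest_hex: digest_hex.to_string(),
    })
}

/// Collect every well-formed composefs.digest entry, in cmdline order.
pub fn digest_kargs(cmdline: &str) -> Vec<DigestKarg> {
    cmdline.split_ascii_whitespace().filter_map(parse_digest_karg).collect()
}
```

Because the key name stays fixed and the variant lives in the value, several entries can sit on one cmdline and `digest_kargs` returns them in order.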

The initramfs (composefs-setup-root) parses all composefs kargs
from the kernel cmdline in order, then tries to mount each image
in sequence — the first image that actually exists in the
repository wins. mount_composefs_image_if_exists() maps
ImageNotFound to Ok(None), letting the mount loop skip missing
images without swallowing real errors (verity mismatch,
permissions, etc.).
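
The skip-missing-but-fail-on-real-errors behaviour can be sketched roughly as follows. The types and function names are stand-ins chosen for this example (the real code works on repository images, not strings), and the mount step is passed in as a closure so the sketch stays self-contained.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum MountError {
    ImageNotFound,
    VerityMismatch,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Mounted(pub String);

/// Map "image missing" to Ok(None) and pass every other error through.
pub fn mount_if_exists<F>(image: &str, mount: &mut F) -> Result<Option<Mounted>, MountError>
where
    F: FnMut(&str) -> Result<Mounted, MountError>,
{
    match mount(image) {
        Ok(mounted) => Ok(Some(mounted)),
        Err(MountError::ImageNotFound) => Ok(None),
        Err(other) => Err(other),
    }
}

/// Try each candidate in cmdline order; the first one that exists wins.
pub fn mount_first_available<F>(
    images: &[&str],
    mut mount: F,
) -> Result<Option<Mounted>, MountError>
where
    F: FnMut(&str) -> Result<Mounted, MountError>,
{
    for image in images {
        // `?` propagates real failures; a missing image just continues the loop.
        if let Some(mounted) = mount_if_exists(image, &mut mount)? {
            return Ok(Some(mounted));
        }
    }
    Ok(None)
}
```

A verity mismatch on an earlier candidate therefore aborts the loop instead of silently falling through to a later image.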

The legacy composefs=<hex> karg continues to work for V2 EROFS.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
fuser 0.17 is needed to support multithreaded FUSE sessions: the new
API requires `Filesystem: Send + Sync + 'static`, which forces proper
Arc-based ownership of the filesystem state and makes it possible to
safely hand the implementation to multiple worker threads.

The breaking API changes and how they are addressed:

- `&self` instead of `&mut self` on all trait methods: the only mutable
  state (open file handles) is now protected by a Mutex.
- New newtypes (INodeNo, FileHandle, LockOwner, Generation) and bitflags
  (OpenFlags, FopenFlags) — updated at call sites.
- readdir/read offsets changed from i64 to u64.
- Session::from_fd now takes SessionACL + Config separately.
- Session::run() is no longer public; replaced by spawn().join().
- reply.error() takes fuser::Errno instead of raw i32.

To satisfy the `'static` bound, serve_tree_fuse() now takes
`Arc<FileSystem>` and `Arc<Repository>`. A pre-built flat Vec<InodeData>
(indexed by ino-1) replaces the old HashMap<Ino, InodeRef<'a>>, removing
the lifetime that was incompatible with `'static`. An InodeLookup index
(path→ino for dirs, LeafId→ino for leaves) handles child ino resolution
without raw pointers.
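
As a rough picture of the flat table, assuming inode numbers start at 1: the sketch below stores owned data only, so the table carries no lifetime and satisfies `'static`. It is a simplification, not the code from the commit. Here directories hold child inode numbers inline, whereas the commit describes a separate InodeLookup index, and the `InodeData` variants are invented for the example.

```rust
pub enum InodeData {
    Directory { children: Vec<(String, u64)> },
    Leaf { size: u64 },
}

pub struct InodeTable {
    inodes: Vec<InodeData>,
}

impl InodeTable {
    pub fn new() -> Self {
        Self { inodes: Vec::new() }
    }

    /// Append an inode and return its number (index + 1).
    pub fn push(&mut self, data: InodeData) -> u64 {
        self.inodes.push(data);
        self.inodes.len() as u64
    }

    /// Look up by inode number; 0 and out-of-range values yield None.
    pub fn get(&self, ino: u64) -> Option<&InodeData> {
        let index = usize::try_from(ino.checked_sub(1)?).ok()?;
        self.inodes.get(index)
    }

    /// Resolve a child name inside a directory to its inode number.
    pub fn lookup(&self, parent: u64, name: &str) -> Option<u64> {
        match self.get(parent)? {
            InodeData::Directory { children } => children
                .iter()
                .find(|(child, _)| child.as_str() == name)
                .map(|(_, ino)| *ino),
            InodeData::Leaf { .. } => None,
        }
    }
}
```

Since nothing in `InodeTable` borrows from the filesystem tree, it can be wrapped in an `Arc` and shared across worker threads.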

Assisted-by: OpenCode (claude-sonnet-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Wire the composefs-fuse crate into cfsctl behind a new `fuse` cargo
feature (on by default) and expose it through both the command line and
the varlink RPC API, with an integration test exercising the FUSE mount
end to end.

CLI surface:
  - `cfsctl fuse-serve <image> <mountpoint>` serves an EROFS composefs
    image over FUSE from a file on disk.
  - `cfsctl oci mount --fuse[=<opts>]` FUSE-serves an OCI image's EROFS
    instead of doing a kernel composefs mount, so it works without
    fs-verity on the backing store. `--fuse=passthrough` opts into
    kernel-bypass reads (Linux 6.9+). Options are parsed via a small
    FuseOptions FromStr so the surface can grow without new flags.
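
A sketch of what a `FromStr`-based option parser of this kind might look like. Only the `passthrough` option and the `FuseOptions` name come from the description above; the comma-separated syntax, the `String` error type, and the field layout are assumptions made for the example.

```rust
use std::str::FromStr;

#[derive(Debug, Default, PartialEq, Eq)]
pub struct FuseOptions {
    pub passthrough: bool,
}

impl FromStr for FuseOptions {
    type Err = String;

    /// Parse a comma-separated option list; an empty string means defaults.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let mut options = FuseOptions::default();
        for option in s.split(',').filter(|o| !o.is_empty()) {
            match option {
                "passthrough" => options.passthrough = true,
                other => return Err(format!("unknown FUSE option: {other}")),
            }
        }
        Ok(options)
    }
}
```

Adding a new option then means one more field and one more match arm, with no new command-line flag.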

Varlink surface:
  - `org.composefs.Repository.FuseServe` and `org.composefs.Oci.OciFuseMount`
    let a client drive FUSE mounts over the RPC socket. Both take a
    `wait` parameter: with `wait=true` the call blocks for the session;
    with `wait=false` the FUSE session is detached into a background task
    and the call returns once the mount is registered, so a caller can
    mount and then go on to use the filesystem.

The privileged_fuse_dumpfile_roundtrip integration test spawns
`cfsctl fuse-serve` as a subprocess, polls for mount readiness via st_dev
change, reads external files directly, and compares the dumpfile produced
by `cfsctl create-dumpfile` over the FUSE mount against the expected
output from write_dumpfile, asserting the FUSE implementation reports
every piece of metadata the dumpfile format captures. Uses
similar_asserts for readable diffs on mismatch.

Assisted-by: OpenCode (claude-sonnet-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Implement readdirplus (combined readdir + lookup in one round-trip),
no-op forget (inode table is static for session lifetime), and
FOPEN_KEEP_CACHE on open replies.

Serve with one thread per logical CPU using FUSE_DEV_IOC_CLONE
(clone_fd=true) so each worker gets its own /dev/fuse fd, eliminating
per-request channel lock contention. Arc<OwnedFd> allows read() to
clone the handle and drop the mutex before calling pread, so concurrent
reads on the same file don't serialise.
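
The clone-then-unlock pattern for reads can be sketched as below. This is an illustration under assumptions rather than the code from the commit: it uses `Arc<File>` with `FileExt::read_at` (Unix only) in place of `Arc<OwnedFd>` and a raw pread, and `OpenFiles` is a hypothetical name.

```rust
use std::collections::HashMap;
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt;
use std::sync::{Arc, Mutex};

pub struct OpenFiles {
    handles: Mutex<HashMap<u64, Arc<File>>>,
}

impl OpenFiles {
    pub fn new() -> Self {
        Self { handles: Mutex::new(HashMap::new()) }
    }

    pub fn insert(&self, fh: u64, file: File) {
        self.handles.lock().unwrap().insert(fh, Arc::new(file));
    }

    /// Positional read that does not hold the table lock during I/O.
    pub fn read(&self, fh: u64, offset: u64, len: usize) -> io::Result<Vec<u8>> {
        // The MutexGuard is a temporary: it is released at the end of this
        // statement, after the Arc has been cloned out of the map.
        let file = self.handles.lock().unwrap().get(&fh).cloned();
        let file = file
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "unknown file handle"))?;
        let mut buf = vec![0u8; len];
        let n = file.read_at(&mut buf, offset)?;
        buf.truncate(n);
        Ok(buf)
    }
}
```

Two threads reading the same handle contend only for the brief map lookup; the reads themselves proceed in parallel.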

Add FUSE passthrough support (Linux 6.9+): when FuseConfig::passthrough
is true and the kernel advertises FUSE_PASSTHROUGH, external file reads
are routed directly in-kernel to the repository object fds. Opt-in via
FuseConfig because passthrough requires root and a non-tmpfs backing
filesystem.

Assisted-by: OpenCode (claude-sonnet-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Prep for OCI sealing, which needs the byte block size and hash digest
size when validating composefs.* artifact annotations. These mirror the
helpers on the (soon-to-be-removed) ComposeFsAlgorithm so signature.rs
can use the canonical Algorithm type directly.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Add a `keyring` feature that exposes `inject_fsverity_cert` and
`KeyringError`, backed by `keyutils 0.4` (which provides
`Keyring::new` and `keytypes::Asymmetric` needed to add X.509
certificates to the kernel's .fs-verity keyring).

The implementation uses `keyutils::Keyring::new` to locate the
`.fs-verity` special keyring and `add_key` to inject PEM-decoded
DER certificates.

Assisted-by: OpenCode (claude-sonnet-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
…est, ioctl)

Add `algorithm.rs` (ComposeFsAlgorithm enum for EROFS/signature types),
`formatted_digest.rs` (hex-encoded digest with known format), and extend
`ioctl.rs` with `fs_ioc_enable_verity_with_sig` to pass a PKCS#7
signature blob when enabling verity.

The `fsverity::mod` re-exports `inject_fsverity_cert` from
composefs-ioctls under the `keyring` feature, and exposes
`enable_verity_raw_with_sig` for callers that have pre-computed the
fs-verity descriptor and signature.

Adapted for PR#297/306: removed duplicate ComposeFsAlgorithm type;
canonical type is composefs::fsverity::Algorithm.

Assisted-by: OpenCode (claude-sonnet-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Add three new modules:
- `signing.rs`: PKCS#7/openssl-backed `FsVeritySigningKey` for
  signing fs-verity digests, with PEM cert/key parsing.
- `signature.rs`: `SignatureArtifactBuilder` that constructs an OCI
  artifact manifest containing EROFS layers, PKCS#7 signature blobs,
  and a config descriptor, implementing the composefs signing spec.
  Also exposes `sign_image`/`verify_image_signatures` for CLI and
  varlink consumers, and `parse_signature_artifact` for the verify path.
- `referrers.rs`: `find_composefs_artifacts` to fetch OCI referrer
  manifests locally from repo, used by the verify command.

Update `image.rs` / `oci_image.rs` / `boot.rs` / `lib.rs` to adapt
to upstream API changes (FormatVersion-aware helpers, OciRefNotFound on
ENOENT, containers_image_proxy oci-spec import paths).

Add `openssl` as a dep (signing.rs) and optional `oci-client` feature.

Adapted for PR#297/306: FormatVersion threading through all digest/image
helpers, Algorithm type is composefs::fsverity::Algorithm throughout.

Assisted-by: OpenCode (claude-sonnet-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Add OCI sealing and signing workflows to cfsctl:
- `oci seal`: Commits EROFS images for all layers and the merged
  rootfs into the repository.
- `oci sign`: Creates a composefs PKCS#7 signature OCI artifact.
- `oci verify`: Fetches referrer artifacts and validates EROFS layer
  digests against embedded signatures.
- `oci run` / `oci stop`: Runs a container from a pulled OCI image
  by generating an OCI runtime spec, mounting a composefs overlay,
  and invoking crun/runc.
- `keyring add-cert`: Injects an X.509 PEM certificate into the
  kernel's .fs-verity keyring (requires CAP_SYS_ADMIN).
- `oci export`: Exports an image to an OCI layout directory.
- `oci composefs-digest-karg`: Print composefs kernel cmdline arg.

Adapted for PR#297/306: use composefs_oci::sign_image/verify_image_signatures
(library fns), ComposefsCmdline API for karg generation, version threading.

Assisted-by: OpenCode (claude-sonnet-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Mirror the OCI sealing CLI surface onto the org.composefs.Oci varlink
interface so external callers (and a future cfsctl-as-client) can drive
sealing through structured RPC instead of scraping CLI output.

Seal and Verify are the primary RPC surface; Sign is included for
completeness. Certificate and key material is passed as PEM *content*
rather than file paths: the daemon may run in a different user or mount
namespace than the client and cannot reliably read client-side files.
The private key therefore transits the (local, typically root-owned)
Unix socket — noted in the method docs.

Mounting and `keyring add-cert` are intentionally not exposed: both need
CAP_SYS_ADMIN and operate on the daemon's own filesystem/host view, so
the verification gate (Verify, returning a count) is the useful RPC
primitive and the mount syscall stays with the caller.

Two new OciError variants — InvalidCertificate and
SignatureVerificationFailed — let clients distinguish a genuine
verification failure from an internal error. The wrong-cert integration
test asserts the typed error rather than a bare failure, so a no-op
check would not pass.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
@alexlarsson

Contributor

I've been doing some experimental work on something similar with the current composefs support in podman at podman-container-tools/podman#28658. That code encodes the expected composefs digests in the manifests and assumes it will be reproducibly generated from the tar layers locally. To get trust in that I then have to validate the manifest signature at runtime. I think long term I prefer the approach proposed here of distributing the erofs images directly, particularly because it allows kernel side signatures that can integrate with IMA, etc.

However... I wonder whether signing only the erofs image is missing something. Don't we actually want to sign the entire container specification, not just the file content? What about other parts of the image, like exposed ports, volumes, cmdline, etc? With the signature on the erofs blob we leave all the stuff in the manifest and config "unprotected".

@cgwalters

Collaborator Author

> Don't we actually want to sign the entire container specification, not just the file content? What about other parts of the image, like exposed ports, volumes, cmdline, etc? With the signature on the erofs blob we leave all the stuff in the manifest and config "unprotected".

In the current design, the manifest and config are always stored as external objects, and we can then apply fsverity signatures to those as well.

It's a bit buried in the spec, but see https://github.com/composefs/composefs-rs/pull/224/changes#diff-def2a4eef2075f93a81da71d729f633c5f748feff036fd3789252ee37cf37dfdR281

@cgwalters

Collaborator Author

A simple way to say it is, I think with this proposal we don't actually need to sign the manifest via cosign at all - runtime integrity >= transport integrity.

But of course in practice most use cases would do so, because you still want the transport integrity for deployments that aren't using runtime integrity yet.

2 participants