docs: correct memory-sharing model from #632 measurements by ejc3 · Pull Request #666 · ejc3/fcvm

ejc3 · 2026-06-12T06:19:25Z

Doc/comment-only corrections driven by the measured experiments in #632 (no code behavior change).

Summary

track_dirty_pages does NOT erode page-cache sharing — measured in [epic] Hypervisor-agnostic abstraction: support Firecracker AND Cloud Hypervisor (the two microVMs) behind a pluggable trait #632: 3× 1GiB File-backend clones ≈ 230MiB total PSS with tracking ON or OFF (<1MiB apart). Corrected the RestoreParams comment (src/commands/common.rs) and the --no-dirty-tracking help text (src/cli/args.rs), both of which claimed tracking defeats sharing. The flag's real costs: KVM logging overhead + 2MB→4K hugepage split.
The UFFD serve/clone path is lazy population, not cross-VM sharing — guest RAM is MAP_ANONYMOUS and UFFDIO_COPY makes a private per-VM copy of each faulted page (verified against the firecracker fork's persist.rs/vstate/memory.rs). Fixed .claude/CLAUDE.md UFFD sections, the --pid help text, and cmd_snapshot_run doc/output wording that said "memory sharing".
docs/hypervisor-abstraction.md refreshed with measured results: split shared_memory_clones into file_backed_cow_restore / external_uffd_lazy_restore / internal_uffd_lazy_restore; recorded the CH v52 ARM64 snapshot blocker (GetAarchCoreRegister EINVAL) and CH boot requirements (console=ttyAMA0, image_type=raw, --cpus boot=, P0.5 as guest-channel prerequisite); softened the "byte-for-byte identical vsock / one GuestChannel impl" claim to "same design, live CONNECT parity unverified pending P0.5".

Second commit addresses local codex review findings (stale wording that contradicted the corrected model).

Test Results

$ cargo fmt --check          # clean
$ cargo clippy --all-targets -- -D warnings   # clean

Comment/doc-only; measurements backing the claims: #632 (comment)

Measurements in #632 (3x 1GiB File-backend clones, smaps_rollup) disproved two long-standing claims: - track_dirty_pages does NOT erode page-cache sharing: ON vs OFF differed by <1MiB total PSS (230MiB for 3 clones either way). The mapping is MAP_PRIVATE in both modes; tracking's real costs are KVM logging overhead and the 2MB->4K hugepage split. Fixed the RestoreParams comment in commands/common.rs and the --no-dirty-tracking help text in cli/args.rs, which both claimed tracking defeats sharing. - The UFFD serve/clone path is lazy population, NOT cross-VM page sharing: guest RAM is MAP_ANONYMOUS and UFFDIO_COPY makes a private per-VM copy of each faulted page. Fixed the .claude/CLAUDE.md UFFD sections that claimed kernel page-cache sharing across UFFD clones. Also refreshed docs/hypervisor-abstraction.md with the measured results: split shared_memory_clones into file_backed_cow_restore / external_uffd_lazy_restore / internal_uffd_lazy_restore, recorded the CH v52 ARM64 snapshot blocker (GetAarchCoreRegister EINVAL) and the CH boot requirements (ttyAMA0, image_type=raw, --cpus boot=, P0.5 prerequisite for guest channels). Comment/doc-only change; no code behavior altered. Tested: cargo fmt --check, cargo clippy --all-targets -D warnings (clean)

claude-claude · 2026-06-12T06:21:36Z

🔍 Claude Review

SEVERITY: none

Findings

No issues found. This is a well-crafted documentation-only PR that corrects two previously incorrect claims based on actual measurements from #632:

UFFD sharing correction (.claude/CLAUDE.md): Correctly updates the UFFD clone description from "kernel page cache sharing across UFFD clones" to "lazy population with per-VM copies (UFFDIO_COPY)", and distinguishes File-backend CoW sharing (which does work via MAP_PRIVATE page cache) from UFFD behavior.
Dirty tracking correction (src/cli/args.rs, src/commands/common.rs): Removes the incorrect claim that track_dirty_pages defeats page-cache sharing. The updated comments correctly note sharing works in both modes (measured: <1MiB PSS difference).
Design doc refresh (docs/hypervisor-abstraction.md): Splits the monolithic shared_memory_clones capability into three precise capabilities (file_backed_cow_restore / external_uffd_lazy_restore / internal_uffd_lazy_restore), adds measured results, records CH v52 ARM64 snapshot blocker, and documents P1 boot requirements. The capability table, struct definition, hard-problems section, phased plan, and open research items are all internally consistent.

All four changed files are comment/doc-only — no code behavior altered, confirmed by diff inspection.

Summary

Clean documentation correction grounded in empirical measurements. No code, consistency, or accuracy issues found.

Review by Claude | Run

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 767ff6bd05

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-12T06:22:32Z

+    pub diff_snapshots: bool,             // FC: true,  CH: false
+    pub file_backed_cow_restore: bool,    // FC: true (measured, #632), CH: unverified (snapshot blocked)
+    pub external_uffd_lazy_restore: bool, // FC: true (fork handshake w/ page_size), CH: false
+    pub internal_uffd_lazy_restore: bool, // FC: n/a,   CH: ondemand (v52; unverified on ARM64)


Gate CH ondemand restore until ARM64 snapshots work

For the ARM64 scope this design is documenting, CH v52 cannot create snapshots (GetAarchCoreRegister EINVAL is recorded later in this same file), so advertising internal_uffd_lazy_restore as a CH capability here can lead the implementation/test matrix to enable a restore path that is known to fail before restore even starts. Please make this capability false or explicitly arch/version-gated until snapshot creation works on ARM64.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-12T06:22:32Z

+   - UFFD clones populate lazily (faulted pages are per-VM copies); File-backend
+     clones share clean pages via the page cache (measured in #632)


Remove the stale one-copy RAM claim

This section now correctly says UFFD faults become per-VM copies, but the next bullet still claims an original plus two clones use only one VM's RAM (~512MB). For UFFD clones that fault the same working set, memory can grow toward one copy per VM (plus the server's cached snapshot), so agents reading this status note can still repeat the old sharing assumption and under-size clone experiments; scope that number to an unfaulted/lazy case or replace it with the measured File-backend result.

Useful? React with 👍 / 👎.

Local codex review of the previous commit flagged wording that still contradicted the corrected model: - args.rs --pid help and snapshot.rs doc/output called the UFFD path 'memory sharing' — now 'lazy on-demand paging' / 'served on-demand' - CLAUDE.md clone RAM claim qualified as idle/working-set behavior - design doc no longer claims byte-for-byte vsock identity / a proven single GuestChannel impl — live CONNECT parity is unverified pending P0.5 (matches the open research items) Tested: cargo fmt --check, cargo clippy --all-targets -D warnings (clean)

chatgpt-codex-connector Bot reviewed Jun 12, 2026

View reviewed changes

ejc3 merged commit bc6da9a into main Jun 12, 2026
6 of 17 checks passed

ejc3 deleted the work-632-sharing-docs branch June 12, 2026 06:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: correct memory-sharing model from #632 measurements#666

docs: correct memory-sharing model from #632 measurements#666
ejc3 merged 2 commits into
mainfrom
work-632-sharing-docs

ejc3 commented Jun 12, 2026 •

edited

Loading

Uh oh!

claude-claude Bot commented Jun 12, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 12, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		- UFFD clones populate lazily (faulted pages are per-VM copies); File-backend
		clones share clean pages via the page cache (measured in #632)

Conversation

ejc3 commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test Results

Uh oh!

claude-claude Bot commented Jun 12, 2026

🔍 Claude Review

Findings

Summary

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ejc3 commented Jun 12, 2026 •

edited

Loading