Conversation
PR SummaryMedium Risk Overview Reviewed by Cursor Bugbot for commit d220659. Bugbot is set up for automated code reviews on this repo. Configure here. |
6d2b804 to
314abd0
Compare
1ab27f8 to
4f0b47b
Compare
| return cache, nil | ||
| } | ||
|
|
||
| err = cache.copyFromMemfd(ctx, memfd, ranges) |
There was a problem hiding this comment.
Adding note about the lazy copy in the bg: main...lazy-memory-prototype#diff-489cdf017544bf247b96eb923da0d4273b87fe0b08162fbb801d2ad73148e120R126
❌ 6 Tests Failed:
View the full list of 12 ❄️ flaky test(s)
To view more test analytics, go to the Test Analytics Dashboard |
4ae4ebb to
9198a96
Compare
|
Update: I've removed the logic that punches holes in the memfd, progressively after copying data into the diff file. I've ran some experiments and got some signal about this causing increase in CPU utilization and slowing down PAUSE and RESUMEs. I think that we can proceed with adding support for memfd and revisiting after the deduplication work. |
ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one. |
d245968 to
ae76b44
Compare
830d2cd to
2015b36
Compare
There was a problem hiding this comment.
Additional findings (outside current diff — PR may have been updated during review):
- 🔴
packages/orchestrator/pkg/sandbox/uffd/uffd.go:178-184— The cleanup loopfor fd := range fds { syscall.Close(fd) }at uffd.go:178-181 uses Go's single-variable range form, which iterates over indices, not values. On theNewUserfaultfdFromFderror path this closes the orchestrator's stdin (fd 0) and stdout (fd 1) — corrupting any stdout-bound logging — while leaking the actual UFFD and memfd that were received from Firecracker (the memfd in particular pins the entire guest memory mapping in the kernel).
Fix is for _, fd := range fds { syscall.Close(fd) }.
Extended reasoning...
What the bug is
In Go, for x := range slice with a single loop variable iterates over the slice's indices, not its values. To iterate over values you need for _, x := range slice. The newly added cleanup at packages/orchestrator/pkg/sandbox/uffd/uffd.go:178-184 reads:
if err != nil {
for fd := range fds {
syscall.Close(fd)
}
return fmt.Errorf("failed to create uffd: %w", err)
}fds is []int (returned by syscall.ParseUnixRights a few lines above). With the loosened length check at line 167 (len(fds) != 1 && len(fds) != 2), fds will have length 1 or 2 here, so fd takes the values 0 and (when a memfd was passed) 1 — the indices into fds, not the FD numbers Firecracker actually handed over via SCM_RIGHTS.
Step-by-step proof
Suppose NewUserfaultfdFromFd fails after Firecracker has sent the orchestrator a UFFD fd (kernel-assigned number 42) and a memfd (kernel-assigned number 43):
syscall.ParseUnixRightsreturnsfds = [42, 43].- Length check passes (
len(fds) == 2). userfaultfd.NewUserfaultfdFromFd(uintptr(fds[0]), …)returns an error.- The cleanup loop runs. Because of single-variable range semantics, the loop body sees
fd = 0, thenfd = 1. syscall.Close(0)closes the orchestrator's stdin.syscall.Close(1)closes the orchestrator's stdout — and now any zap logger writing to stdout will silently fail, or, worse, write to whatever fd the kernel reassigns 0/1 to next (a sandbox socket, a database conn, etc.). That's a real correctness/security hazard, not just lost log lines.- fds 42 and 43 — the actual UFFD fd and memfd — stay open in the orchestrator's fd table for the lifetime of the process. The memfd is the costly leak: it holds an mmap of the whole guest memory region in the kernel.
Why existing code doesn't prevent it
The cleanup defer added at lines 187-198 only runs after u.handler.SetValue(uffd) is reachable — i.e., it only handles the success path's later teardown. The early-return branch at line 178 is exactly the place this leak was supposed to be addressed, so the defer can't catch it.
Impact
Two distinct failures fire on every NewUserfaultfdFromFd error after FD handover:
- Logging/IO corruption: closing fd 0 and fd 1 corrupts the orchestrator's standard streams. The kernel reuses the lowest free fd numbers for subsequent
open/socket/acceptcalls, so future stdout writes can land in arbitrary fds. - FD + memory leak: the UFFD fd and memfd are leaked. The memfd in particular pins the entire guest memory mapping until the orchestrator process exits.
The trigger is rare (any failure inside NewUserfaultfdFromFd after Firecracker has already sent the SCM_RIGHTS fds), but when it fires the consequences are severe and exactly opposite to what the cleanup was added to prevent.
How to fix
Change one line:
for _, fd := range fds {
syscall.Close(fd)
}This iterates over fd values instead of indices, closing the actually-received fds and leaving stdin/stdout alone.
4d06cf8 to
384e6a1
Compare
c91c781 to
0576f1c
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 0576f1c. Configure here.
We are making a larger change to enable memfd-backed guest memory in Firecracker. When enabled, Firecracker passes over the memfd file descriptor over the UFFD UDS, alongside the UFFD file descriptor. Parse this and create a Memfd object, which we can later use to interact with the sandbox memory. Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Currently, orchestrator calls process_vm_readv() system call to copy memory from Firecracker process into the cache backing file. Add logic to use read directly from memfd, when that is present. Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Add a feature flag that controls whether the orchestrator will instruct Firecracker to use memfd for backing the guest memory. Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
There was a problem hiding this comment.
An organization admin can view or raise the cap at claude.ai/admin-settings/claude-code. The cap resets at the start of the next billing period.
Once the cap resets or is raised, push a new commit or reopen this pull request to trigger a review.

What
In Unix OSs
memfdis an anonymous file that can be used to back memory. Firecracker uses this construct when it needs to share memory with external processes (currently, when using vhost-user devices).Currently, when we take a snapshot of the sandbox (for example, during
PAUSEoperations) we need to copy its memory usingprocess_vm_readv.memfdallows us to do this in a more idiomatic way.Why
memfdallows us to have a direct view of the sandbox memory from the orchestrator without having to copy memory across processes. Moreover, if the orchestrator holds a reference tomemfd, we can post process the sandbox memory after the Firecracker process is killed. This opens up possibilities for various latency and memory utilization optimizations.What we do in this PR is that we change the cache logic to use memfd to copy Firecracker memory into the diff file if the memfd is present.