Skip to content

fix memory leak on lost packets#21

Draft
jeff-hykin wants to merge 4 commits into
rust-codegenfrom
jeff/fix/rust-frag-reassembly-leak
Draft

fix memory leak on lost packets#21
jeff-hykin wants to merge 4 commits into
rust-codegenfrom
jeff/fix/rust-frag-reassembly-leak

Conversation

@jeff-hykin
Copy link
Copy Markdown
Member

@jeff-hykin jeff-hykin commented May 20, 2026

When there are dropped LCM packets, the hashmap keeps data forever.

The C LCM version has an LRU eviction policy to avoid this becoming a memory leak, aso this PR ports that to Rust.

Three new tests to cover these cases.

Run the tests:

cd tools/rust/lcm
cargo test

Also added a burst_test example to characterize how much LCM can take before dropping. Run two terminals:

cargo run --release --example burst_test receiver 100
cargo run --release --example burst_test sender 500000 100 100

Mirrors the C library's `lcm_frag_buf_store` semantics from
udpm_util.{c,h}: cap the in-flight reassembly buffer at 16 MiB total
bytes and 1000 entries, evicting the least-recently-updated entry
when either limit is exceeded.

Before this change the Rust LCM port's reassembly map was an
unbounded HashMap with no eviction; one dropped UDP datagram in a
fragmented message left the other N-1 fragments stuck in the map
forever. Under sustained large-message (>65 KB) load with realistic
multicast loopback drop rates, incomplete entries accumulated, the
locked HashMap grew without bound, lock contention starved the
receive thread, and subsequent message drop rates compounded.

Discovered during DimOS PGO Rust port KITTI-360 benchmarking
(PassivePlatypus task): ~50% drop rate on 10 Hz × 500 KB
PointCloud2 publishes, with verified 128 MB SO_RCVBUF on both sides
of the channel and `set_queue_capacity(10000)` on every
subscription. Buffer-layer mitigations didn't move the drop rate
because the drops were upstream of the socket — in this map.

Implementation:
- Added `last_update: Instant` to `FragmentBuffer`. Updated on every
  fragment arrival on that entry.
- New `FragStore` wrapper around the HashMap tracks `total_bytes`
  alongside the entries.
- `FragStore::evict_lru()` picks the entry with the oldest
  `last_update` (linear scan — matches C library's
  `_find_lru_frag_buf` g_hash_table_foreach pass).
- `FragStore::enforce_caps()` evicts in a loop until both caps fit.
  Called from `process_fragment` only when a new entry was just
  inserted; completed-message removals already bring totals down so
  no enforce is needed there.

Caps:
- `MAX_FRAG_BUF_TOTAL_BYTES = 16 MiB` (matches C
  `MAX_FRAG_BUF_TOTAL_SIZE = 1 << 24`).
- `MAX_NUM_FRAG_BUFS = 1000` (matches C `MAX_NUM_FRAG_BUFS`).

Tests added (all passing):
- frag_store_evicts_when_over_byte_cap: insert >16 MiB across entries,
  assert total_bytes drops to <= cap and at least one entry evicted.
- frag_store_evicts_when_over_entry_cap: insert MAX_NUM_FRAG_BUFS+5
  small entries, assert count is exactly MAX_NUM_FRAG_BUFS after.
- frag_store_evict_lru_picks_oldest: two entries with stagged
  last_update, verify evict_lru removes the older one.

Existing loopback round-trip tests (6 tests) still pass — eviction
doesn't kick in for happy-path traffic since reassembly entries are
typically removed on completion within microseconds.

See ~/repos/jhist/wiki/rust-lcm-fragmentation-bug.md for the full
discovery narrative + C-vs-Rust diff.

Branch: jeff/fix/rust-frag-reassembly-leak
Adds an optional `recv_buf_size: Option<usize>` field to `LcmOptions`
that calls `socket2::set_recv_buffer_size` on the UDP receive socket
before binding. None (default) leaves the kernel default in place
(typically `net.core.rmem_default` on Linux after
`BufferConfiguratorLinux` raises it to 64 MiB).

Useful when the LCM client wants a deterministic, larger-than-default
buffer for high-rate publishers of fragmented messages — e.g. ~500 KB
PointCloud2 over LCM at 10 Hz. The matching change in
`dimos/protocol/service/lcmservice.py` passes `recv_buf_size` via the
URL query string for the Python LCM client; the Rust port needs an
equivalent path.

On Linux the kernel doubles the requested value, so requesting 64 MiB
yields a usable 128 MiB. socket2 silently clamps to `rmem_max`.

This is a non-breaking addition (defaults to `None`); existing
`LcmOptions::default()` callers see no behavioral change.
…limits

Usage:
  cargo run --release --example burst_test sender <msg_bytes> <count> <interval_ms>
  cargo run --release --example burst_test receiver <count>
  cargo run --release --example burst_test slow_receiver <count> <handler_ms>

Used during the dimos PGO investigation to bisect "is this LCM or the
application layer?". Findings on this host (Linux, lo interface,
SO_RCVBUF 128 MB after the URL-param fix):

  500 KB × 10 Hz × 10 s        — 0% drops (4.7 MB/s benchmark rate)
  500 KB × 89 Hz × 3 s         — 0% drops (42 MB/s)
  500 KB × 10 Hz × 100 s
    with 200 ms slow handler    — 27.6% drops (sustained back-pressure)
  500 KB × 100 msgs × 0 ms      — 0% drops (213 MB/s peak burst)

Confirms LCM drops only under sustained handler back-pressure where
cumulative backlog exceeds the SO_RCVBUF. Short bursts and well-paced
streams within the kernel buffer's capacity transmit cleanly.

Useful as a regression / hardware-characterization tool for future
hosts and as a unit-style test for LCM library changes.
@jeff-hykin jeff-hykin changed the title fix: evict stale fragments from reassembly map fix memory leak on lost packets May 20, 2026
jeff-hykin added a commit to dimensionalOS/dimos that referenced this pull request May 22, 2026
INPUT_CHANNEL_CAPACITY was 16. With LCM publishers at 10 Hz and any
handler that processes slower than line rate (e.g. PointCloud2 decode +
loop closure on 500 KB messages), the per-input mpsc fills in <2 seconds
and `try_send` starts dropping new messages.

This was the actual root cause of the ~50% "LCM drops" I'd been
investigating on the pgo_rust KITTI-360 benchmark. The kernel reports
zero drops (UDP RcvbufErrors=0, lo interface drop=0), and the LCM
library was fine — the drops were here, at the Rust dimos-module layer.

Empirical verification: 500-scan seq8 run with full ~500 KB clouds:
  - Before this change: 467/500 corrected_odometry_samples (6.6% loss)
  - After this change:  500/500 corrected_odometry_samples (0% loss)

Bumped both INPUT_CHANNEL_CAPACITY and PUBLISH_CHANNEL_CAPACITY to 1024.
At the same 10 Hz publish, the input queue now holds ~100 s of backlog
which is plenty for any reasonable handler.

The Rust LCM fragmentation-reassembly leak (PR
dimensionalOS/dimos-lcm#21) is still a real but separate bug — it
prevents memory leaks from dropped fragments but doesn't fix this drop
source. Both fixes are independently useful.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant