fix(node): evict imported blocks' transactions from mempool + periodic expiry#136

Merged
lai3d merged 1 commit into main from claude/mempool-evict-on-import
Jul 5, 2026

Conversation

@lai3d lai3d commented Jul 5, 2026

Collaborator

Fixes #135.

Problem

Mempool eviction only happened at self-production (producer.rs:457). Blocks imported from other validators never evicted their txs, and select_with_nonce skips stale-nonce txs without removing them — so with 3 validators in round-robin, ~2/3 of every node's pool became immortal zombies. Observed on the post-reset testnet: oldest_tx_age climbing linearly past 45 min (T2 alert at 15m permanently red), node-3 depth stepping up monotonically.
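
The ~2/3 figure can be checked with a small model. The sketch below is not the node's code: `leftover_after` and all of its parameters are hypothetical, and it assumes every node receives every transaction through gossip before the block that mines it.

```rust
use std::collections::HashSet;

/// Hypothetical model of one validator's pool under round-robin production.
/// Returns how many transactions remain in the pool of validator `me`
/// after `blocks` blocks have been produced.
fn leftover_after(
    blocks: u64,
    validators: u64,
    txs_per_block: u64,
    me: u64,
    evict_on_import: bool,
) -> usize {
    let mut pool: HashSet<u64> = HashSet::new();
    let mut next_tx = 0u64;
    for height in 0..blocks {
        // Assumption: the node has already seen these transactions via gossip.
        let block: Vec<u64> = (next_tx..next_tx + txs_per_block).collect();
        next_tx += txs_per_block;
        pool.extend(block.iter().copied());

        // Old behaviour: evict only when this node produced the block.
        // New behaviour: evict on every imported block as well.
        let produced_locally = height % validators == me;
        if produced_locally || evict_on_import {
            for tx in &block {
                pool.remove(tx);
            }
        }
    }
    pool.len()
}
```

With 3 validators, 9 blocks, and 10 transactions per block, `leftover_after(9, 3, 10, 0, false)` leaves 60 of the 90 transactions in the pool, while the same call with `evict_on_import = true` leaves none.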

Fix

  • SyncManager::evict_imported_txs — on every successful import_block (all 4 sites: gossip handle_block, fetched-block task, pending-blocks drain, range-sync catch-up) remove the block's txs from the local pool, keyed by blake3(tx.to_bytes_without_signature()) — the same key space the pool uses at insert and producer.rs uses at production.
  • Periodic remove_expired() in the catch-up tick (5s) as a backstop for txs that will never be mined under their exact hash (e.g. superseded duplicates). Previously the pool's 1h tx_lifetime had zero callers.
  • Contract test in pool.rs pinning the recomputed-hash key space, so a future serialization change that silently breaks eviction fails CI.
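
A minimal sketch of the two eviction paths, for illustration only. The `Tx` and `Pool` types here are hypothetical stand-ins rather than the real qfc-mempool types, and std's `DefaultHasher` replaces blake3 so the example needs no external crate. Only the names `evict_imported_txs`, `remove_expired`, `to_bytes_without_signature`, and `tx_lifetime` come from this PR; their signatures are assumed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant};

/// Hypothetical transaction type; the real one is not shown in this PR.
#[derive(Clone)]
struct Tx {
    payload: Vec<u8>,
    #[allow(dead_code)] // deliberately never part of the pool key
    signature: Vec<u8>,
}

impl Tx {
    /// Stand-in for the real serializer: the signature is excluded.
    fn to_bytes_without_signature(&self) -> Vec<u8> {
        self.payload.clone()
    }
}

/// Pool key. Assumption: DefaultHasher stands in for blake3 here.
fn pool_key(tx: &Tx) -> u64 {
    let mut hasher = DefaultHasher::new();
    tx.to_bytes_without_signature().hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical pool: key -> (transaction, insertion time).
struct Pool {
    txs: HashMap<u64, (Tx, Instant)>,
    tx_lifetime: Duration,
}

impl Pool {
    fn new(tx_lifetime: Duration) -> Self {
        Pool { txs: HashMap::new(), tx_lifetime }
    }

    fn insert(&mut self, tx: Tx, now: Instant) {
        self.txs.insert(pool_key(&tx), (tx, now));
    }

    /// Removes an imported block's transactions by recomputing the same key
    /// used at insert time. Returns how many were actually in the pool.
    fn evict_imported_txs(&mut self, block_txs: &[Tx]) -> usize {
        let mut evicted = 0;
        for tx in block_txs {
            if self.txs.remove(&pool_key(tx)).is_some() {
                evicted += 1;
            }
        }
        evicted
    }

    /// Backstop: drops entries at least `tx_lifetime` old. Returns the count dropped.
    fn remove_expired(&mut self, now: Instant) -> usize {
        let before = self.txs.len();
        let lifetime = self.tx_lifetime;
        self.txs
            .retain(|_, (_, inserted)| now.duration_since(*inserted) < lifetime);
        before - self.txs.len()
    }
}
```

Because insert and eviction share one key function, a serialization change shifts both sides together inside the pool, but a caller that recomputes the key differently would silently miss. That is the failure mode the contract test is meant to catch.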

Verification

  • cargo test -p qfc-mempool 7/7 (incl. new test)
  • cargo test -p qfc-node unit 18/18, integration 6/6 (release binary)
  • Post-deploy check: T2 dashboard oldest pending tx age should saturate and reset instead of climbing forever; node-3 depth should drain. Will verify on the next image rollout during the soak.

🤖 Generated with Claude Code

…135)

Mempool eviction only ran at self-production (producer.rs); blocks
imported from other validators never evicted their txs, and
select_with_nonce skips stale-nonce txs without removing them, so every
tx mined by another node became an immortal zombie: oldest-tx-age SLI
climbed monotonically (45+ min observed on testnet), depth stepped up,
and the 15m T2 alert was permanently red.

- SyncManager::evict_imported_txs: on every successful import (gossip,
  fetched, pending-drain, range-sync — all 4 sites) remove the block's
  txs by blake3(to_bytes_without_signature), the pool's own key space.
- Catch-up tick now calls remove_expired() as a backstop (previously
  zero callers — even the 1h tx_lifetime never ran).
- pool.rs test pins the recomputed-hash key-space contract.

Tests: qfc-mempool 7/7, qfc-node unit 18/18, integration 6/6 (release).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@lai3d lai3d merged commit e215c96 into main Jul 5, 2026
4 checks passed
@lai3d lai3d deleted the claude/mempool-evict-on-import branch July 5, 2026 05:46

Development

Successfully merging this pull request may close these issues.

Mempool never evicts transactions mined by other validators — immortal zombie txs, permanent oldest-tx-age alert