Skip to content

Cleanup: PR #317 follow-on perf + configurable inbound read commitment#328

Open
heifner wants to merge 4 commits into
masterfrom
cleanup/opp-pr317-followon
Open

Cleanup: PR #317 follow-on perf + configurable inbound read commitment#328
heifner wants to merge 4 commits into
masterfrom
cleanup/opp-pr317-followon

Conversation

@heifner
Copy link
Copy Markdown
Contributor

@heifner heifner commented May 8, 2026

Follow-on to #317 picking up the small items that didn't make it into that PR. Two themes.

Configurable inbound read commitment (Solana + Ethereum)

Replaces hardcoded commitment_t::confirmed (Solana) and block_tag_latest (Ethereum) on the inbound read paths with config knobs. WIRE consensus on inbound is committed forward against these reads; a destination-chain slot reorg below the chosen commitment leaves WIRE state derived from external history that no longer exists.

  • --solana-inbound-read-commitment -- processed | confirmed (default) | finalized
  • --ethereum-inbound-read-block-tag -- latest (default) | earliest | pending | safe | finalized

Danger of the current defaults

Defaults preserve current behavior so the upgrade is a no-op for existing deployments, but the current behavior is not safe under external-chain reorgs. The defaults are kept for compatibility, not because they are correct.

Ethereum latest is genuinely unsafe. latest is the canonical chain head and reorgs of 1-3 blocks are routine in normal mainnet operation. The batch operator can read a slot at latest, deliver that envelope to WIRE via msgch::deliver, achieve WIRE-side consensus on it, queue attestations off it -- and then watch the source slot disappear from Ethereum's canonical chain seconds later. WIRE is now committed to attestations derived from history that no longer exists. Operators on Ethereum mainnet should set this to safe (~6 min, two beacon epochs) at minimum and ideally finalized (~12.8 min, beacon-chain finality).

Solana confirmed is materially safer but still pre-final. confirmed is supermajority lockout (33%+ stake voted), much stronger than processed (single-vote leader), but the canonical chain can still revert below confirmed during supermajority skip events. Documented to occur during cluster instability. Operators that prefer safety over latency should set this to finalized (~13-31 s additional latency vs confirmed, one supermajority lockout finalization round).

A future PR may flip the defaults to finalized once the latency cost is fully measured against the OPP cycle's wall-clock budgets. Until then operators on production are on the hook to set these explicitly.

sysio.msgch perf + robustness cleanup

  • deliver: switch the env_epoch mismatch check to the sysio::check lambda overload so the formatted error string is built only on failure. The original was running a triple std::to_string concat on every successful delivery.
  • evalcons: drop the messages emplace + matching contains/erase pair. Net storage delta was already zero; the KV ser + undo-log + up-to-64 KiB peak-RAM blip per consensus reach are now gone.
  • buildenv: replace the post-pack hard-abort backstop with a trim loop. Estimator drift previously dead-lettered (PROCESSED reverts on rollback, same set re-packed next epoch, same trip); the loop pops and re-serialises until under cap, then marks survivors PROCESSED. Self-correcting under estimator drift.

Plus a static_assert(std::endian::native == std::endian::little) on outpost_solana_client::read_u32_le to document the implicit host assumption (Borsh u32 is little-endian on the wire; native-endian memcpy is correct only on a little-endian host).

heifner added 4 commits May 8, 2026 10:33
The Solana outpost client read inbound envelopes at hardcoded
commitment_t::confirmed; the Ethereum outpost client at hardcoded
block_tag_latest. WIRE consensus on inbound is committed forward
against these reads, so a slot reorg below the chosen commitment
leaves WIRE state derived from external history that no longer
exists.

Expose:

* --solana-inbound-read-commitment {processed,confirmed,finalized}
  default: confirmed
* --ethereum-inbound-read-block-tag
    {latest,earliest,pending,safe,finalized}
  default: latest

Plugin parses the option in plugin_initialize, validates it
(magic_enum for Solana, allow-list for Ethereum), stores it on
impl, and passes it through outpost_*_client construction.

Also: outpost_solana_client::read_u32_le now has
static_assert(std::endian::native == std::endian::little) so a
future port to a big-endian host fails at compile time instead of
silently producing wrong epoch values.
The env_epoch mismatch check was building the error string
unconditionally on every successful delivery via triple
std::to_string concatenation. Switch to the lambda overload of
sysio::check so the formatted message is constructed only when the
check fails.
The messages_t row was being inserted with raw_payload up to
MAX_ENVELOPE_BYTES (64 KiB) and erased ~60 lines later in the same
action. Net storage delta is zero; the work it cost (KV serialisation,
undo-log entries, peak-RAM blip on the contract's payer) is now gone.

The messages_t / message_entry schema is left in place. If a
downstream consumer of inbound message rows ever materializes, the
emplace can be reintroduced and the matching cleanup left out.

Updates the audit-cleanup comment block to reflect the narrower
inline-drain set (envelopes only).
The previous flow marked candidate attestations PROCESSED before
serialising the envelope and used a hard check(packed.size() <=
MAX_ENVELOPE_BYTES) as a backstop against estimator drift. If that
check ever tripped, the action aborted and all state including the
PROCESSED markings rolled back; the next epoch's buildenv would
re-pack the same candidate set against the same estimator and trip
the same check, dead-letter style.

Replace the abort with a trim loop: serialise, and if the result
overshoots the cap, pop the last entry and re-serialise until under
the cap. PROCESSED marking moves after the loop converges, so a
popped entry never needs a status revert; popped entries stay READY
and ride the next epoch's buildenv. The size invariant is now
self-correcting under estimator drift.

The first-attestation-too-big guard stays in place upfront (estimator
short-circuit) and is also enforced inside the loop (catches the
estimator-under-counts case where even the first entry alone
overshoots the actual serialised cap).

Regenerates contracts/sysio.msgch/sysio.msgch.{wasm,abi} from the
combined sysio.msgch.cpp changes across this and the prior two
commits.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant