trace_api: history-solution upgrade - auto ABIs, query APIs, indexes #295
Open
heifner wants to merge 33 commits into
Add first_recorded_block() and last_recorded_block() to slice_directory and store_provider by scanning the lowest/highest index slice files. On the first block_start signal after startup, check_continuity() in chain_extraction_impl_type validates the relationship between existing trace data and the current chain head:
- No prior data: fresh start; log and proceed.
- Chain head within [first_recorded, last_recorded+1]: overlap or exact continuation; re-applied blocks overwrite existing slice entries, which allows recovery from disk corruption by replaying from a snapshot.
- Chain head < first_recorded: the snapshot predates the start of trace history; error with operator guidance to delete the trace directory.
- Chain head > last_recorded+1: forward gap; error with guidance.
12 new unit tests in test_continuity.cpp cover all cases.
Replaces the O(n) linear scan in get_trx_block_number() with a compact open-addressing hash table (load factor <= 0.5, power-of-2 bucket count, linear probing) written as a trace_trx_idx_<range>.log sidecar per slice. The maintenance thread builds the index atomically (write .tmp, then rename) after each slice becomes irreversible. Lookups take a per-slice fast path through the reader; the linear-scan fallback remains for slices not yet indexed.
Also fixes _max_filename_size to account for the 14-char "trace_trx_idx_" prefix (it was sized for the 12-char "trace_index_" prefix).
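The lookup side of that scheme can be sketched in a few lines. This is a minimal model with hypothetical bucket layout and names, not the plugin's actual reader:

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical bucket: a 64-bit trx-id prefix plus the block number.
// Zero-initialized buckets serve as the empty-slot sentinel.
struct trx_bucket {
   uint64_t trx_prefix = 0;
   uint64_t block_num  = 0;
};

// Linear-probe lookup over a power-of-2 bucket array: the `& mask` step is the
// reason bucket_count must be a power of two. The probe is bounded by
// bucket_count so even a fully populated table terminates.
inline std::optional<uint64_t>
lookup_block(const std::vector<trx_bucket>& buckets, uint64_t trx_prefix) {
   const uint64_t mask = buckets.size() - 1;
   uint64_t slot = trx_prefix & mask;
   for (std::size_t probes = 0; probes < buckets.size(); ++probes) {
      const trx_bucket& b = buckets[slot];
      if (b.trx_prefix == trx_prefix) return b.block_num;
      if (b.trx_prefix == 0) return std::nullopt;  // empty slot ends the probe
      slot = (slot + 1) & mask;
   }
   return std::nullopt;  // no empty slot anywhere: give up rather than hang
}
```

Keeping the load factor at or below 0.5 keeps expected probe chains short; slices without a built sidecar simply take the linear-scan fallback.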
…i/--trace-no-abis
Replace the file-based --trace-rpc-abi option with automatic ABI capture:
- abi_store: sorted on-disk index mapping (account, global_seq) -> ABI bytes, enabling O(log n) point-in-time lookups per action. Written atomically via temp-file rename; loaded at startup for continuity across restarts.
- chain_extraction: lazily fetches each new account's ABI via find_account_metadata on first observation, and captures setabi transactions in real time to track ABI version changes at the exact global_sequence where they took effect.
- abi_data_handler: replaced the static add_abi() map with an abi_lookup_fn callback that performs versioned (account, global_seq) lookups from the ABI store. Decoding failures are now soft (logged at debug, falling back to raw hex) since ABIs are auto-captured rather than operator-provided.
- Removed the --trace-rpc-abi and --trace-no-abis options from nodeop, all integration tests, TestHarness, config templates, tools, tutorials, and the examples/abis/ directory of hand-maintained ABI files.
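The point-in-time lookup can be modeled with a std::map standing in for the sorted on-disk index. Names here are hypothetical; only the (account, global_seq) ordering and the upper_bound step-back come from the design above:

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical in-memory stand-in for the sorted (account, global_seq) index.
// An ABI recorded at (a, s) governs all actions on account a with
// global_sequence >= s, until a later setabi entry supersedes it.
using abi_key   = std::pair<uint64_t, uint64_t>;   // (account, global_seq)
using abi_index = std::map<abi_key, std::string>;  // value: ABI bytes

// Point-in-time lookup: the last entry for `account` whose global_seq is
// <= the action's global_seq ("upper_bound step-back").
inline std::optional<std::string>
abi_at(const abi_index& idx, uint64_t account, uint64_t global_seq) {
   auto it = idx.upper_bound({account, global_seq});
   if (it == idx.begin()) return std::nullopt;
   --it;
   if (it->first.first != account) return std::nullopt;  // stepped into another account
   return it->second;
}
```

The same step-back is what makes recording a post-setabi ABI at global_seq=0 dangerous, as a later commit in this PR explains.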
… endpoints
get_actions: paginated action search over a block range with optional filters on receiver, account, and action name. Pagination uses an after_global_seq cursor and returns more + last_global_seq for the next page. ABI-decoded params are included when available.
get_token_transfers: convenience preset of get_actions with receiver = account = token_contract (default sysio.token) and action = transfer. Using receiver = token_contract yields exactly one result per transfer -- the canonical execution, not the inline notification copies.
Also fixes latent UB: fc::to_hex is now guarded against empty data vectors (nullptr .data()).
Adds 12 unit tests covering filters, pagination, multi-block scans, ABI decoding, and the token-transfer deduplication behaviour.
…test assertions
- Add --trace-max-query-limit (default 1000, -1 = unlimited) so operators running private nodes can remove per-request caps on get_actions / get_token_transfers queries.
- Add comprehensive plugin documentation covering all endpoints, configuration options, on-disk layout, ABI decoding, pagination, and exchange/indexer integration guidance.
- Extend nodeop_run_test.py with get_actions and get_token_transfers assertions (params decoded, trx_id matching, receiver filtering).
- Fix Cluster.py: add a trailing space after the producer_api_plugin arg to prevent concatenation with subsequent extra args when the spacer was removed.
Prevent integer overflow in the ABI blob offset accumulator when total ABI data exceeds 4 GB. blob_offset and blob_size in abi_store_index_entry are promoted from uint32_t to uint64_t (struct grows from 24 to 32 bytes); the local running_offset and blob_offsets vector in abi_store_writer::write() follow suit, removing the truncating static_cast<uint32_t> casts.
Implements spring#1438 by capturing and exposing the full set of action-receipt and execution-tree fields on every action trace:
- action_trace_v0 now carries action_ordinal, creator_action_ordinal, closest_unnotified_ancestor_action_ordinal, recv_sequence, auth_sequence (flat_map<name, uint64_t>), code_sequence, abi_sequence, account_ram_deltas, and optional cpu_usage_us / net_usage (populated for top-level input actions, where producers set deterministic budgets).
- authorization_trace_v0: rename account -> actor to match on-chain naming.
- account_delta_v0: new struct for account_ram_deltas.
- JSON output uses "name" (was "action") for the action name and "actor" (was "account") inside authorization entries.
Remove failed-action support:
- chain_extraction filters context-free AND failed action traces (at.except set) from stored block traces.
- transaction_trace_v0 no longer carries a status field; all persisted transactions are executed.
- get_transaction_trace / get_actions / get_token_transfers no longer accept or return a "failed" indicator.
Slim response preserved: get_token_transfers returns only transfer-relevant fields (omitting ordinals, receipt sequences, ram_deltas, and cpu/net usage). Callers needing those can call get_actions directly.
Tests and plugin docs updated accordingly.
Addresses AntelopeIO/leap#1219: /v1/trace_api/get_block response time varies by block height, with a worst case at the end of each trace-slice-stride window (~200 ms for the last block of a 10k-block slice).
Root cause: store_provider::get_block scanned trace_index_<range>.log from offset 0, unpacking up to stride-many metadata_log_entry records before finding the target block's offset.
Fix: add trace_blk_idx_<range>.log, a flat sparse array of stride-many uint64_t slots indexed by (block_num - slice_base). Each slot holds offset+1 into trace_<range>.log (0 is reserved as empty). The sidecar is written synchronously alongside the existing metadata log, pre-allocated on creation, and updated in place on fork re-writes.
Read path (get_block): one seek + one read on the sidecar. Falls back to the existing scan when the sidecar is missing or the slot is empty, preserving correctness for unusual states. Uses the in-memory best_known_lib for O(1) irreversibility checks, replacing the lib-entry scan.
Write path (append): adds one 8-byte pwrite per block alongside the existing trace + metadata appends. The cost is trivial; the file is sparse on Linux (~80 KB per slice at the default stride).
Cleanup: include trace_blk_idx_* in slice rotation; doc updates.
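The offset+1 slot encoding can be sketched with an in-memory array standing in for the pre-allocated file (names hypothetical). The +1 shift is what lets a zero-filled sparse file read as "no entry":

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of the trace_blk_idx_<range>.log sidecar: stride-many
// uint64_t slots. A slot holds offset+1 into the trace log; 0 is reserved as
// "empty", so untouched (sparse) regions naturally mean "fall back to scan".
struct blk_offset_index {
   uint32_t slice_base;              // first block number of the slice
   std::vector<uint64_t> slots;      // stride entries, zero-initialized

   void record(uint32_t block_num, uint64_t offset) {
      slots[block_num - slice_base] = offset + 1;
   }
   std::optional<uint64_t> offset_of(uint32_t block_num) const {
      const uint64_t v = slots[block_num - slice_base];
      if (v == 0) return std::nullopt;   // empty slot: caller scans instead
      return v - 1;
   }
};
```

Note that offset 0 (the first record in the trace log) stays representable because it is stored as 1.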
…ge cap
Drops `limit`, `more`, `last_global_seq`, and `after_global_seq` from action_query and actions_result. Adds `--trace-max-block-range` (default 100), which silently clamps `block_num_end` to `block_num_start + max - 1` in the HTTP handler. Clients paginate by advancing `block_num_start` by `max_block_range` between calls. Removes the unbounded scan that an unauthenticated empty-body request to `/v1/trace_api/get_token_transfers` could previously trigger against blocks 0..UINT32_MAX.
The sort by `global_sequence` inside `get_actions_impl` is retained: chain `action_traces` arrive in schedule order, which is NOT execution order when an action queues both `require_recipient` notifications and inline actions. Iterating without the sort would emit a notification handler's inline action ahead of later notifications. Behavior matches `chain_plugin`'s `push_transaction` tree shape.
Doc tag-along: also updates the `abi_store.log` layout block to reflect the previously widened blob_offset/blob_size (uint64) -- a pre-existing stale doc, fixed while updating other doc sections.
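The clamp itself is small; this hypothetical helper shows the behavior clients page around (the real check lives in the HTTP handler):

```cpp
#include <algorithm>
#include <cstdint>

// Silently clamp block_num_end so one request scans at most max_block_range
// blocks; clients advance block_num_start by max_block_range between calls.
// The arithmetic is done in 64 bits to avoid uint32_t wrap near the top.
inline uint32_t clamp_block_end(uint32_t block_num_start, uint32_t block_num_end,
                                uint32_t max_block_range) {
   const uint64_t cap = uint64_t(block_num_start) + max_block_range - 1;
   return uint32_t(std::min<uint64_t>(block_num_end, cap));
}
```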
Bulk get_actions queries over a single contract were rebuilding the ABI serializer per action: a 100-action response triggered 100 abi_def unpacks and 100 abi_serializer constructions. abi_serializer construction walks all types in the ABI and is non-trivial.
Adds a 128-entry LRU keyed by (account.value, global_seq) and protected by a mutex. Cache misses do the expensive abi_def unpack and abi_serializer construction OUTSIDE the lock so concurrent cache users do not block on a slow miss. On a race where two threads miss for the same key, the second observer of the inserted entry wins and the duplicate construction is discarded.
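The miss-outside-the-lock pattern can be sketched generically. This is a hypothetical helper with LRU eviction omitted; only the locking discipline and the discard-on-race behavior mirror the description above:

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <utility>

// The mutex is held only for the find and the insert, never during the
// expensive construction. When two threads race on the same missing key,
// the loser's freshly built value is simply discarded.
template <typename Key, typename Value, typename Factory>
std::shared_ptr<Value>
get_or_build(std::map<Key, std::shared_ptr<Value>>& cache,
             std::mutex& m, const Key& k, Factory&& build) {
   {
      std::lock_guard<std::mutex> g(m);
      auto it = cache.find(k);
      if (it != cache.end()) return it->second;   // fast path: cache hit
   }
   auto built = std::make_shared<Value>(build()); // expensive; lock released
   std::lock_guard<std::mutex> g(m);
   // emplace keeps the first-inserted entry; a racing duplicate is dropped.
   return cache.emplace(k, std::move(built)).first->second;
}
```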
The reader was re-opening the file on every lookup while trusting the cached _blob_area_offset. Between the writer's atomic rename and the swap of the shared_ptr<abi_store_reader>, an in-flight HTTP-thread reader holding the OLD reader (old index, old blob_area_offset) could resolve the path to the NEW inode and seek to an offset computed against the old layout inside the new file, returning garbage bytes.
Holding the file via boost::iostreams::mapped_file_source pins the inode through the rename, removes the per-lookup syscalls (open + seek + read + close), and lets us bounds-check the blob slice against the mapping. The on-disk layout matches the in-memory struct on x86_64 (fc::raw::pack is little-endian, structs are unpadded), so we memcpy the header and index out of the mapping at construction.
Widens abi_store_header::entry_count to uint64_t and drops _reserved; the header stays 16 bytes via a 4+4+8 layout. Format version intentionally not bumped: the chain has not launched.
Rewriting the entire abi_store.log on every captured ABI was O(N^2) during
fresh-start replay: each first-encountered contract triggered a full sort
and rewrite of the growing file (plus a reload of the mmap-backed reader).
With hundreds to thousands of distinct contracts touched in early blocks,
the extraction thread was spending seconds on file I/O per block.
Replaces it with an append-only log:
Header (16 bytes): magic "ABIL" (u32), version (u32), reserved (u64)
Records: account(u64) + global_seq(u64) + blob_size(u64)
+ blob_bytes + crc32
Appends stream to the tail under a mutex. The lookup index lives in memory
as std::map<(account, global_seq), {offset, size}>, built by scanning the
file at startup and validating each record's CRC32. Lookups take the index
mutex only long enough to copy (offset, size), then pread the blob from
the held cfile's fileno -- concurrent lookups never serialize on I/O.
Writes are not fsync'd. Tail corruption from a kernel crash is handled by
the startup recovery scan: any CRC failure truncates the file at the first
bad record. Lost entries are recaptured the next time their contract is
touched (setabi observation or lazy current-ABI fetch), so no replay is
required.
Separate mutexes for appending vs index so lookups do not block on a
concurrent append's file write. CRC uses boost::crc_32_type, matching
the convention in fc::datastream_crc and finalizer.hpp.
Format version intentionally not bumped beyond 1: chain has not launched,
so any existing abi_store.log files from earlier builds of this branch
can be deleted before upgrading. No auto-migration code.
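The startup recovery scan can be modeled over an in-memory byte buffer. This sketch mirrors the record layout above, but a toy checksum stands in for boost::crc_32_type and all names are illustrative:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for boost::crc_32_type, purely for the sketch.
inline uint32_t toy_checksum(const uint8_t* p, std::size_t n) {
   uint32_t c = 0;
   for (std::size_t i = 0; i < n; ++i) c = c * 31 + p[i];
   return c;
}

// Append one record: account(u64) + global_seq(u64) + blob_size(u64) + blob + crc(u32).
inline void append_record(std::vector<uint8_t>& file, uint64_t account,
                          uint64_t global_seq, const std::vector<uint8_t>& blob) {
   const std::size_t start = file.size();
   auto put = [&](const void* p, std::size_t n) {
      const uint8_t* b = static_cast<const uint8_t*>(p);
      file.insert(file.end(), b, b + n);
   };
   const uint64_t blob_size = blob.size();
   put(&account, 8); put(&global_seq, 8); put(&blob_size, 8);
   put(blob.data(), blob.size());
   const uint32_t c = toy_checksum(file.data() + start, file.size() - start);
   put(&c, 4);
}

// Startup scan: returns the length of the valid prefix. The caller truncates
// the file there; lost tail entries are recaptured on next observation.
inline std::size_t scan_valid_prefix(const std::vector<uint8_t>& file) {
   std::size_t off = 0;
   while (off + 24 <= file.size()) {
      uint64_t blob_size;
      std::memcpy(&blob_size, file.data() + off + 16, 8);
      const std::size_t avail = file.size() - off - 24;
      if (blob_size > avail || avail - blob_size < 4) break;  // truncated tail
      uint32_t stored;
      std::memcpy(&stored, file.data() + off + 24 + blob_size, 4);
      if (stored != toy_checksum(file.data() + off, 24 + blob_size))
         break;                        // first bad checksum: cut here
      off += 24 + blob_size + 4;       // record verified, keep going
   }
   return off;
}
```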
on_applied_transaction runs AFTER all actions in a transaction have been applied, so lazily fetching the current chain-DB ABI on first encounter of an account produces the POST-setabi ABI when that account's ABI is being replaced in the same trx. Recording it as account@global_seq=0 would then be served by the upper_bound step-back for any pre-setabi action on that account, decoding those actions with the wrong schema.
Fix: scan action_traces once for setabi targets, then on the second pass skip the lazy fetch for any account whose ABI is being replaced in this trx. For the narrow edge case (a never-before-observed contract with a setabi in its first-observed trx AND pre-setabi actions on it), those pre-setabi actions now return raw hex -- strictly safer than wrong data, and the correct behavior once any earlier trx has recorded an ABI.
Also documents the caveat in trace_api_plugin.md and adds three extraction tests (skip-when-target, prior-ABI-survives, sibling-lazy-fetch-still-fires).
…rst-encounter
The lazy-fetch path tracked an std::unordered_set<uint64_t> of every account ever observed, growing without bound for the lifetime of the node. Naive LRU eviction is unsafe: re-fetching after eviction would overwrite an existing X@0 record with the post-any-interim-setabi ABI and poison pre-setabi lookups.
Replace it with abi_log::has_entry(account): if the log already records any (account, *) entry, the lazy fetch is skipped. Memory is now bounded by the number of contracts that actually have an ABI captured (a few to a few thousand on realistic chains), not by everything ever seen. For the rare case of a contract account whose ABI is empty in chainbase, has_entry stays false and we re-fetch on every action -- one chainbase find_account_metadata call per action, costing microseconds.
The same change converts uint64_t-typed name fields/keys throughout the plugin to chain::name (record_header.account, index_key, cache_key, setabi_targets_this_trx). chain::name is layout-compatible with uint64_t, so the on-disk format is unchanged; the in-memory code is type-safe and drops .to_uint64_t() conversions.
The trx_id index reader trusted bucket_count from the file header, so a
corrupt or hostile index could:
- allocate ~32 GB at startup (header.bucket_count = UINT32_MAX),
- corrupt lookups (header.bucket_count not a power of two breaks the
`& mask` math),
- hang lookups forever (a fully-populated bucket array makes the
probe-for-empty-slot loop never terminate).
Validate at construction:
- bucket_count must be a power of two (std::has_single_bit),
- bucket_count must be <= 2^28 (~268M buckets, ~4 GB; covers any
realistic slice configuration),
- file size must equal sizeof(header) + bucket_count * sizeof(bucket).
On any failure, mark the reader invalid; lookups return nullopt and
get_trx_block_number falls back to the linear scan over the trx_id log.
Bound the probe loop in lookup() at bucket_count iterations so even a
hand-crafted file with no empty buckets terminates without hanging.
While here, also bound --trace-slice-stride to [1, 1000000]. The
default is 10000 and the practical realistic range tops out around 100K.
The cap prevents an absurd configuration from making the per-slice
block-offset sidecar pre-allocate gigabytes and from pushing trx_id
bucket_count past the 2^28 cap added above.
Adds 4 reader tests (non-power-of-two, above-cap, file-size mismatch,
fully-populated table doesn't hang).
…rite
The trx_id index path had two related correctness gaps that diverged from the linear-scan get_trx_block_number behavior:
1. trx_id_index_writer claimed "last write wins per prefix" but actually wrote the second entry to the next empty bucket, leaving both. The reader returned whichever entry the linear probe reached first -- usually the FIRST inserted one, the opposite of the documented intent. Fix the writer to also stop on a prefix match and overwrite the existing slot.
2. build_trx_id_index iterated every block_trxs_entry in the trx_id log and added each (trx_id, block_num) pair, ignoring fork resolution. When a trx briefly appeared in a block that was later forked out (and possibly replaced by a different trx set at the same block_num, or the trx moved to a later block), the index would point at the forked-out block_num instead of the canonical one. Fix by first collapsing the log into a canonical map<block_num, ids> -- the last block_trxs_entry per block_num wins, matching the linear scan's trx_block_nums.erase logic -- then feeding that to the writer.
Updates the existing duplicate_prefix64_last_write_wins test to actually assert that the lookup returns the latest value (it was previously satisfied by "doesn't crash"), and adds an integration test exercising three fork patterns: trx moved to a later block, trx removed entirely, and trx canonically at its original block.
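The collapse step is small enough to model directly. Types here are pared-down stand-ins; the invariant shown is the one described above: the last block_trxs_entry per block_num wins:

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model of the trx_id log: entries in append order, where a
// fork's replayed block appends a fresh entry for an already-seen block_num.
struct block_trxs_entry {
   uint32_t block_num;
   std::vector<uint64_t> trx_ids;   // trx-id prefixes, for the sketch
};

// Collapse to the canonical per-block trx sets: a later entry for the same
// block_num overwrites the earlier (forked-out) one, matching the linear
// scan's erase-and-replace behavior.
inline std::map<uint32_t, std::vector<uint64_t>>
canonicalize(const std::vector<block_trxs_entry>& log) {
   std::map<uint32_t, std::vector<uint64_t>> canonical;
   for (const auto& e : log)
      canonical[e.block_num] = e.trx_ids;
   return canonical;
}
```

Feeding the writer from this map (instead of the raw log) is what keeps the index and the linear scan in agreement across forks.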
The "Startup continuity check" section claimed gaps "do not prevent the node from running" -- but check_continuity in chain_extraction.hpp calls except_handler on a gap, which is wired to app().quit(), so the node actually shuts down. Update the doc to describe the real behavior: shut down on a gap, with operator-facing recovery steps (delete the trace dir, load a snapshot covering the gap, or copy slices from another node). Also drops the stale "snapshot restore detected -> warning" row: the code logs nothing on overlap, it just resumes silently. No code change.
…tpicks
Three small fixes bundled:
1. Promote the named "trace_api" logger to a shared header (include/sysio/trace_api/logging.hpp) so every translation unit in the plugin can route log output through it. Previously only trace_api_plugin.cpp used the named logger; abi_log, trx_id_index, store_provider, and chain_extraction logged via the default logger, so operator filtering on the "trace_api" entry in logging.json only affected a fraction of plugin output. All wlog/dlog/ilog/elog call sites in the plugin now go through fc_*log(_log, ...). Also adds diagnostic logs to the previously silent setabi catch blocks in chain_extraction so a malformed setabi action no longer disappears without a trace.
2. slice_number_from_path returns std::optional<uint32_t> on parse failure rather than letting std::stoul throw out of a public method. The caller in get_trx_block_number now falls back to its existing linear scan when the filename can't be parsed. A named-local return type is used for NRVO.
3. etc/config/nodeop/aio/config.template.ini: replace the orphaned Trace API Plugin comment block with a short note that ABIs are now captured automatically (the prior trace-no-abis line was removed earlier in this PR, but the surrounding context left it unclear what operator action was needed, if any -- answer: none).
…naming
Eight small cleanups bundled:
* trx_id_index reader/writer use bulk I/O (single read/write of the
whole bucket array) instead of per-bucket fc::raw pack/unpack +
datastream creation. Layout-equivalent on x86_64 LE; static_assert
on bucket size guards the assumption.
* test_trace_file fixture: replace IIFE-style imperative builders
with C++20 designated-init aggregates (now that action_trace_v0 has
17 fields, positional aggregate init was unwieldy and the IIFE was
worse).
* request_handler: delete duplicated process_authorizations from .cpp;
promote serialize_authorizations to an inline free function in the
trace_api namespace; use it from both get_block (response_formatter)
and the get_actions / get_token_transfers handlers.
* request_handler: drop redundant `data.empty() ? "" : fc::to_hex(...)`
conditionals (4 sites) -- fc::to_hex(ptr, 0) returns "" without
dereferencing the pointer.
* store_provider: compute _max_filename_size at compile time across
every prefix and extension using std::max; adding a longer prefix
later auto-grows the buffer instead of silently overflowing.
* All headers converted from `namespace sysio { namespace trace_api {`
to `namespace sysio::trace_api {` (C++17 nested form, matching the
rest of the plugin).
* Reserved padding fields renamed `_reserved` -> `reserved` (the
leading-underscore convention is for private members; these are
public on-disk struct fields).
* Comment on the trx_id writer's bucket vector now explains that the
zero-init is load-bearing for the empty-slot sentinel that
terminates the probe loop.
…fication opt-in
Applies 30 fixes from the pre-PR review of feature/trace-api-history:
Correctness
- trace-max-block-range clamped to [1, 10000]; -1 rejected (was unbounded)
- first_and_last_recorded_blocks() replaces two separate optionals so callers see a consistent view; NRVO in the implementation
- abi_data_handler cache keyed by the effective ABI global_seq (via the new abi_log::lookup_result) so bulk queries hit the cache
- trx_id index hits confirmed against the block's block_trxs_entry before returning; collisions fall through to the linear scan
- get_transaction_trace scans the raw transaction_trace_v0[] and builds the variant for the matching trx only; no more JSON-string round-trip
API
- get_actions: include_notifications flag (default false); when off and exactly one of account/receiver is set, the other is mirrored
- Response envelope includes the actual block_num_start/end scanned
- decode_error field surfaces ABI decode failures; params keep their decoded value when only the return_value decode fails
- On-disk magics reversed so a hex dump reads BLIX/TRIX/ABIL/WIRE
Hygiene
- File-scope constexpr _n literals for setabi / sysio.token / transfer
- std::vector -> std::deque for trx_id_index_writer entries (growth cost)
- abi_log: best-effort remove on open failure; blob_offset renamed to blob_file_offset; pread/stdio-flush note; append-side-is-single-threaded note; last-write-wins note at _index[...] sites
- trace-max-block-range default raised from 100 to 1000
- Continuity-check error text names specific recovery remedies
- Lazy ABI fetch exceptions logged at debug with account + message
- Removed the empty trace_api_rpc_plugin_impl::set_program_options and its call sites
- Log prefix "trace_api:" on plugin-init log lines
- Miscellaneous struct alignment, comment clarifications, sort-stability note, yield_exception catch-order note
Tests
- test_continuity: new single-invocation-guarantee case with a non-throwing except_handler
- Mocks updated for first_and_last_recorded_blocks / decode / lookup_result
- test_trx_id_index uses id.data_size() - 1 instead of a hardcoded 31
Docs
- trace_api_plugin.md rewritten: user-facing sections first, on-disk layout moved to an "Implementation details" section at the end
- etc/config/nodeop/aio/config.template.ini: commented trace-max-block-range line added
The shared helper build_action_variant(action, decoded, variant_shape) is moved from three near-duplicate inline builders to request_handler.cpp so get_actions, get_token_transfers, and process_block all agree.
Verified: trace_api_plugin builds clean; test_trace_api_plugin passes all 97 test cases; nodeop links cleanly.
- chain_extraction: fmt::format for continuity error messages; drop the _ prefix on the abi_fetcher / startup_checked members for consistency with the rest of the file
- abi_data_handler: clarify the serialize_to_variant comment -- it is the active path for get_block / get_transaction_trace, not legacy
- trace, request_handler, store_provider: whitespace / comment nits
- tests/sysio_util_snapshot_info_test: flush the NamedTemporaryFile before sys-util reads it (the v2 format reads the footer, so the file must be fully on disk); update head_block_id after regen
- Regenerate unittests/snapshots/* and consensus_blockchain/snapshot with the corrected WIRE magic number
Three tests read transaction_trace responses from the trace_api and accessed the action name via transaction["actions"][0]["action"]. That field was renamed to "name" in this PR; update the call sites.
…t of hot loop
Filter actions before the global_sequence sort so transactions whose actions are all rejected by the receiver/account/action filter skip the sort entirely -- the common case when a request scans thousands of blocks. std::sort is replaced with std::ranges::sort and a member-pointer projection on action_trace_v0::global_sequence.
Also in get_actions_impl:
- Reuse the matches vector across trxs/blocks; clear() keeps capacity, so repeat scans avoid per-trx allocations.
- Hoist trx.id.str() and the other trx-level fields (block_num, block_time, producer_block_id) out of the match-emit loop; a multi-match trx no longer repeats the checksum-to-hex conversion.
- Materialize has_receiver/has_account/has_action + the unwrapped chain::name values once per call; the inner predicate compares names directly instead of dereferencing std::optional each time.
All 11 get_actions_tests cases and the full 97-case trace_api suite pass.
Adds trace_recv_bloom_<range>.log per slice containing two boost::bloom filters (K=7, FPR=0.01): one over action_trace_v0::receiver and one over packed (receiver, action) pairs. get_actions consults the bloom once per slice and advances block_num past the slice on a negative probe, turning the "receiver never appears in this slice" case from a 10,000-block scan into an O(1) file read.
Pieces:
- bloom_sidecar.hpp: header-only bloom_builder + bloom_reader over boost::bloom::filter<uint64_t, 7>. The on-disk header uses the same uint32 magic convention as blk_offset_index_header (0x42524957 -> "WIRB" on little-endian) with in-class initializers and a reserved pad word for natural alignment. The body is the recv bits + recv_action bits + a trailing CRC32. The reader rejects bad magic, version mismatches, truncation, and CRC mismatches; on rejection, may_contain_* returns true so the caller falls back to a scan (fail-safe: a false negative would silently drop matching actions).
- store_provider::append() feeds every block's actions into a per-slice builder and flushes to disk at slice roll-over via temp + rename. A crash between roll-overs leaves the in-progress slice without a bloom; the scan fall-back keeps query correctness at the cost of one scan-only slice until retention ages it out.
- request_handler::get_actions_impl opens the bloom lazily per slice, probes the receiver filter (when set) and the (receiver, action) composite (when both are set), and skips the remainder of the slice on a negative probe. Unfiltered queries don't touch the bloom.
- slice_directory::run_maintenance_tasks removes trace_recv_bloom_<range>.log alongside the other per-slice files during retention pruning so the sidecar doesn't leak as data is aged out.
Uses boost::bloom (Boost 1.89, header-only via vcpkg) and boost::unordered_flat_set, both already in the installed header set.
Test coverage (10 new cases; all 107 trace_api_plugin cases pass):
- bloom_sidecar_tests: round-trip hits/misses, an empty slice producing a valid always-miss file, add_block walking every action, and missing file / bad magic / CRC corruption / truncation / version mismatch all failing safe to an invalid reader.
- get_actions_tests: a valid bloom with the queried receiver absent causes the entire slice to be skipped without any get_block call; a no-filter query does not consult the bloom.
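The skip decision and its fail-safe can be sketched as follows. Types are hypothetical (the real probe is a boost::bloom filter loaded from the sidecar); only the "invalid reader must answer true" rule and the slice-skip arithmetic come from the design above:

```cpp
#include <cstdint>
#include <functional>

// Stand-in for the loaded sidecar: an invalid/missing sidecar MUST answer
// "maybe present" so the caller falls back to scanning. A false negative
// here would silently drop matching actions from the response.
struct bloom_probe {
   bool valid = false;
   std::function<bool(uint64_t)> contains;  // real impl: boost::bloom filter

   bool may_contain(uint64_t receiver) const {
      if (!valid) return true;   // fail-safe: never skip on a broken sidecar
      return contains(receiver);
   }
};

// On a definite miss, jump past the whole slice; otherwise scan this block.
inline uint32_t next_block(uint32_t block_num, uint32_t stride,
                           const bloom_probe& bloom, uint64_t receiver,
                           bool has_receiver) {
   if (has_receiver && !bloom.may_contain(receiver)) {
      const uint32_t slice_base = block_num - (block_num % stride);
      return slice_base + stride;        // skip the remainder of the slice
   }
   return block_num + 1;                 // scan block-by-block
}
```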
…fensive cap
Batch of mechanical + tightening changes from pre-PR review (items 2, 3, 4, 7, 8, 9, 10, 14):
- get_actions_impl: tighten skip_eligible to `has_receiver` since both bloom probes require a receiver. Previously `has_account && has_action` could open the sidecar for no probe benefit on `include_notifications = true` queries.
- get_actions_impl: rename local filter values receiver_v/account_v/action_v to *_name for readability (matches chain::name type; no "_v suffix" convention elsewhere in the codebase).
- bloom_sidecar: rename struct field `k_hashes` to `k_hash_count` so it no longer shadows the namespace constant bloom::k_hashes; removes the disambiguating-qualification wart.
- bloom_sidecar: add `max_capacity_bits = 128 MiB` defensive bound in bloom_reader::load. A corrupted or maliciously-crafted sidecar with an absurd capacity could previously trigger a huge std::vector allocation; this caps allocations at a realistic maximum (~500x a busy-mainnet slice bloom).
- bloom_sidecar: port bloom_builder::finalize_and_write and bloom_reader::load from std::ifstream/std::ofstream to fc::cfile for consistency with the rest of the plugin's sidecars. Reader catches std::exception broadly so any fc::cfile failure (open, read) still falls back to the scan path via _valid == false.
- bloom_sidecar: add `noexcept` to may_contain_receiver and may_contain_recv_action. Both are const, pure-compute, and the invalid-path is a plain return; annotating lets the compiler elide exception-handling metadata in the get_actions inner loop.
- bloom_sidecar: comment cleanup (`=>` -> `->`).
New test:
- bloom_sidecar_tests/filter_capacity_roundtrip_invariant pins the boost::bloom guarantee `filter{f.capacity()}.capacity() == f.capacity()` (documented in boost/bloom/detail/core.hpp:480) across item counts from 1 to 1000. A future boost upgrade that quietly breaks this would fail the test rather than silently disabling the skip path in production.
All 108 trace_api_plugin tests pass.
Moves the per-slice bloom sidecar build out of the synchronous append path and into slice_directory::run_maintenance_tasks, alongside build_trx_id_index. The write is triggered by LIB crossing the slice rather than by a slice roll-over in append().
Why: under the earlier design, a fork that crossed a slice boundary could overwrite an already-flushed bloom with an incomplete one built only from the replayed blocks. Walk-through: slice K gets flushed when the first block of slice K+1 appends; a fork rolls back into slice K; the next append detects the backward roll-over, flushes the partial slice K+1 builder, then flushes a near-empty slice K on the subsequent forward roll-over, overwriting the correct bloom. Queries for a receiver in slice K's pre-fork blocks would then get a bloom miss and silently drop the action from the response, defeating the fail-safe.
LIB crossing is the natural guard: a fork cannot reach back across LIB, so the slice's data log is final by the time the bloom is built. Reading the data log in order still picks up any stale records left by earlier forks within the slice, but that's safe: bloom filters allow false positives (a forked-out receiver probes as present, the query scan visits the slice and finds no canonical match). False negatives are the only fatal mode and are eliminated by construction.
Changes:
- slice_directory::build_recv_bloom(slice_number, log): new method mirroring build_trx_id_index. Opens the uncompressed trace log, streams through each block_trace_v0 record, inserts every action into a bloom_builder, and finalize_and_writes the sidecar. Skips if already built, if the slice has no uncompressed data (already compressed, or never written), if the trace file is empty (0 bytes), or if no record could be parsed (corrupted input). All I/O errors are captured via FC_LOG_AND_DROP; a failed build leaves no file and the query path falls back to scanning that slice.
- slice_directory::_last_bloomed_slice: new tracker analogous to _last_indexed_slice.
- slice_directory::run_maintenance_tasks: add a second process_irreversible_slice_range pass for bloom building, scheduled after the trx_id-index build and before retention pruning / compression so the uncompressed .log is still available.
- store_provider::append: remove the in-append rollover flush. Bloom building is no longer coupled to slice rollover.
- store_provider: remove the _current_bloom_slice and _current_bloom_builder members; bloom state now lives only on disk.
Tests (3 new cases in slice_tests):
- slice_dir_recv_bloom_build_on_lib: asserts the sidecar is absent immediately after append() and present after run_maintenance_tasks crosses LIB past the slice; verifies probes hit appended receivers and largely miss never-appended ones (<= 1 false positive across 5 probes).
- slice_dir_recv_bloom_fork_in_slice: appends a block, forks, and replays with a different receiver. Verifies canonical receivers probe present, the forked-out receiver also probes present (a harmless false positive), and never-appended receivers largely miss.
- slice_dir_recv_bloom_cross_slice_fork: the exact scenario that motivated this fix. Writes blocks spanning slice 0 and slice 1, forks the tail of slice 0 and head of slice 1, then advances LIB first past slice 0 only (slice 1 bloom absent) and then past slice 1 (both present). Asserts every canonical receiver in slice 0 probes as present; this would have failed under the earlier design.
All 110 trace_api_plugin tests pass. nodeop links cleanly; plugin_test sweep green. Documentation updated to reflect the new build model.
The magic byteswap was extracted to PR #309 targeting feature/kv-secondary-primary-id. This branch goes back to master's reference snapshot files and the original 0x57495245 magic; reference files will be regenerated on top of #309 when it lands.
Reverted:
- libraries/chain/include/sysio/chain/snapshot.hpp: magic restored to 0x57495245
- unittests/snapshots/{blocks.log,snap_v1.bin.gz,snap_v1.bin.json.gz,snap_v1.json.gz}: reverted to master
- unittests/test-data/consensus_blockchain/snapshot: reverted to master
- tests/sysio_util_snapshot_info_test.py: head_block_id reverted; flush fix retained
…istory

# Conflicts:
#	tests/sysio_util_snapshot_info_test.py
#	unittests/snapshots/blocks.log
#	unittests/snapshots/snap_v1.bin.gz
#	unittests/snapshots/snap_v1.bin.json.gz
#	unittests/snapshots/snap_v1.json.gz
get_actions previously emitted only the action-level cpu_usage_us / net_usage on each action variant. These are per-action and in different units from the transaction-level totals (action net_usage is bytes, trx net_usage_words is ceil(net_usage / 8)), so callers that needed per-transaction resource totals, e.g. PerformanceHarness's post-test extraction, could not derive them: filtering by action name drops sibling actions, and even with all actions the units differ.

Add trx_cpu_usage_us (uint32_t) and trx_net_usage_words (fc::unsigned_int) alongside the existing per-trx trx_id, block_num, block_time, and producer_block_id fields on every emitted action. The values are hoisted once per parent trx so a multi-match trx doesn't repeat the field reads.
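The unit mismatch driving this change, assuming the ceil(bytes / 8) conversion stated above (`net_usage_words` is a hypothetical helper name, not plugin code):

```python
from math import ceil

def net_usage_words(net_usage_bytes: int) -> int:
    # Transaction-level NET is metered in 8-byte words: ceil(bytes / 8).
    # Action-level net_usage stays in raw bytes, so the two fields are
    # not directly comparable even when only one action matched.
    return ceil(net_usage_bytes / 8)

assert net_usage_words(0) == 0
assert net_usage_words(1) == 1    # a single byte still occupies one word
assert net_usage_words(16) == 2
assert net_usage_words(17) == 3
```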
Follow-on to 61906f8. Per the slim shape's existing intent ("omits the resource usage fields"), trx_cpu_usage_us and trx_net_usage_words are now gated to full shape; get_token_transfers no longer emits them. Also document the two new fields in trace_api_plugin.md - example response, action field table, and the slim-omits list - and add a slim test asserting both the action-level and trx-level resource fields are absent.
…ions get_block already exposes per-block "status" (irreversible/pending). get_actions and get_token_transfers had no equivalent, so callers had to mix in chain/get_info LIB to know if an action's block was final; that read is not correlated with trace_api's data log and can disagree with the trace data they just consumed. Sourcing block_status from the same get_block tuple keeps trace_api as the single source of truth: every action emitted from a block carries that block's finality literal at the moment of read.

The slim shape (get_token_transfers) emits it too; exchanges crediting transfers need finality just as much as general consumers. Operators that want only-irreversible responses can still run nodeop with read-mode = irreversible; every block returned will then carry "irreversible". The literal is hoisted once per block (shared by every trx and every action in the block) rather than recomputed per emission.

The test mock fixture gained a per-block pending override, with a new test covering both irreversible and pending blocks across full and slim shapes. Docs updated with the new field on both endpoints' examples and the get_actions response-field table, including the irreversible-mode note.
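The once-per-block hoisting can be sketched as follows; `emit_block` and the dict shapes are illustrative stand-ins, not the plugin's types:

```python
def emit_block(block: dict) -> list[dict]:
    # The finality literal is computed once per block and stamped on every
    # action emitted from that block, rather than recomputed per emission.
    status = "irreversible" if block["irreversible"] else "pending"
    rows = []
    for trx in block["transactions"]:
        for act in trx["actions"]:
            rows.append({"receiver": act["receiver"],
                         "block_num": block["num"],
                         "block_status": status})
    return rows

blk = {"num": 42, "irreversible": False,
       "transactions": [{"actions": [{"receiver": "alice"},
                                     {"receiver": "bob"}]}]}
rows = emit_block(blk)
assert all(r["block_status"] == "pending" for r in rows)
```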
…ture fee1815 added optional cpu_usage_us / net_usage to action_trace_v0 and updated test_extraction.cpp's expected fixtures to set both to fc::unsigned_int{0} on every action, but the make_action_trace helper that drives the actual chain::action_trace inputs was not updated. to_action_trace copied the (empty) optionals through, so the actual JSON omitted both fields while expected JSON contained "cpu_usage_us":0 and "net_usage":0 -- block_extraction/basic_single_transaction_block and block_extraction/basic_multi_transaction_block both failed the block_trace_v0 equality check. Set both fields on the chain::action_trace returned by the helper so the fixture is internally consistent.
## Summary

Upgrades `trace_api_plugin` to serve as a complete history solution for exchanges and indexers. Five themes:

- **Auto-captured ABIs.** `--trace-rpc-abi` and `--trace-no-abis` are removed. The plugin captures ABIs from the chain directly (`setabi` traces in real time, lazy-fetch on first encounter), versioned per `global_sequence` so actions decode with the exact ABI that was in effect when they ran. An append-only `abi_log` replaces the earlier O(N^2) `abi_store` rewrite.
- **Continuity enforced at startup.** Gaps between existing trace data and the chain head shut the node down with operator-facing guidance (load a snapshot covering the gap, copy slices from another node, or delete the trace dir). Overlap from a snapshot restore is tolerated: re-applied blocks overwrite existing slice entries.
- **O(1) lookups.** Per-slice `trace_trx_idx_<range>.log` (an open-addressing hash table) replaces the linear scan in `get_transaction_trace`; per-slice `trace_blk_idx_<range>.log` (a sparse `uint64_t` array) replaces the linear scan in `get_block`. Both sidecars are built alongside the existing metadata log and fall back to the scan if a sidecar is missing or corrupt.
- **New query endpoints** over the existing trace data.
- **Per-slice bloom for `get_actions` slice-skip.** A new `trace_recv_bloom_<range>.log` sidecar carries two `boost::bloom` filters (K=7, FPR=0.01): one over action `receiver`, one over packed `(receiver, action)` pairs. `get_actions` probes the bloom once per slice in the requested block range; on a negative probe the entire slice is skipped with no `get_block` call. This turns "receiver never appears in this slice" from a full-slice scan into a single file read. A missing or CRC-corrupt sidecar falls back to the scan, which is fail-safe: a false negative would silently drop matching actions from the response. Built by the maintenance thread once the slice is past LIB; cleaned up alongside the other per-slice files during retention pruning.

## New endpoints
### POST /v1/trace_api/get_actions

Paginated search over action traces across a block range with optional filters. The handler clamps the block window to `[block_num_start, block_num_start + trace-max-block-range - 1]` and reports the actual window scanned in the response so clients can resume pagination by advancing `block_num_start`.

Request fields:

| Field | Default | Notes |
|---|---|---|
| `block_num_start` | `0` | Start of the block range. |
| `block_num_end` | `UINT32_MAX` | End of the block range. |
| `receiver` | optional | Matches `act.receiver`. |
| `account` | optional | Matches `act.account` (contract code). |
| `action` | optional | Matches the action name. |
| `include_notifications` | `false` | When `false`, specifying exactly one of `receiver`/`account` implicitly constrains the other to the same value (canonical execution only, no notification copies). When `true`, only the specified filters apply. |

Response rows carry the full action-trace execution-tree context (`action_ordinal`, `creator_action_ordinal`, `recv_sequence`, `auth_sequence`, `code_sequence`, `abi_sequence`, `account_ram_deltas`, optional per-action `cpu_usage_us`/`net_usage`), the ABI-decoded `params` and `return_data` (or a `decode_error` field with the raw hex when the ABI is unavailable), and the enclosing block / transaction context (`trx_id`, `block_num`, `block_time`, `producer_block_id`, `block_status`, `trx_cpu_usage_us`, `trx_net_usage_words`).

`block_status` is `"irreversible"` or `"pending"`, sourced from the same data-log tuple that powers `get_block`'s `status` field; operators wanting only-irreversible responses can run nodeop with `read-mode = irreversible`. Within a transaction, actions are ordered by `global_sequence` (execution order). `trx_cpu_usage_us` and `trx_net_usage_words` are the parent transaction's totals; they differ from the action-level `cpu_usage_us`/`net_usage` in scope (whole trx vs. single action) and units (action `net_usage` is bytes; trx `net_usage_words` is `ceil(net_usage / 8)`).

### POST /v1/trace_api/get_token_transfers
Convenience preset of `get_actions` with `receiver = account = <token_contract>` and `action = transfer`. Defaults `token_contract` to `sysio.token`. Because it constrains `receiver == account`, each transfer appears exactly once: the canonical execution, not the inline notifications to sender and recipient.

Returns a slim subset of `get_actions` fields optimized for exchange/indexer workflows (omits execution-tree ordinals, receipt sequences, RAM deltas, and CPU/NET usage); `block_status` is retained because exchanges crediting transfers need finality. Use `get_actions` directly if you need the full row shape.

## Breaking changes
- Removed: `--trace-rpc-abi`, `--trace-no-abis`. Operators no longer supply `abi.json` files.
- `action_trace_v0` gains `action_ordinal`, `creator_action_ordinal`, `closest_unnotified_ancestor_action_ordinal`, `recv_sequence`, `auth_sequence`, `code_sequence`, `abi_sequence`, `account_ram_deltas`, and optional `cpu_usage_us`/`net_usage`. The persisted trace format is incompatible with earlier builds; operators must delete the trace directory on upgrade.
- `status` removed from `transaction_trace_v0`. `action` -> `name` on actions, `account` -> `actor` on authorizations.

## New config

`--trace-max-block-range` (default `1000`, clamped to `[1, 10000]`) silently caps `block_num_end - block_num_start + 1` for the range-scanning query endpoints.

## On-disk layout (additions)

| File | Contents |
|---|---|
| `trace_trx_idx_<range>.log` | `trx_id_prefix64 -> block_num` hash table |
| `trace_blk_idx_<range>.log` | `uint64_t[stride]` array, `block_num -> trace offset` |
| `trace_recv_bloom_<range>.log` | `boost::bloom` filters over action `receiver` and packed `(receiver, action)` pairs; probed by `get_actions` for O(1) slice-skip |
| `abi_log.log` | `(account, global_seq, blob)` records + CRC32 |

## Docs

Full endpoint reference, configuration table, on-disk layout, ABI versioning semantics, pagination guide, and exchange/indexer integration notes live in `plugins/trace_api_plugin/trace_api_plugin.md`.
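As an illustration of the get_actions window clamping and pagination-resume rule described above, a minimal client-side sketch in Python (`clamp_window` and `paginate` are illustrative helpers, not the plugin's wire format; the server reports the scanned window so the client can advance `block_num_start` past it):

```python
TRACE_MAX_BLOCK_RANGE = 1000  # nodeop default for --trace-max-block-range

def clamp_window(block_num_start: int, block_num_end: int) -> tuple[int, int]:
    # Server-side rule: the scanned window is
    # [block_num_start, block_num_start + trace-max-block-range - 1],
    # further capped by the requested block_num_end.
    hard_end = block_num_start + TRACE_MAX_BLOCK_RANGE - 1
    return block_num_start, min(block_num_end, hard_end)

def paginate(block_num_start: int, block_num_end: int):
    # Client-side resume rule: issue one request per window, then advance
    # block_num_start one past the window the server actually scanned.
    while block_num_start <= block_num_end:
        start, end = clamp_window(block_num_start, block_num_end)
        yield start, end
        block_num_start = end + 1

windows = list(paginate(0, 2499))
assert windows == [(0, 999), (1000, 1999), (2000, 2499)]
```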