
Optimize kv_idx_update: use modify instead of remove+create #282

Merged
heifner merged 4 commits into master from feature/chainbase-optimizations on May 9, 2026

Conversation

Contributor

@heifner heifner commented Apr 3, 2026

Summary

  • kv_idx_update now uses db.modify() instead of db.remove() + db.create() when a secondary key changes
  • undo_index::post_modify rebalances only the affected AVL tree index, reducing tree operations from 6 to 2 per secondary index update
  • Eliminates a node deallocation/reallocation cycle and one redundant shared_blob allocation (pri_key no longer reassigned)
  • Caches get_deep_mind_logger result in kv_set update path to avoid redundant call

Performance

Estimated savings of roughly 1-3 µs per secondary index update, scaling with table size (deeper AVL trees mean more comparisons per operation). Most impactful for contracts with frequent secondary key modifications (orderbooks, inventories, sorted collections).

undo_index::post_modify handles AVL tree rebalancing when composite
index key fields change. This avoids node deallocation, fresh node
allocation, and reinsertion into all 3 AVL trees per secondary index
update. Added test proving modify correctly rekeys across indices.
Base automatically changed from feature/kv-remove-sso to master April 3, 2026 20:16
Contributor

@huangminghuang huangminghuang left a comment


Review Finding

The main behavioral risk is that db.modify preserves the kv_index_object
id, while the old remove/create implementation replaced it with a new id.

Secondary KV iterator handles cache both:

  • the current secondary and primary key bytes, used by the slow re-seek path
  • the chainbase object id, used by the fast iterator_to path

kv_idx_next and kv_idx_prev prefer the cached id when it is still valid.
After PR #282, a secondary iterator acquired before kv_idx_update can still
find the modified object by id, but that object may now be at a different
secondary-key position.

Example:

  1. A contract gets a secondary iterator handle for "alice".
  2. It calls kv_idx_update to change "alice" to "zebra".
  3. It calls kv_idx_next(handle).

With the old remove/create behavior, the cached id no longer existed, so
kv_idx_next fell back to the stored key bytes and resumed from the old
"alice" position. It would likely land on "bob".

With db.modify, the cached id still exists, so kv_idx_next advances from
the moved "zebra" position and can return end, skipping entries that were
after "alice".

This also affects kv_idx_prev, kv_idx_key, and kv_idx_primary_key.
For example, kv_idx_key can now return the new key for an old handle instead
of reporting iterator_erased, which differs from the old remove/create model.

Proposed Fix

Invalidate only affected secondary iterator caches before calling db.modify.
The important part is to clear only cached_id, not the stored key bytes or
iterator status. That forces existing handles to use the old key-based slow
path on their next operation.

Add a helper to kv_iterator_pool:

void invalidate_secondary_cache(account_name code, uint16_t table_id, int64_t object_id) {
   for (auto& s : _slots) {
      if (s.in_use &&
          !s.is_primary &&
          s.code == code &&
          s.table_id == table_id &&
          s.cached_id == object_id) {
         s.cached_id = -1;
      }
   }
}

Then call it in apply_context::kv_idx_update before modifying the object:

const auto modified_id = itr->id._id;

kv_iterators.invalidate_secondary_cache(receiver, table_id, modified_id);

db.modify(*itr, [&](auto& o) {
   o.payer = payer;
   o.sec_key.assign(new_sec_key, new_sec_key_size);
});

This preserves the old observable iterator behavior while still allowing
kv_idx_update to use db.modify.

Performance Impact

The cache invalidation should not erase the performance win that PR #282 is
aiming for.

The removed work is the expensive part: chainbase object removal, object
creation, allocation, full field assignment, new id creation, and delete/create
undo bookkeeping.

The proposed invalidation only scans the in-action KV iterator slot pool, which
is bounded by config::max_kv_iterators and is currently 1024. It only clears
slots whose cached id matches the object being modified. It does not touch
chainbase indexes and does not allocate objects.

In typical contract execution, the number of live iterators should be small.
Even in the worst case, this is a bounded linear scan over 1024 slots per
kv_idx_update, which is likely much cheaper than the old remove/create path.

Suggested Test

Add a contract-level test that exercises a live secondary iterator across
kv_idx_update:

  1. Store entries ordered as "alice", "bob", "charlie".
  2. Get a secondary iterator handle for "alice".
  3. Update "alice" to "zebra".
  4. Call kv_idx_next(handle).
  5. Verify it resumes according to the old "alice" position, not the moved
    "zebra" position.

A similar test for kv_idx_key(handle) after the update would also be useful
to confirm whether the API should continue reporting iterator_erased for the
old handle.

db.modify preserves the kv_index_object's chainbase id, so a live
secondary iterator that cached the id would advance from the
post-modify position and silently skip entries when sec_key changed.
The prior remove+create path worked because the new object had a
fresh id, so the cached id no longer resolved and iteration fell
back to the slow re-seek using stored key bytes.

Add kv_iterator_pool::invalidate_secondary_cache that clears cached_id
only on matching secondary slots, leaving stored key bytes and status
intact so the next op resumes from the old position. kv_idx_update
calls it before db.modify.
@heifner heifner requested a review from huangminghuang May 8, 2026 01:43
heifner added 2 commits May 8, 2026 11:12
…ptimizations

Resolve conflict in libraries/chain/apply_context.cpp at the kv_idx_update db.modify lambda.
Master's remove+create added o.table_id = table_id; the merged db.modify path drops the
assignment since idx.find already constrains table_id on the located row.

Adapt HEAD's invalidate_secondary_cache (added in 1146c38) to master's uint16_t table_id
kv API:
- kv_iterator_slot/invalidate_secondary_cache use (code, table_id) instead of (code, table, index_id)
- kv_idx_update call site updated accordingly
- kv_tests kv_index_modify_rekeys_correctly and kv_iterator_pool_invalidate_secondary_cache
  rewritten against the new schema; the by_code_table_idx_prikey lookup is replaced with a
  4-tuple find on by_code_table_id_seckey since master has only the seckey index.
The merge from master in cd02d6d brought commit 30b1697, which collapsed kv_index_object's (table, index_id) into a single uint16_t table_id and reduced kv_iterator_pool::{allocate_secondary, invalidate_secondary_cache} to 2/3 args. apply_context.cpp::kv_idx_update merged correctly; the two test cases added in 4fe6dfd and 1146c38 still referenced the old fields and signatures.

- kv_index_modify_rekeys_correctly: rewritten against table_id and by_code_table_id_seckey. Replaces the by_code_table_idx_prikey lookup (no longer a separate index) with a composite find on (sec_key + pri_key) plus an explicit "old key no longer resolves" check.
- kv_iterator_pool_invalidate_secondary_cache: 2-arg allocate_secondary, 3-arg invalidate_secondary_cache. Old (different table) vs (different index) slots collapse to "different table_id"; kept as two distinct table_ids so the loop preserves more than one non-matching slot.
@heifner heifner merged commit 39e8562 into master May 9, 2026
28 checks passed
@heifner heifner deleted the feature/chainbase-optimizations branch May 9, 2026 12:12
