
rebuild fk reverse index after partition snapshot import#56

Merged
fabracht merged 3 commits into main from cluster-fk-reverse-index-snapshot
May 7, 2026

Conversation


fabracht commented May 4, 2026

Summary

  • Closes the "Known gap" called out in the mqdb-cluster 0.3.2 CHANGELOG entry of PR #54 (add partition snapshot exports for schema/index/unique/fk/constraint stores). After a rebalance-driven replica promotion, the new primary received the imported db_data and FK constraints via the partition snapshot, but its in-memory FkReverseIndex cache stayed empty for the imported records. start_fk_reverse_lookup and handle_fk_reverse_lookup_request therefore returned empty results for any record sitting on a newly imported partition: ON DELETE CASCADE missed children that the new primary owned, and ON DELETE RESTRICT silently allowed deletes that should have been blocked.
  • StoreManager::import_partition now ends with a private rebuild_fk_indexes_after_import step. It iterates every registered FK constraint and calls the existing rebuild_fk_index_for_constraint helper, which walks db_data.list(source_entity) (now populated with the just-imported records) and seeds the reverse index. This mirrors the existing pattern in apply_db_constraint, where a constraint Insert arriving via Raft replication triggers the same rebuild; see the sketch after this list.
  • mqdb-cluster 0.3.3 → 0.3.4, with a CHANGELOG entry under 2026-05-03.
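
A minimal sketch of the import-side rebuild described above. The store shapes, record representation, and constraint type here are invented for illustration; only the method names rebuild_fk_indexes_after_import and rebuild_fk_index_for_constraint, and the db_data walk they model, come from this PR:

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical minimal shapes for illustration; the real mqdb-cluster types differ.
#[derive(Clone)]
struct FkConstraint {
    source_entity: String, // child entity that holds the FK field
    field: String,         // FK field name on the child record
}

// Reverse index: (target record id, FK field) -> child record ids.
#[derive(Default)]
struct FkReverseIndex {
    map: HashMap<(String, String), HashSet<String>>,
}

impl FkReverseIndex {
    fn insert(&mut self, target_id: &str, field: &str, source_id: &str) {
        self.map
            .entry((target_id.to_string(), field.to_string()))
            .or_default()
            .insert(source_id.to_string());
    }
}

#[derive(Default)]
struct StoreManager {
    // entity name -> (record id -> record body)
    db_data: HashMap<String, HashMap<String, serde_json::Value>>,
    fk_constraints: Vec<FkConstraint>,
    fk_reverse_index: FkReverseIndex,
}

impl StoreManager {
    // New final step of import_partition: replay every locally known FK
    // constraint through the same helper the Raft apply path uses.
    fn rebuild_fk_indexes_after_import(&mut self) {
        let constraints = self.fk_constraints.clone();
        for c in &constraints {
            self.rebuild_fk_index_for_constraint(c);
        }
    }

    fn rebuild_fk_index_for_constraint(&mut self, c: &FkConstraint) {
        // Equivalent of walking db_data.list(source_entity), which is now
        // populated with the just-imported records.
        if let Some(records) = self.db_data.get(&c.source_entity) {
            for (child_id, record) in records {
                // Records without a readable FK value are skipped, matching
                // the "malformed JSON skipped" unit test in this PR.
                if let Some(target) = record.get(&c.field).and_then(|v| v.as_str()) {
                    self.fk_reverse_index.insert(target, &c.field, child_id);
                }
            }
        }
    }
}
```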

Why import-side only

StoreManager::clear_partition is never called anywhere in the codebase, so there is no demotion path that could leave stale entries behind. The downstream is_primary_for_partition filter (node_controller/fk.rs:357-365) also drops any reverse-index entry for a partition the node is no longer primary for, so even theoretical staleness is masked. Tracked as future work if clear_partition ever gets wired in.
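
For illustration, that masking filter behaves roughly as below; the types and signature are assumptions, not the actual fk.rs API:

```rust
// Illustrative only: drop reverse-index hits for partitions this node no
// longer owns as primary, so even a stale entry cannot trigger a cascade.
struct ReverseHit {
    partition_id: u64,
    child_id: String,
}

fn filter_by_primaryship(
    hits: Vec<ReverseHit>,
    is_primary_for_partition: impl Fn(u64) -> bool,
) -> Vec<ReverseHit> {
    hits.into_iter()
        .filter(|h| is_primary_for_partition(h.partition_id))
        .collect()
}
```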

Discovered while running the new E2E (separate follow-up)

While running examples/cluster-rebalance-stores/run.sh with a fresh enterprise license, I observed via tracing that constraints don't reach all nodes uniformly. Across runs the leader (node 1) consistently held both the unique and FK constraints locally; nodes 2/3 sometimes held a subset (e.g. node 2 had 0, node 3 had only the FK); a freshly joined node 4 had 0. Constraints are routed through schema_partition(entity), so any node that doesn't own that partition can reach the constraint only via forwarding, not in its local db_constraints (sketched below).
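
A rough model of that routing, assuming a hash-based schema_partition (the real partition assignment in mqdb-cluster may differ):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for schema_partition(entity): a constraint lives
// only on the owners of this partition, so every other node sees it via
// forwarding rather than in its local db_constraints.
fn schema_partition(entity: &str, partition_count: u64) -> u64 {
    let mut h = DefaultHasher::new();
    entity.hash(&mut h);
    h.finish() % partition_count
}

fn holds_constraint_locally(entity: &str, owned: &[u64], partition_count: u64) -> bool {
    owned.contains(&schema_partition(entity, partition_count))
}
```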

This PR's rebuild_fk_indexes_after_import is correct in its scope (it rebuilds for whatever constraints the importing node has locally), but the broader cascade-through-rebalanced-node behavior is not solved by this PR alone — it's governed by constraint replication topology. The CHANGELOG documents this finding as a separate follow-up alongside the schema replication topology issue first noted in the 0.3.2 CHANGELOG entry.

Test plan

  • cargo test -p mqdb-cluster --lib — 466 → 478 cluster lib tests (12 additions):
    • 4 direct FkReverseIndex unit tests in data_store.rs (insert/lookup/remove, idempotent inserts, removing unknown source ids, field-scoped keys).
    • 7 update_fk_reverse_index and rebuild_fk_index_for_constraint unit tests in constraint_ops.rs (Insert/Update/Delete paths, no-op without constraint, malformed JSON skipped, no-op for non-FK constraint).
    • 1 integration test import_partition_rebuilds_fk_reverse_index in partition_io.rs, verified by toggling the rebuild call: it fails with left: [], right: ["c1"] when the call is removed and passes when restored (see the test sketch after this list).
  • cargo make clippy — clean (pedantic, all targets + wasm).
  • cargo make format-check — clean.
  • cargo make dev — full workspace green.
  • Pre-commit hook (format-check + clippy) ran on each commit and passed.
  • examples/cluster-rebalance-stores/run.sh run end-to-end with a locally generated enterprise license. Phase 2 now creates 20 extra child comments (2 per parent). Phase 5 ends with a cascade-via-node-4 observation (not a hard assertion, due to the constraint-replication finding above) showing the fraction of eligible cascades that completed; in runs where the constraint partition lands on a cascade-initiating primary, all eligible children are removed.
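
For reference, the shape of the new integration test, sketched with invented helper names (register_fk_constraint, snapshot_with_record, and fk_reverse_lookup are hypothetical; the real test lives in partition_io.rs):

```rust
#[test]
fn import_partition_rebuilds_fk_reverse_index_sketch() {
    // All builder/helper names here are hypothetical.
    let mut mgr = StoreManager::default();
    mgr.register_fk_constraint("comments", "parent_id");
    let snapshot = snapshot_with_record("comments", "c1", r#"{"parent_id":"p1"}"#);

    mgr.import_partition(7, snapshot).unwrap();

    // Before this PR the lookup returned [], producing the
    // `left: [], right: ["c1"]` failure noted above; with the rebuild
    // wired in, the imported child is visible.
    assert_eq!(mgr.fk_reverse_lookup("p1", "parent_id"), vec!["c1"]);
}
```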

fabracht force-pushed the cluster-fk-reverse-index-snapshot branch from 6c76b28 to 99e740f on May 7, 2026 02:30
fabracht merged commit 6f232e4 into main on May 7, 2026
5 checks passed
fabracht deleted the cluster-fk-reverse-index-snapshot branch on May 7, 2026 14:36