feat: implement Merkle Patricia Trie for state verification#88
feat: implement Merkle Patricia Trie for state verification#88SIDDHANTCOOKIE wants to merge 3 commits into
Conversation
|
Warning Review limit reached
More reviews will be available in 29 minutes and 23 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. WalkthroughAdds a Merkle Patricia Trie for deterministic state roots; Block headers now include ChangesState Root Commitment via Merkle Patricia Trie
Sequence DiagramsequenceDiagram
participant Miner as mine_and_process_block
participant TempState as temp_state (copy of State)
participant Trie as MPT Trie
participant Chain as Blockchain
Miner->>TempState: validate_and_apply(transactions)
TempState->>Trie: put(serialized accounts in deterministic order)
Trie->>TempState: return root_hash()
TempState->>Miner: state_root()
Miner->>Chain: add_block(Block(state_root, miner, txs))
Chain->>Chain: copy state -> temp_state, apply txs, compute temp_state.state_root()
Chain->>Chain: compare block.state_root == temp_state.state_root()
alt match
Chain->>Chain: accept block
else mismatch
Chain->>Chain: warn and reject
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
minichain/p2p.py (1)
182-200:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
state_rootis still optional at the schema boundary.This validator never actually requires the key to be present:
payload.get("state_root")returnsNone, andNoneis accepted by(str, type(None)). A peer can therefore omitstate_rootand still pass_validate_message(). Check that all required keys are present first, and validatestate_rootagainststrrather than(str, type(None)).Proposed fix
def _validate_block_payload(self, payload): if not isinstance(payload, dict): return False required_fields = { "index": int, "previous_hash": str, "merkle_root": (str, type(None)), - "state_root": (str, type(None)), + "state_root": str, "transactions": list, "timestamp": int, "difficulty": (int, type(None)), "nonce": int, "hash": str, } optional_fields = {"miner": str} allowed_fields = set(required_fields) | set(optional_fields) + if not set(required_fields).issubset(payload): + return False if not set(payload).issubset(allowed_fields): return False for field, expected_type in required_fields.items(): - if not isinstance(payload.get(field), expected_type): + if not isinstance(payload[field], expected_type): return False🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@minichain/p2p.py` around lines 182 - 200, The validator currently allows omitting required keys because it only checks that payload keys are a subset of allowed_fields and uses payload.get(...) which returns None (accepted for state_root because its expected type includes None); update _validate_message to explicitly verify all required keys are present (e.g., assert set(required_fields).issubset(payload) or loop to check membership before type checks) and change the expected type for "state_root" in required_fields from (str, type(None)) to str so missing or None values fail validation; keep the existing allowed_fields check but perform the required-key presence check before the isinstance validations that iterate required_fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@minichain/chain.py`:
- Around line 123-126: The current verification uses temp_state after replaying
transactions only, but you must commit and verify the post-reward state
(including miner/reward) so all nodes reach the same state_root; update the
block acceptance path (the code using block.state_root, temp_state.state_root(),
and chain.state) to apply the miner reward to temp_state before computing and
comparing state_root, ensure the miner/reward fields are included in the hashed
block payload used to derive state_root, and only commit chain.state after
verifying that temp_state.state_root() (after reward) matches block.state_root.
In `@minichain/mpt.py`:
- Around line 8-10: The trie key conversion can crash on non-hex account keys;
add explicit hex-format validation and graceful errors: update to_nibbles to
first ensure key_hex is a str of hex characters (e.g., via a regex or
int(...,16) guard) and raise a clear ValueError mentioning the offending key,
and/or add a validation call from State.state_root before inserting keys to
confirm all keys are valid hex; also enforce hex checks at ingress points by
calling the existing is_valid_receiver (or similar) from _validate_sync_payload,
persistence.load/genesis allocation loading, and wherever accounts are inserted
so invalid addresses are rejected with a clear error instead of crashing in
to_nibbles.
In `@minichain/state.py`:
- Around line 16-27: state_root currently rebuilds a new Trie from self.accounts
on every call (using Trie(), trie.put(...), trie.root_hash()) which is O(N) per
invocation; change this to maintain the Trie incrementally by storing a
persistent trie instance (e.g., self._trie) and its cached root (e.g.,
self._cached_state_root) that are updated inside mutation points such as
apply_transaction and any methods that modify self.accounts (and
invalidated/updated in add_block when a rollback/alternative path occurs) so
state_root simply returns the cached root; ensure Trie.put/delete calls are
performed in apply_transaction (or a helper like _update_trie_for_account) and
that mutations clear or update the cached root to keep consistency.
- Around line 16-27: state_root() builds the committed MPT from self.accounts
including lazily-created empty accounts from State.get_account(), causing
divergent roots; update state_root (in class State) to skip inserting accounts
that are "empty" (e.g., default-created accounts with zero balance, zero nonce
and no storage) when iterating self.accounts before calling Trie.put, preserving
the sorted iteration and existing use of Trie.put/root_hash so only non-empty
accounts are committed to the trie.
---
Outside diff comments:
In `@minichain/p2p.py`:
- Around line 182-200: The validator currently allows omitting required keys
because it only checks that payload keys are a subset of allowed_fields and uses
payload.get(...) which returns None (accepted for state_root because its
expected type includes None); update _validate_message to explicitly verify all
required keys are present (e.g., assert set(required_fields).issubset(payload)
or loop to check membership before type checks) and change the expected type for
"state_root" in required_fields from (str, type(None)) to str so missing or None
values fail validation; keep the existing allowed_fields check but perform the
required-key presence check before the isinstance validations that iterate
required_fields.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a1a8b307-ffef-4f83-939b-767a0e896c67
📒 Files selected for processing (8)
main.pyminichain/block.pyminichain/chain.pyminichain/mpt.pyminichain/p2p.pyminichain/state.pytests/test_persistence.pytests/test_protocol_hardening.py
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
main.py (1)
149-157:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winRemove the redundant mining-reward credit —
add_blocknow credits it, so received blocks are double-credited.
chain.add_blockalready credits the miner reward into the committed state (minichain/chain.pylines 123-124) and commits that post-rewardtemp_state. Crediting again here adds the reward a second time, so a receiving node'schain.stateno longer matches the block'sstate_root. Every subsequent block then fails thestate_rootcheck, permanently desyncing receiving nodes from the miner. Whenblock.minerisNone, the fallback also creditsBURN_ADDRESS, corrupting state the same way.🐛 Proposed fix — drop the post-acceptance credit
if chain.add_block(block): logger.info("📥 Received Block #%d — added to chain", block.index) - # Apply mining reward for the remote miner (burn address as placeholder) - miner = payload.get("miner", BURN_ADDRESS) - chain.state.credit_mining_reward(miner) - # Drop only confirmed transactions so higher nonces can remain queued. mempool.remove_transactions(block.transactions)Note:
BURN_ADDRESS(line 34) may become unused after this removal.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@main.py` around lines 149 - 157, Received blocks are being credited twice: chain.add_block already applies the miner reward and commits temp_state, so remove the redundant post-acceptance credit. Delete or stop calling chain.state.credit_mining_reward(miner) in the block-receive path (the block acceptance branch where logger.info("📥 Received Block #%d — added to chain") is logged) and leave mempool.remove_transactions(block.transactions) intact; ensure you still derive miner with payload.get("miner", BURN_ADDRESS) only if needed elsewhere (BURN_ADDRESS may become unused and can be cleaned up).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@minichain/mpt.py`:
- Around line 10-13: The current except block discards the original parse
exception which loses diagnostic details; update the handler around the list
comprehension that parses key_hex (the try/except that returns [int(c, 16) for c
in key_hex]) to capture the original ValueError (e.g., except ValueError as e)
and re-raise the new ValueError using exception chaining (raise
ValueError(f"Invalid MPT key: '{key_hex}'. Keys must be valid hex strings.")
from e) so the original parse failure is preserved for debugging.
---
Outside diff comments:
In `@main.py`:
- Around line 149-157: Received blocks are being credited twice: chain.add_block
already applies the miner reward and commits temp_state, so remove the redundant
post-acceptance credit. Delete or stop calling
chain.state.credit_mining_reward(miner) in the block-receive path (the block
acceptance branch where logger.info("📥 Received Block #%d — added to chain") is
logged) and leave mempool.remove_transactions(block.transactions) intact; ensure
you still derive miner with payload.get("miner", BURN_ADDRESS) only if needed
elsewhere (BURN_ADDRESS may become unused and can be cleaned up).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 59325ee2-e2e3-4516-b073-72aeca3f7c90
📒 Files selected for processing (6)
main.pyminichain/block.pyminichain/chain.pyminichain/mpt.pyminichain/p2p.pyminichain/state.py
| try: | ||
| return [int(c, 16) for c in key_hex] | ||
| except ValueError: | ||
| raise ValueError(f"Invalid MPT key: '{key_hex}'. Keys must be valid hex strings.") |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Preserve the original parse failure when rethrowing.
This keeps malformed-key errors easier to diagnose without changing behavior.
Proposed fix
- try:
- return [int(c, 16) for c in key_hex]
- except ValueError:
- raise ValueError(f"Invalid MPT key: '{key_hex}'. Keys must be valid hex strings.")
+ try:
+ return [int(c, 16) for c in key_hex]
+ except ValueError as err:
+ raise ValueError(
+ f"Invalid MPT key: '{key_hex}'. Keys must be valid hex strings."
+ ) from err🧰 Tools
🪛 Ruff (0.15.14)
[warning] 13-13: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
[warning] 13-13: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@minichain/mpt.py` around lines 10 - 13, The current except block discards the
original parse exception which loses diagnostic details; update the handler
around the list comprehension that parses key_hex (the try/except that returns
[int(c, 16) for c in key_hex]) to capture the original ValueError (e.g., except
ValueError as e) and re-raise the new ValueError using exception chaining (raise
ValueError(f"Invalid MPT key: '{key_hex}'. Keys must be valid hex strings.")
from e) so the original parse failure is preserved for debugging.
There was a problem hiding this comment.
Understood, @SIDDHANTCOOKIE — no problem, I'll leave it as-is.
ʕ•ᴥ•ʔ
Addressed Issues:
This PR implements Feature 1 from the roadmap, giving MiniChain cryptographic verifiability. This ensures that all nodes on the network mathematically agree on the exact global state of account balances and smart contract data.
Screenshots/Recordings:
TODO: If applicable, add screenshots or recordings that demonstrate the interface before and after the changes.
Additional Notes:
mpt.py) supporting Leaf, Extension, and Branch nodes.Stateto dynamically compute and return a cryptographicstate_rootvia the MPT upon block evaluation.state_rootto theBlockheader, and enforced that incoming blocks must perfectly match the locally computed state root duringadd_block().AI Usage Disclosure:
We encourage contributors to use AI tools responsibly when creating Pull Requests. While AI can be a valuable aid, it is essential to ensure that your contributions meet the task requirements, build successfully, include relevant tests, and pass all linters. Submissions that do not meet these standards may be closed without warning to maintain the quality and integrity of the project. Please take the time to understand the changes you are proposing and their impact. AI slop is strongly discouraged and may lead to banning and blocking. Do not spam our repos with AI slop.
Check one of the checkboxes below:
I have used the following AI models and tools: TODO
Checklist
Summary by CodeRabbit
New Features
Bug Fixes
Tests