Skip to content

Thread &dyn Crypto through all keccak call sites in block validation#6365

Open
ilitteri wants to merge 2 commits intomainfrom
thread-crypto-through-keccak
Open

Thread &dyn Crypto through all keccak call sites in block validation#6365
ilitteri wants to merge 2 commits intomainfrom
thread-crypto-through-keccak

Conversation

@ilitteri
Copy link
Collaborator

Motivation

ZisK zkVM team reported ethrex produces only 1,168 keccak precompile ops vs Reth's 28,228 for the same block (block 24640819), causing a 3.3x RISC-V cycle penalty (286M vs 87M cycles). Root cause: only ~2 of ~86 keccak call sites went through the Crypto trait that zkVM teams can override — the rest called keccak_hash() directly, which on RISC-V falls back to tiny_keccak instead of using the zkVM's keccak circuit accelerator.

Description

Passes &dyn Crypto as a function parameter through every function in the keccak call chain, bottom-up from leaf functions:

  • Trie node hashing: NodeHash::from_encoded, NodeHash::finalize, {Leaf,Extension,Branch}Node::compute_hash_no_alloc, NodeRef::commit, Trie::hash_no_commit, Trie::compute_hash_from_unsorted_iter, Trie::collect_changes_since_last_hash
  • Block validation: compute_transactions_root, compute_receipts_root, compute_withdrawals_root, validate_block_body, BlockHeader::compute_block_hash, bloom_from_logs, Receipt::encode_inner_with_bloom
  • Account/type hashing: Code::from_bytecode, code_hash, compute_storage_root
  • Guest program state: hash_address, hash_key, apply_account_updates, state_trie_root, initialize_block_header_hashes, get_first_invalid_block_hash, get_block_hash, get_account_state, get_storage_slot, get_valid_storage_trie, GuestProgramState::from_witness
  • Guest execution entry: execute_blocks receives Arc<dyn Crypto + Send + Sync> and threads it through all validation, trie, and state operations

Non-guest callers (blockchain, storage, P2P, RPC, CLI) pass &NativeCrypto — a zero-sized struct the compiler devirtualizes to zero overhead.

GuestProgramStateWrapper stores Arc<dyn Crypto + Send + Sync> to bridge the VmDatabase trait (which can't take crypto params) with underlying methods that need crypto for hash_address/hash_key/get_valid_storage_trie/get_block_hash/get_account_state/get_storage_slot.

How to Test

  • cargo check --workspace passes
  • cargo test --package ethrex-common --package ethrex-trie — all 112 tests pass
  • Grep for keccak_hash( in crates/guest-program/ — only ECRECOVER crypto internals remain (expected, these are inside the Crypto trait impl itself)
  • Grep for NativeCrypto in crates/common/ and crates/vm/levm/ — only in error formatting paths, tests, RLP encoding (safe due to memoization), and native-only code (genesis, trie_sorted, verify_range, logger)

Closes

N/A — addresses ZisK team feedback on keccak precompile utilization

…ion path

ZisK zkVM team reported ethrex produces only 1,168 keccak precompile ops
vs Reth's 28,228 for the same block, causing a 3.3x RISC-V cycle penalty.
Root cause: only ~2 of ~86 keccak call sites went through the Crypto trait
that zkVM teams can override.

This change passes &dyn Crypto as a function parameter through every
function in the keccak call chain — from trie node hashing (NodeHash,
LeafNode, ExtensionNode, BranchNode), through Trie root computation,
block validation (transactions/receipts/withdrawals roots, bloom, block
hash), account hashing (code_hash, storage_root), to the guest program
execution entry point (execute_blocks).

Non-guest callers pass &NativeCrypto (zero-sized struct that the compiler
can devirtualize and inline for zero overhead). The GuestProgramStateWrapper
stores Arc<dyn Crypto + Send + Sync> to bridge the VmDatabase trait
(which cannot take crypto params) with the underlying methods that now
need crypto for hash_address/hash_key/get_valid_storage_trie.
1. Receipt::encode_inner_with_bloom hardcoded NativeCrypto for bloom
   computation. Every receipt's log addresses and topics were hashed
   with NativeCrypto during validate_receipts_root, bypassing the
   zkVM accelerator. Now takes crypto parameter.

2. GuestProgramState initialization (from ExecutionWitness) hashed
   all state trie nodes, storage trie nodes, and contract code with
   NativeCrypto. These are the most expensive keccak operations in
   block validation. Replaced TryFrom with from_witness(witness, crypto).

3. get_first_invalid_block_hash and get_block_hash called header.hash()
   which falls back to NativeCrypto. Now take crypto and use
   compute_block_hash(crypto) directly, removing the fragile ordering
   dependency on initialize_block_header_hashes.
@ilitteri ilitteri requested review from a team, ManuelBilbao and avilagaston9 as code owners March 13, 2026 21:01
Copilot AI review requested due to automatic review settings March 13, 2026 21:01
@github-actions
Copy link

🤖 Kimi Code Review

Review Summary

This PR introduces a significant architectural change by making cryptographic operations pluggable through a Crypto trait instead of hard-coding NativeCrypto. The changes are extensive and touch many files, but the implementation appears consistent and follows good practices.

Key Observations

  1. Architecture: The change from hard-coded NativeCrypto to a pluggable Crypto trait is well-executed. This enables:

    • ZK-friendly implementations for guest programs
    • Test implementations with different hashing
    • Better separation of concerns
  2. Consistency: The changes are applied consistently across the codebase, including:

    • All trie operations now take a crypto parameter
    • Block validation functions now accept a crypto parameter
    • Receipt and transaction root calculations use the provided crypto
  3. Security: No obvious security issues introduced. The cryptographic operations remain deterministic and consistent.

Minor Issues

  1. Line 736 in cmd/ethrex/cli.rs: The validate_block_body call passes &ethrex_crypto::NativeCrypto but this could be parameterized for consistency with other changes.

  2. Line 854 in cmd/ethrex/cli.rs: Same as above - could use a parameterized crypto instead of hard-coded NativeCrypto.

  3. Line 47 in crates/common/trie/node.rs: In PartialEq implementation for NodeRef, it uses &NativeCrypto directly. This could be inconsistent if the crypto implementation changes.

  4. Line 65 in crates/common/types/receipt.rs: The ReceiptWithBloom::from method uses &ethrex_crypto::NativeCrypto directly, which might not be consistent with the pluggable approach.

Recommendations

  1. Consider creating a DefaultCrypto type alias or constant in appropriate modules to avoid scattering NativeCrypto references throughout the codebase.

  2. For the guest programs, ensure the Crypto trait implementation is properly zeroized or cleared to prevent side-channel attacks in ZK contexts.

  3. Add documentation to the Crypto trait clarifying the security requirements for implementations, especially for ZK contexts.

  4. Consider adding tests that verify different Crypto implementations produce consistent results for the same inputs.

Overall, this is a solid architectural improvement that maintains correctness while adding flexibility. The changes are extensive but appear to be implemented correctly throughout the codebase.


Automated review by Kimi (Moonshot AI) · custom prompt

@github-actions
Copy link

🤖 Codex Code Review

  1. High: Crypto provider is still hardcoded inside trie node encoding, so custom Crypto is not fully respected.
    Paths:

    BranchNode/ExtensionNode encoding now always uses NativeCrypto for child hashes. That bypasses the injected provider and can produce inconsistent trie hashes when a non-native provider is used (or skip expected provider-specific behavior).

  2. High: NodeRef hash memoization is provider-agnostic (OnceLock<NodeHash>), but hash computation is now provider-dependent.
    Paths:

    A hash computed once (possibly via hardcoded NativeCrypto) is cached and reused for later calls with a different provider. This can silently return wrong hashes in mixed/provider-custom contexts.

  3. Medium: BlockHeader::hash() remains permanently tied to NativeCrypto, while other paths now use compute_block_hash(crypto).
    Paths:

    This creates a cache/coherency footgun: if hash() is called before custom-hash initialization/validation, set_hash_or_validate can fail or compare against a value derived from a different provider.

Overall: most changes are consistent mechanical propagation, but the trie/hash-cache/provider interactions above are correctness-critical and should be fixed before merge.


Automated review by OpenAI Codex · custom prompt

@github-actions
Copy link

Lines of code report

Total lines added: 64
Total lines removed: 2
Total lines changed: 66

Detailed view
+-------------------------------------------------------+-------+------+
| File                                                  | Lines | Diff |
+-------------------------------------------------------+-------+------+
| ethrex/crates/blockchain/payload.rs                   | 736   | +2   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/trie/logger.rs                   | 46    | +1   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/trie/node.rs                     | 362   | +10  |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/trie/node/branch.rs              | 572   | +2   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/trie/node/extension.rs           | 517   | +2   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/trie/node/leaf.rs                | 304   | +2   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/trie/rlp.rs                      | 137   | +1   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/trie/trie.rs                     | 537   | +7   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/trie/trie_sorted.rs              | 431   | +1   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/types/account.rs                 | 325   | -2   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/types/block.rs                   | 953   | +3   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/types/block_execution_witness.rs | 419   | +13  |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/validation.rs                    | 183   | +2   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/guest-program/src/common/execution.rs   | 135   | +2   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/guest-program/src/l1/program.rs         | 39    | +1   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/guest-program/src/l2/program.rs         | 61    | +1   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/peer_handler.rs          | 547   | +1   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/snap/client.rs           | 1183  | +1   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/sync/healing/state.rs    | 390   | +1   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/sync/healing/storage.rs  | 628   | +1   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/vm/witness_db.rs                        | 99    | +10  |
+-------------------------------------------------------+-------+------+

@github-actions
Copy link

🤖 Claude Code Review

Now I have a comprehensive view of the PR. Let me compile the review.


PR Review: Thread &dyn Crypto through keccak call sites in block validation

Summary

This PR correctly addresses a significant zkVM performance gap by routing keccak hashing through the Crypto trait. The approach is sound — NativeCrypto is a ZST that gets devirtualized by the compiler, so there is zero overhead for non-guest callers. The changes are mechanically consistent and the test coverage is maintained. Overall this is a good refactor.


Issues

1. GuestProgramStateWrapper: redundant crypto parameter across interface boundary

witness_db.rs stores Arc<dyn Crypto + Send + Sync> in the struct to satisfy the VmDatabase trait, but the non-trait methods (apply_account_updates, state_trie_root, get_first_invalid_block_hash, initialize_block_header_hashes) still accept &dyn Crypto as a separate parameter and ignore self.crypto:

// witness_db.rs:32-38
pub fn apply_account_updates(
    &mut self,
    account_updates: &[AccountUpdate],
    crypto: &dyn Crypto,  // redundant – self.crypto is already set
) -> Result<(), GuestProgramStateError> {
    self.lock_mutex()?.apply_account_updates(account_updates, crypto)
}

All call sites in execution.rs pass the same crypto anyway. These methods should use self.crypto.as_ref() directly, just like the VmDatabase impl methods already do. The current split is internally inconsistent and makes the API harder to use safely.


2. PartialEq for NodeRef hardcodes NativeCrypto

// crates/common/trie/node.rs:352-355
impl PartialEq for NodeRef {
    fn eq(&self, other: &Self) -> bool {
        let mut buf = Vec::new();
        self.compute_hash_no_alloc(&mut buf, &NativeCrypto)
            == other.compute_hash_no_alloc(&mut buf, &NativeCrypto)
    }
}

Equality comparisons on trie nodes will never use the zkVM accelerator. In a hot path (e.g., trie diff/merge logic), this is a missed optimization. Since PartialEq cannot take an extra crypto parameter, a pragmatic fix is acceptable, but it should at least be documented with a comment explaining why NativeCrypto is used here.


3. RLPEncode for BranchNode/ExtensionNode hardcodes NativeCrypto

// crates/common/trie/rlp.rs:19-28
impl RLPEncode for BranchNode {
    fn encode(&self, buf: &mut dyn bytes::BufMut) {
        let payload_len = self.choices.iter().fold(value_len, |acc, child| {
            acc + RLPEncode::length(child.compute_hash_ref(&NativeCrypto))  // hardcoded
        });

The PR author notes this is "safe due to memoization" — correct, because NodeRef::commit processes children before encoding the parent, so the OnceLock hashes are already set by the time RLPEncode calls compute_hash_ref. However, if encode() is ever called on an uncommitted node outside of the commit path (e.g., in tests, debug output, or new callers), NativeCrypto will be used and the hash will be memoized with the native implementation. This blocks the zkVM accelerator for that node permanently within its lifetime. A comment should document this invariant.


4. hash_no_commit called purely for side effects

// crates/common/types/block_execution_witness.rs:194-196
state_trie.hash_no_commit(crypto);  // return value discarded
// ...
storage_trie.hash_no_commit(crypto);  // return value discarded

The intent is to trigger hash memoization across the trie before use. This is an anti-pattern — a function named hash_* is being called to mutate internal state rather than get a hash. A #[must_use] attribute on hash_no_commit would make this a compile warning. Either add let _ = trie.hash_no_commit(crypto) to signal intent, or introduce a dedicated memoize_hashes(crypto) method on Trie.


5. TryFrom<ExecutionWitness> is removed without a deprecation path

The TryFrom<ExecutionWitness> for GuestProgramState impl is replaced by GuestProgramState::from_witness(value, crypto). This is a breaking change to a public API. Since TryFrom can't carry extra parameters, the change is architecturally correct, but it silently breaks any external code using .try_into() on ExecutionWitness. Consider adding a #[deprecated] shim or a note in the changelog.


6. Minor: double finalize in error paths uses NativeCrypto

Error formatting calls like:

// crates/common/trie/node/branch.rs:425-427
child_ref.compute_hash(&NativeCrypto).finalize(&NativeCrypto),
self.compute_hash(&NativeCrypto).finalize(&NativeCrypto),

These are only hit when a trie inconsistency is detected and never in the normal execution path. The NativeCrypto usage here is fine since it's only for diagnostic error messages. No issue.


Correctness Assessment

  • Hash computation correctness is preserved: NativeCrypto and zkVM crypto must produce identical hash values (they implement the same algorithm), so the OnceLock memoization is safe regardless of which implementation runs first.
  • The BlockHeader::hash() OnceLock shortcut (always using NativeCrypto) is consistent with the guest program calling compute_block_hash(crypto) directly — the two results are equal, and memoization won't cause stale incorrect values.
  • The bloom_from_logs address/topic byte conversion (keccak_hash(log.address)crypto.keccak256(log.address.as_bytes())) is semantically identical — both pass the same underlying bytes.

Verdict

The core goal is achieved correctly. The main actionable items are Items 1 and 3 (API consistency and the RLP encoding invariant documentation). Item 2 (PartialEq) and Item 4 (side-effect pattern) are lower priority but worth addressing for maintainability.


Automated review by Claude (Anthropic) · custom prompt

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Threads &dyn Crypto through trie hashing and block/receipt root computation so zkVM targets can override keccak (avoiding native tiny_keccak fallbacks) while keeping native callers on NativeCrypto.

Changes:

  • Adds &dyn Crypto parameters across trie hashing APIs (NodeHash/Node/Trie) and block validation/root helpers.
  • Updates guest execution to carry Arc<dyn Crypto + Send + Sync> through witness/state initialization and validation.
  • Adjusts native callers (RPC/P2P/storage/CLI/LEVM) to pass &NativeCrypto at updated call sites.

Reviewed changes

Copilot reviewed 41 out of 41 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
crates/vm/witness_db.rs Store crypto in wrapper; pass into DB calls
crates/vm/levm/src/utils.rs Use VM crypto for Code::from_bytecode
crates/vm/levm/src/execution_handlers.rs Use VM crypto for contract code hashing
crates/vm/levm/runner/src/main.rs Runner uses NativeCrypto for code hashing
crates/vm/levm/runner/src/input.rs Runner input uses NativeCrypto for code hashing
crates/vm/backends/levm/mod.rs Native backend uses NativeCrypto for code hashing
crates/storage/store.rs Trie hashing now passes NativeCrypto
crates/networking/rpc/types/receipt.rs RPC bloom uses NativeCrypto
crates/networking/rpc/types/payload.rs RPC payload roots use NativeCrypto
crates/networking/rpc/eth/block.rs Receipt encoding uses NativeCrypto
crates/networking/p2p/sync/healing/storage.rs Trie node hash checks use NativeCrypto
crates/networking/p2p/sync/healing/state.rs Trie node hash checks use NativeCrypto
crates/networking/p2p/snap/client.rs Snap node hash checks use NativeCrypto
crates/networking/p2p/peer_handler.rs Block body validation accepts crypto param
crates/l2/utils/state_reconstruct.rs State root computation passes NativeCrypto
crates/l2/sequencer/l1_committer.rs State root computation passes NativeCrypto
crates/guest-program/src/l2/program.rs Thread crypto arc into execution entry
crates/guest-program/src/l1/program.rs Thread crypto arc into execution entry
crates/guest-program/src/common/execution.rs execute_blocks now carries crypto arc
crates/common/validation.rs Receipts root validation takes crypto param
crates/common/types/transaction.rs Update tests to pass NativeCrypto
crates/common/types/receipt.rs Bloom/receipt encoding threaded with crypto
crates/common/types/genesis.rs Genesis roots computed with NativeCrypto
crates/common/types/block_execution_witness.rs Witness→state init + hashing now takes crypto
crates/common/types/block.rs Roots + header hash compute now takes crypto
crates/common/types/account.rs Code/storage hashing now takes crypto
crates/common/trie/verify_range.rs Trie hash calls now pass NativeCrypto
crates/common/trie/trie_sorted.rs Trie-sorted hashing passes NativeCrypto
crates/common/trie/trie.rs Trie commit/hash APIs require crypto param
crates/common/trie/rlp.rs RLP encoding consults hashes via NativeCrypto
crates/common/trie/node_hash.rs NodeHash hashing/finalize now takes crypto
crates/common/trie/node/leaf.rs Leaf hashing now takes crypto (tests updated)
crates/common/trie/node/extension.rs Extension hashing now takes crypto (tests updated)
crates/common/trie/node/branch.rs Branch hashing now takes crypto (tests updated)
crates/common/trie/node.rs NodeRef/Node hashing & commit now takes crypto
crates/common/trie/logger.rs Logger hashes via NativeCrypto
crates/common/trie/db.rs DB embedding commit now passes NativeCrypto
crates/common/common.rs Re-export NativeCrypto from common crate
crates/blockchain/payload.rs Payload roots/bloom computed with NativeCrypto
crates/blockchain/blockchain.rs Receipt root validation + trie hashing uses NativeCrypto
cmd/ethrex/cli.rs CLI import validation passes NativeCrypto

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +184 to +186
pub fn hash(&mut self, crypto: &dyn Crypto) -> Result<H256, TrieError> {
self.commit(crypto)?;
Ok(self.hash_no_commit(crypto))
Comment on lines 32 to +49
pub fn apply_account_updates(
&mut self,
account_updates: &[AccountUpdate],
crypto: &dyn Crypto,
) -> Result<(), GuestProgramStateError> {
self.lock_mutex()?.apply_account_updates(account_updates)
self.lock_mutex()?
.apply_account_updates(account_updates, crypto)
}

pub fn state_trie_root(&self) -> Result<H256, GuestProgramStateError> {
self.lock_mutex()?.state_trie_root()
pub fn state_trie_root(&self, crypto: &dyn Crypto) -> Result<H256, GuestProgramStateError> {
self.lock_mutex()?.state_trie_root(crypto)
}

pub fn get_first_invalid_block_hash(&self) -> Result<Option<u64>, GuestProgramStateError> {
self.lock_mutex()?.get_first_invalid_block_hash()
pub fn get_first_invalid_block_hash(
&self,
crypto: &dyn Crypto,
) -> Result<Option<u64>, GuestProgramStateError> {
self.lock_mutex()?.get_first_invalid_block_hash(crypto)
@greptile-apps
Copy link

greptile-apps bot commented Mar 13, 2026

Greptile Summary

This PR addresses a significant zkVM performance regression: only ~2 of ~86 keccak call sites were routed through the Crypto trait, causing ZisK's keccak circuit to be used far less than expected (~1 k vs ~28 k precompile ops) and resulting in a 3.3× RISC-V cycle penalty. The fix threads &dyn Crypto / Arc<dyn Crypto + Send + Sync> bottom-up through all trie, block-validation, account-hashing, and guest-execution call sites. Non-guest callers (blockchain, storage, P2P, RPC, CLI) are updated to pass &NativeCrypto, a ZST that the compiler devirtualizes to zero overhead.

Key changes and points to note:

  • NodeHash::from_encoded / finalize now take &dyn Crypto — this is the lowest-level keccak call for trie hashing, so routing it through the trait correctly captures the bulk of the keccak operations.
  • GuestProgramState::TryFrom removed — replaced with GuestProgramState::from_witness(value, crypto). The only call site (execution.rs) is updated correctly.
  • rlp.rs / trie_sorted.rs / verify_range.rs still use NativeCrypto directly — acknowledged intentionally; the RLP encoding path is safe because children are always memoized by the time encode() is invoked from commit(). An implicit invariant worth documenting.
  • NodeRef::get_node_checked still uses NativeCrypto — the hash-validation step when loading a DB-backed node is not threaded. In the zkVM path with a complete witness this code is never reached, but it is a residual missed call site worth noting.
  • GuestProgramStateWrapper stores self.crypto for VmDatabase methods but exposes separate crypto: &dyn Crypto parameters on four other methods — in practice all callers (execution.rs) pass the same Arc, making this redundant rather than incorrect, but it creates an inconsistent API surface.

Confidence Score: 4/5

  • This PR is safe to merge — all behavioral changes are additive and every keccak call still produces correct Keccak-256 output regardless of which crypto path is taken.
  • The change is mechanically systematic (threading a parameter through call chains) and the 112 existing tests pass. The main correctness risk — that NativeCrypto and the injected zkVM crypto produce different hash values — cannot occur because they both implement standard Keccak-256. Residual NativeCrypto usages in rlp.rs and node.rs are either in acknowledged native-only paths or protected by memoization. The two issues flagged (get_node_checked NativeCrypto, GuestProgramStateWrapper API redundancy) are performance/design concerns rather than correctness bugs. Score is 4 rather than 5 because the implicit memoization invariant in rlp.rs is undocumented and could be violated by future callers of RLPEncode for BranchNode/ExtensionNode.
  • Pay close attention to crates/common/trie/node.rs (get_node_checked NativeCrypto usage) and crates/vm/witness_db.rs (redundant dual-crypto API design).

Important Files Changed

Filename Overview
crates/guest-program/src/common/execution.rs Core guest execution entry point — threads Arc<dyn Crypto + Send + Sync> through the full execution pipeline; all crypto calls consistently use the same Arc instance.
crates/vm/witness_db.rs GuestProgramStateWrapper stores self.crypto (Arc) for VmDatabase trait methods, but also exposes separate explicit crypto: &dyn Crypto params on apply_account_updates, state_trie_root, get_first_invalid_block_hash, and initialize_block_header_hashes — creating a dual-path design that is redundant but consistent in practice.
crates/common/trie/node.rs Most methods correctly thread &dyn Crypto, but get_node_checked (hash-verification on DB load) still hardcodes NativeCrypto, caching the hash without going through the injected circuit accelerator.
crates/common/trie/rlp.rs BranchNode and ExtensionNode RLP encoding hardcodes NativeCrypto for child hash refs. Safe in the normal commit() path (children already memoized), but would bypass the injected crypto if encode() is called before commit().
crates/common/types/block_execution_witness.rs Converts from TryFrom impl to GuestProgramState::from_witness(value, crypto); all hash operations (address hashing, code hashing, storage trie hashing) correctly use the injected crypto.
crates/common/trie/node_hash.rs from_encoded and finalize now take &dyn Crypto; correctly delegates all keccak operations to the injected trait.
crates/common/trie/trie.rs hash, hash_no_commit, commit, collect_changes_since_last_hash, and compute_hash_from_unsorted_iter all now thread &dyn Crypto; residual NativeCrypto uses are in error-message paths where the hash is already cached.
crates/common/types/block.rs compute_transactions_root, compute_receipts_root, compute_withdrawals_root, validate_block_body, and BlockHeader::compute_block_hash now accept &dyn Crypto; BlockHeader::hash() still memoizes with NativeCrypto (intentional safe path).
crates/blockchain/blockchain.rs All trie commit/hash calls updated to pass &NativeCrypto consistently; no zkVM usage here so NativeCrypto is correct.

Sequence Diagram

sequenceDiagram
    participant EP as execute_blocks<br/>(guest-program)
    participant GW as GuestProgramStateWrapper
    participant GPS as GuestProgramState
    participant Trie as Trie / NodeRef
    participant NH as NodeHash
    participant C as &dyn Crypto<br/>(zkVM circuit)

    EP->>GPS: from_witness(witness, crypto)
    GPS->>Trie: hash_no_commit(crypto)
    Trie->>Trie: memoize_hashes(buf, crypto)
    Trie->>NH: from_encoded(buf, crypto)
    NH->>C: keccak256(encoded_node)
    C-->>NH: hash[32]
    NH-->>Trie: NodeHash (cached in OnceLock)

    EP->>GW: new(state, crypto.clone())
    EP->>GW: initialize_block_header_hashes(blocks, crypto)
    EP->>GW: validate_block_body(..., crypto)
    EP->>GW: apply_account_updates(updates, crypto)
    GW->>GPS: apply_account_updates(updates, crypto)
    GPS->>C: keccak256(address) — hash_address
    GPS->>C: keccak256(key) — hash_key
    GPS->>Trie: hash_no_commit(crypto)
    Trie->>C: keccak256(encoded_node)

    EP->>GW: state_trie_root(crypto)
    GW->>GPS: state_trie_root(crypto)
    GPS->>Trie: hash_no_commit(crypto)
    Trie-->>EP: H256 (final state root)
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/common/trie/node.rs
Line: 87-88

Comment:
**`get_node_checked` still uses `NativeCrypto` — missed keccak call site**

`get_node_checked` verifies a loaded node's hash with `NativeCrypto` before returning it. Two consequences:

1. **Performance**: if this branch is hit during zkVM execution (e.g. an incomplete witness that needs a DB lookup), the keccak is computed outside the zkVM's circuit accelerator, defeating the purpose of this PR for that node.
2. **Cache poisoning**: `node.compute_hash(&NativeCrypto)` calls `memoize_hashes` internally and writes the result into the node's `OnceLock`. If the same `NodeRef::Node` is later `commit()`-ed with the injected crypto, the `OnceLock` is already set (from the `NativeCrypto` path) and the injected crypto is never used for that node.

Both crypto implementations produce the same Keccak-256 output, so there is **no correctness bug** in normal operation. But if a properly-formed witness is incomplete (any hash-referenced node must be fetched), the zkVM won't count those hashes against its circuit, under-reporting keccak usage — the same root cause the PR was written to fix.

Consider threading `crypto: &dyn Crypto` into `get_node_checked` (mirroring the treatment of `commit` / `compute_hash_no_alloc`), or adding a comment explaining why native keccak is intentional here (e.g. "only reachable outside the guest-program path").

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/vm/witness_db.rs
Line: 32-68

Comment:
**Redundant `crypto` parameters alongside stored `self.crypto`**

`GuestProgramStateWrapper` stores `self.crypto: Arc<dyn Crypto + Send + Sync>` so that `VmDatabase` trait methods (`get_account_state`, `get_block_hash`, `get_storage_slot`) can use it without receiving it as a parameter. But the four wrapper-specific methods below also accept an explicit `crypto: &dyn Crypto` parameter and just forward it straight through:

```rust
pub fn apply_account_updates(&mut self, ..., crypto: &dyn Crypto) { ... }
pub fn state_trie_root(&self, crypto: &dyn Crypto) { ... }
pub fn get_first_invalid_block_hash(&self, crypto: &dyn Crypto) { ... }
pub fn initialize_block_header_hashes(&self, ..., crypto: &dyn Crypto) { ... }
```

In `execution.rs` every caller passes `crypto.as_ref()` from the same `Arc` that was stored as `self.crypto`, so there is no functional difference. However, the split API creates two risks:

1. A future caller could pass a **different** `crypto` to `apply_account_updates` than the one stored in `self.crypto`. The `account_hashes_by_address` cache inside `GuestProgramState` would then contain entries keyed by one crypto's hash but later queried via the `VmDatabase` path (which always uses `self.crypto`). Since all correct Keccak-256 implementations produce the same output this does not cause a hash mismatch today, but it is a fragile invariant.
2. The call sites in `execution.rs` are harder to read — the reader has to verify that `crypto.as_ref()` is indeed the same object as `self.crypto`.

Consider replacing the explicit parameters with `self.crypto.as_ref()` inside the wrapper methods, removing the ambiguity entirely.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/common/trie/rlp.rs
Line: 18-67

Comment:
**`NativeCrypto` in RLP encoding is safe in the `commit()` path but undocumented edge-case risk**

The PR description correctly notes this is "safe due to memoization": by the time `BranchNode::encode` / `ExtensionNode::encode` run inside `NodeRef::commit()`, all children have already been committed recursively and their hashes are cached in their `OnceLock` fields. The `compute_hash_ref(&NativeCrypto)` calls therefore return cached values and never actually invoke keccak.

However, the same `RLPEncode` impl is also called via `Node::encode_to_vec()` (used directly from `put_batch_no_alloc` in `db.rs`), and potentially from other non-`commit` paths. If `encode()` is invoked on a branch or extension node whose children have **not** yet been committed/memoized, `NativeCrypto::keccak256` is called, the result is stored in the child's `OnceLock`, and any subsequent `commit(injected_crypto)` will find the lock already set and skip the injected crypto for that child.

A defensive guard and/or a `// Safety: children are always memoized before this method is called from commit()` comment would make the invariant explicit and prevent accidental misuse.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: c4b8503

Comment on lines +87 to +88
Ok(node) => (node.compute_hash(&NativeCrypto) == *hash)
.then_some(Ok(Arc::new(node))),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_node_checked still uses NativeCrypto — missed keccak call site

get_node_checked verifies a loaded node's hash with NativeCrypto before returning it. Two consequences:

  1. Performance: if this branch is hit during zkVM execution (e.g. an incomplete witness that needs a DB lookup), the keccak is computed outside the zkVM's circuit accelerator, defeating the purpose of this PR for that node.
  2. Cache poisoning: node.compute_hash(&NativeCrypto) calls memoize_hashes internally and writes the result into the node's OnceLock. If the same NodeRef::Node is later commit()-ed with the injected crypto, the OnceLock is already set (from the NativeCrypto path) and the injected crypto is never used for that node.

Both crypto implementations produce the same Keccak-256 output, so there is no correctness bug in normal operation. But if a properly-formed witness is incomplete (any hash-referenced node must be fetched), the zkVM won't count those hashes against its circuit, under-reporting keccak usage — the same root cause the PR was written to fix.

Consider threading crypto: &dyn Crypto into get_node_checked (mirroring the treatment of commit / compute_hash_no_alloc), or adding a comment explaining why native keccak is intentional here (e.g. "only reachable outside the guest-program path").

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/common/trie/node.rs
Line: 87-88

Comment:
**`get_node_checked` still uses `NativeCrypto` — missed keccak call site**

`get_node_checked` verifies a loaded node's hash with `NativeCrypto` before returning it. Two consequences:

1. **Performance**: if this branch is hit during zkVM execution (e.g. an incomplete witness that needs a DB lookup), the keccak is computed outside the zkVM's circuit accelerator, defeating the purpose of this PR for that node.
2. **Cache poisoning**: `node.compute_hash(&NativeCrypto)` calls `memoize_hashes` internally and writes the result into the node's `OnceLock`. If the same `NodeRef::Node` is later `commit()`-ed with the injected crypto, the `OnceLock` is already set (from the `NativeCrypto` path) and the injected crypto is never used for that node.

Both crypto implementations produce the same Keccak-256 output, so there is **no correctness bug** in normal operation. But if a properly-formed witness is incomplete (any hash-referenced node must be fetched), the zkVM won't count those hashes against its circuit, under-reporting keccak usage — the same root cause the PR was written to fix.

Consider threading `crypto: &dyn Crypto` into `get_node_checked` (mirroring the treatment of `commit` / `compute_hash_no_alloc`), or adding a comment explaining why native keccak is intentional here (e.g. "only reachable outside the guest-program path").

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 32 to 68
@@ -53,8 +61,10 @@ impl GuestProgramStateWrapper {
pub fn initialize_block_header_hashes(
&self,
blocks: &[Block],
crypto: &dyn Crypto,
) -> Result<(), GuestProgramStateError> {
self.lock_mutex()?.initialize_block_header_hashes(blocks)
self.lock_mutex()?
.initialize_block_header_hashes(blocks, crypto)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant crypto parameters alongside stored self.crypto

GuestProgramStateWrapper stores self.crypto: Arc<dyn Crypto + Send + Sync> so that VmDatabase trait methods (get_account_state, get_block_hash, get_storage_slot) can use it without receiving it as a parameter. But the four wrapper-specific methods below also accept an explicit crypto: &dyn Crypto parameter and just forward it straight through:

pub fn apply_account_updates(&mut self, ..., crypto: &dyn Crypto) { ... }
pub fn state_trie_root(&self, crypto: &dyn Crypto) { ... }
pub fn get_first_invalid_block_hash(&self, crypto: &dyn Crypto) { ... }
pub fn initialize_block_header_hashes(&self, ..., crypto: &dyn Crypto) { ... }

In execution.rs every caller passes crypto.as_ref() from the same Arc that was stored as self.crypto, so there is no functional difference. However, the split API creates two risks:

  1. A future caller could pass a different crypto to apply_account_updates than the one stored in self.crypto. The account_hashes_by_address cache inside GuestProgramState would then contain entries keyed by one crypto's hash but later queried via the VmDatabase path (which always uses self.crypto). Since all correct Keccak-256 implementations produce the same output this does not cause a hash mismatch today, but it is a fragile invariant.
  2. The call sites in execution.rs are harder to read — the reader has to verify that crypto.as_ref() is indeed the same object as self.crypto.

Consider replacing the explicit parameters with self.crypto.as_ref() inside the wrapper methods, removing the ambiguity entirely.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/vm/witness_db.rs
Line: 32-68

Comment:
**Redundant `crypto` parameters alongside stored `self.crypto`**

`GuestProgramStateWrapper` stores `self.crypto: Arc<dyn Crypto + Send + Sync>` so that `VmDatabase` trait methods (`get_account_state`, `get_block_hash`, `get_storage_slot`) can use it without receiving it as a parameter. But the four wrapper-specific methods below also accept an explicit `crypto: &dyn Crypto` parameter and just forward it straight through:

```rust
pub fn apply_account_updates(&mut self, ..., crypto: &dyn Crypto) { ... }
pub fn state_trie_root(&self, crypto: &dyn Crypto) { ... }
pub fn get_first_invalid_block_hash(&self, crypto: &dyn Crypto) { ... }
pub fn initialize_block_header_hashes(&self, ..., crypto: &dyn Crypto) { ... }
```

In `execution.rs` every caller passes `crypto.as_ref()` from the same `Arc` that was stored as `self.crypto`, so there is no functional difference. However, the split API creates two risks:

1. A future caller could pass a **different** `crypto` to `apply_account_updates` than the one stored in `self.crypto`. The `account_hashes_by_address` cache inside `GuestProgramState` would then contain entries keyed by one crypto's hash but later queried via the `VmDatabase` path (which always uses `self.crypto`). Since all correct Keccak-256 implementations produce the same output this does not cause a hash mismatch today, but it is a fragile invariant.
2. The call sites in `execution.rs` are harder to read — the reader has to verify that `crypto.as_ref()` is indeed the same object as `self.crypto`.

Consider replacing the explicit parameters with `self.crypto.as_ref()` inside the wrapper methods, removing the ambiguity entirely.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 18 to 67
@@ -35,15 +37,15 @@ impl RLPEncode for BranchNode {
fn encode_to_vec(&self) -> Vec<u8> {
let value_len = <[u8] as RLPEncode>::length(&self.value);
let choices_len = self.choices.iter().fold(0, |acc, child| {
acc + RLPEncode::length(child.compute_hash_ref())
acc + RLPEncode::length(child.compute_hash_ref(&NativeCrypto))
});
let payload_len = choices_len + value_len;

let mut buf: Vec<u8> = Vec::with_capacity(payload_len + 3); // 3 byte prefix headroom

encode_length(payload_len, &mut buf);
for child in self.choices.iter() {
match child.compute_hash_ref() {
match child.compute_hash_ref(&NativeCrypto) {
NodeHash::Hashed(hash) => hash.0.encode(&mut buf),
NodeHash::Inline((_, 0)) => buf.push(RLP_NULL),
NodeHash::Inline((encoded, len)) => {
@@ -60,7 +62,7 @@ impl RLPEncode for BranchNode {
impl RLPEncode for ExtensionNode {
fn encode(&self, buf: &mut dyn bytes::BufMut) {
let mut encoder = Encoder::new(buf).encode_bytes(&self.prefix.encode_compact());
encoder = self.child.compute_hash().encode(encoder);
encoder = self.child.compute_hash(&NativeCrypto).encode(encoder);
encoder.finish();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NativeCrypto in RLP encoding is safe in the commit() path but undocumented edge-case risk

The PR description correctly notes this is "safe due to memoization": by the time BranchNode::encode / ExtensionNode::encode run inside NodeRef::commit(), all children have already been committed recursively and their hashes are cached in their OnceLock fields. The compute_hash_ref(&NativeCrypto) calls therefore return cached values and never actually invoke keccak.

However, the same RLPEncode impl is also called via Node::encode_to_vec() (used directly from put_batch_no_alloc in db.rs), and potentially from other non-commit paths. If encode() is invoked on a branch or extension node whose children have not yet been committed/memoized, NativeCrypto::keccak256 is called, the result is stored in the child's OnceLock, and any subsequent commit(injected_crypto) will find the lock already set and skip the injected crypto for that child.

A defensive guard and/or a // Safety: children are always memoized before this method is called from commit() comment would make the invariant explicit and prevent accidental misuse.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/common/trie/rlp.rs
Line: 18-67

Comment:
**`NativeCrypto` in RLP encoding is safe in the `commit()` path but undocumented edge-case risk**

The PR description correctly notes this is "safe due to memoization": by the time `BranchNode::encode` / `ExtensionNode::encode` run inside `NodeRef::commit()`, all children have already been committed recursively and their hashes are cached in their `OnceLock` fields. The `compute_hash_ref(&NativeCrypto)` calls therefore return cached values and never actually invoke keccak.

However, the same `RLPEncode` impl is also called via `Node::encode_to_vec()` (used directly from `put_batch_no_alloc` in `db.rs`), and potentially from other non-`commit` paths. If `encode()` is invoked on a branch or extension node whose children have **not** yet been committed/memoized, `NativeCrypto::keccak256` is called, the result is stored in the child's `OnceLock`, and any subsequent `commit(injected_crypto)` will find the lock already set and skip the injected crypto for that child.

A defensive guard and/or a `// Safety: children are always memoized before this method is called from commit()` comment would make the invariant explicit and prevent accidental misuse.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants