Scroll `StateRootProvider` implementation #4

frisitano · 2024-10-01T08:53:17Z

Describe the feature

Overview

Scroll uses a binary Markle Patricia trie (BMPT) to represent it's state. This is different to the standard MPT that ethereum and reth uses. As such we will need to introduce a new implementation of the DatabaseStateRoot to support scrolls state model. This will also involve modifications to associated types and hashing schemes.

Design

There are a number of data structures, components and algorithms that are relevant in the context of the state root calculation. We will document those which require modification to support the BMPT state root calculation.

Introduction of `StateTypes`

To enable configurability of types required for modified state root calculation we will follow a similar pattern to what is used for EngineTypes, as such we will introduce a trait for StateTypes as follows:

pub trait StateTypes {
    /// The state root type.
    type StateRoot<'a, TX: DbTx + 'a>: DatabaseStateRoot<'a, TX>;
    /// The storage root type.
    type StorageRoot<'a, TX: DbTx + 'a>: DatabaseStorageRoot<'a, TX>;
    /// The state proof type.
    type StateProof<'a, TX: DbTx + 'a>: DatabaseProof<'a, TX>;
    /// The state witness type.
    type StateWitness<'a, TX: DbTx + 'a>: DatabaseTrieWitness<'a, TX>;
    /// The key hasher type.
    type KeyHasher: reth_trie::KeyHasher;
}

The StateKeyHasher trait can be defined as follows:

/// Trait for hashing state keys.
pub trait StateKeyHasher: Clone + Sync + Send + Default {
    /// Hashes the given bytes into a 256-bit hash.
    fn hash_key<T: AsRef<[u8]>>(bytes: T) -> B256;
}

As example of how this would be implemented for ethereum would be:

pub struct EthereumStateTypes;

impl StateTypes for EthereumStateTypes {
    type StateRoot = StateRoot;
    ...
    type StorageRoot = StorageRoot;
    type KeyHasher = KeccakKeyHasher;
}

Where the StateRoot and StorageRoot correspond to the types linked and KeccakKeyHasher is a type that will use the default keccak256 function from alloy.

We can then integrate this with NodeTypes by introducing a NodeTypesWithState trait and implementing it following the same pattern used for Engine types as seen here. This pattern will allow for configurability of the state root calculation by specifying the appropriate StateTypes object when configuring the node.

`HashedPostState` and `HashedStorage`

The HashedPostState and HashedStorage store the state diff associated with a block (or a number of blocks in the case of pipelining). By default both of these data structures leverage keccak256 to hash keys associated with accounts and storage slots. I propose that we introduce a generic on these data structures for key hashing.

We will make the following methods on HashedPostState and HashedStorage generic over the StateKeyHasher as follows:

HashedPostState::from_bundle_state<'a, K: StateKeyHasher>(...) -> Self
HashedPostState::from_cache_state<'a, KH: StateKeyHasher>(...) -> Self
HashedStorage::from_plain_storage<'a, KH: StateKeyHasher>(...) -> Self

Providers

The StateRootProvider and the StorageRootProvider will be made generic over the StateTypes trait. Parent types will also be impacted and appropriate changes will need to be made.

`ParallelStateRoot`

The ParallelStateRoot type leverages the StorageRoot and has an implementation of the StateRoot. The storage roots are computed in parallel and then integrated into the state root calculation. I suggest we modify the DatabaseStateRoot interface to allow for pre-computed storage roots to be passed in. As such we remove the bespoke storage root implementation in the ParallelStateRoot object and it just becomes a driver for parallel storage root and state root computation.

`BranchNodeCompact`

We should be able to reuse the BranchNodeCompact data structure to store our merkle nodes in the database. As scroll uses a binary merkle trie structure only two children would be expected in a branch node.

Stack based BMPT root algorithm

We may want to implement a stack based bmpt root algorithm that follows a similar pattern to what reth natively does for the standard Patricia Merkle trie. More research is required to understand the implications here.

The text was updated successfully, but these errors were encountered:

Thegaram · 2024-10-03T09:36:54Z

As I recall you mentioned another alternative ("deep integration with mdbx") that we might want to do in the future. Could you add some more context about that here?

clabby · 2024-10-09T15:33:14Z

This is really cool! Hope to see some of the abstractions that come out of this work land upstream for other extensions of reth to use.

Have you guys looked into using liburkel here? It could potentially be a nice alternative to MDBX when working with base2 radix tries.

frisitano · 2024-10-10T07:08:56Z

This is really cool! Hope to see some of the abstractions that come out of this work land upstream for other extensions of reth to use.

For sure, thats the plan! It would be great to get a review / feedback from potential consumers of the abstractions as development progresses. I can keep you updated if that's ok with you?

Have you guys looked into using liburkel here? It could potentially be a nice alternative to MDBX when working with base2 radix tries.

I've not looked into it but it certainly seems interesting! There's also some promising research being done with nomt (blog post), it would be good to explore potential integrations with reth and benchmark to compare performance.

frisitano · 2024-10-20T08:39:06Z

Stack-Based BMPT State Root Calculation

Our ambition is to provide an implementation of scrolls BMPT within reth. We intend to leverage as much pre-existing architecture from scrolls default MPT implementation as possible to minimize the scope of this change. Leveraging the pre-existing reth types will result in a less efficient state size however the simplicity and practicality of this approach outweigh the cons.

`HashBuilder`

The algorithm that is used to build the state root and associated TrieUpdates is located within the HashBuilder object in alloy-trie.

We need to implement an analogous object with an analogous interface that implements an algorithm that builds a BMPT as defined in the scroll zk_trie spec here. We can name this object BmptHashBuilder or something similar to represent that the object builds a binary Merkle Patricia trie.

`StateRoot`

reth/crates/trie/trie/src/trie.rs

Lines 21 to 37 in 4b1567a

    
           /// `StateRoot` is used to compute the root node of a state trie. 
        
           #[derive(Debug)] 
        
           pub struct StateRoot<T, H> { 
        
               /// The factory for trie cursors. 
        
               pub trie_cursor_factory: T, 
        
               /// The factory for hashed cursors. 
        
               pub hashed_cursor_factory: H, 
        
               /// A set of prefix sets that have changed. 
        
               pub prefix_sets: TriePrefixSets, 
        
               /// Previous intermediate state. 
        
               previous_state: Option<IntermediateStateRootState>, 
        
               /// The number of updates after which the intermediate progress should be returned. 
        
               threshold: u64, 
        
               #[cfg(feature = "metrics")] 
        
               /// State root metrics. 
        
               metrics: StateRootMetrics, 
        
           }

We can leverage the pre-existing StateRoot object which is responsible for iterating over nodes and providing them to the HashBuilder for state root calculation. In the first instance, we should create a code copy of the objects but in the future, we can consider introducing abstractions by the introduction of a generic over StateCommitmentHashBuilder trait defined below on the upstream StateRoot object.

`StorageRoot`

reth/crates/trie/trie/src/trie.rs

Lines 285 to 299 in 4b1567a

    
           /// `StorageRoot` is used to compute the root node of an account storage trie. 
        
           #[derive(Debug)] 
        
           pub struct StorageRoot<T, H> { 
        
               /// A reference to the database transaction. 
        
               pub trie_cursor_factory: T, 
        
               /// The factory for hashed cursors. 
        
               pub hashed_cursor_factory: H, 
        
               /// The hashed address of an account. 
        
               pub hashed_address: B256, 
        
               /// The set of storage slot prefixes that have changed. 
        
               pub prefix_set: PrefixSet, 
        
               /// Storage root metrics. 
        
               #[cfg(feature = "metrics")] 
        
               metrics: TrieRootMetrics, 
        
           }

We will take a similar approach as described above with the StateRoot and then in the future look to introduce the StateCommitmentHashBuilder abstraction.

`StateCommitmentHashBuilder`

The logic for constructing state commitments is encapsulated within a single object. Currently in reth it is in the HashBuilder referenced above. With the introduction of scroll's version of the BmptHashBuilder we should consider introducing a trait for this object such as the following:

pub trait StateCommitmentHashBuilder: Default {
    fn add_branch(&mut self, key: Nibbles, value: B256, children_are_in_trie: bool);
    fn add_leaf(&mut self, key: Nibbles, value: &[u8]);
    fn root(&self) -> B256;
    fn updates_len(&self) -> usize;
    fn split(self) -> (Self, StorageTrieUpdates);
}

This will allow us to make parent objects generic over this trait:

StateRoot
StorageRoot
TrieWitness
Proof

Opened as a separate issue: #24

frisitano added this to the Milestone 1 milestone Oct 1, 2024

frisitano changed the title ~~Scroll StateRootProvider~~ Scroll StateRootProvider implementation Oct 1, 2024

frisitano mentioned this issue Oct 8, 2024

Milestone 1 #16

Open

4 tasks

frisitano self-assigned this Oct 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scroll `StateRootProvider` implementation #4

Scroll `StateRootProvider` implementation #4

frisitano commented Oct 1, 2024 •

edited

Loading

Thegaram commented Oct 3, 2024

clabby commented Oct 9, 2024

frisitano commented Oct 10, 2024

frisitano commented Oct 20, 2024 •

edited

Loading

Scroll StateRootProvider implementation #4

Scroll StateRootProvider implementation #4

Comments

frisitano commented Oct 1, 2024 • edited Loading

Describe the feature

Overview

Design

Introduction of StateTypes

HashedPostState and HashedStorage

Providers

ParallelStateRoot

BranchNodeCompact

Stack based BMPT root algorithm

Thegaram commented Oct 3, 2024

clabby commented Oct 9, 2024

frisitano commented Oct 10, 2024

frisitano commented Oct 20, 2024 • edited Loading

Stack-Based BMPT State Root Calculation

HashBuilder

StateRoot

StorageRoot

StateCommitmentHashBuilder

Scroll `StateRootProvider` implementation #4

Scroll `StateRootProvider` implementation #4

frisitano commented Oct 1, 2024 •

edited

Loading

Introduction of `StateTypes`

`HashedPostState` and `HashedStorage`

`ParallelStateRoot`

`BranchNodeCompact`

frisitano commented Oct 20, 2024 •

edited

Loading

`HashBuilder`

`StateRoot`

`StorageRoot`

`StateCommitmentHashBuilder`