
Decide if we want to merklize state #647

Open
kayabaNerve opened this issue Jan 17, 2025 · 1 comment
Labels: discussion, node

Comments

@kayabaNerve (Member)

Merklizing state:

  1. Enshrines a storage layout
  2. Adds notable overhead to IO

In #330, I opened the discussion on client diversity. An enshrined storage layout requires the storage be matched exactly, even if more efficient algorithms exist. I'll give the example of how the protocol expects to calculate a median over a window, and how the runtime uses the database to do so. Instead of reading every value, removing the expired value (shifting the rest), inserting the new one, finding the median by selecting the middle entry, and saving the whole set back, we just save the storage key for the median. We then use next key/prev key (the DB's iteration functions) to move the median as necessary (a sketch follows below). It'd be better to cache that across blocks and simply do the expensive median-finding once at boot. We only don't because this solution is optimal for the box Substrate gives us. If enshrined, any and all implementations would have to use this methodology.
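For concreteness, here's a minimal sketch of that technique, with a `BTreeSet` standing in for the database (its `range` iteration plays the role of the DB's next key/prev key functions). All names are illustrative, not Serai's actual code:

```rust
use std::collections::BTreeSet;

struct MovingMedian {
  // Stand-in for the on-disk sorted key-value store. Values are assumed
  // distinct; a real layout would suffix keys with a nonce to break ties.
  window: BTreeSet<u64>,
  // The "storage key" of the current median, persisted across updates so the
  // median is never recomputed from scratch.
  median: u64,
}

impl MovingMedian {
  fn new(values: impl IntoIterator<Item = u64>) -> Self {
    let window: BTreeSet<u64> = values.into_iter().collect();
    assert_eq!(window.len() % 2, 1, "window size must be odd");
    // The expensive median selection happens once, then is cached.
    let median = *window.iter().nth(window.len() / 2).unwrap();
    MovingMedian { window, median }
  }

  // Replace the expired value (assumed present in the window) with the new
  // one, moving the cached median by at most one key instead of re-deriving
  // it from the full window.
  fn update(&mut self, removed: u64, inserted: u64) {
    if removed == inserted {
      return;
    }
    let old_median = self.median;
    self.window.remove(&removed);
    self.window.insert(inserted);
    // Equivalents of the DB's next key/prev key calls around the old median.
    let next = self.window.range((old_median + 1)..).next().copied();
    let prev = self.window.range(..old_median).next_back().copied();
    self.median = match (removed <= old_median, inserted < old_median) {
      // Removal and insertion on the same side of the median: it stays put.
      (true, true) | (false, false) if removed != old_median => old_median,
      // Removed at/below the median, inserted above: step to the next key.
      (true, false) => next.unwrap(),
      // Removed above the median, inserted below: step to the previous key.
      (false, true) => prev.unwrap(),
      // The median itself was evicted and the new value falls below it.
      _ => prev.unwrap(),
    };
  }
}
```

The point being: only the median's key is ever written back. If this exact layout were enshrined, an implementation using any other representation (such as a cached in-memory window) couldn't reproduce the storage root.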

In #379, I express the intent to only merklize events due to this.

It should be noted, however, that Substrate offers "warp sync", where the latest finalized state is downloaded but the transitions aren't validated. This introduces a supermajority trust assumption, but users:

  1. Can check they're on the same chain as everyone else, which presumably was validated by everyone else
  2. Validate all future state transitions

So long as at least one node was honest, and so long as a node which rejects a finalized block raises the alarm in time for future initial syncers to be aware, this is fine. This requires state be merklized, however.

We can define events as having sufficient data to rebuild the state, but this would require careful planning and that we build a custom state sync for warp sync. Each event would need to contain an account's new balance, not the amount transferred, and we'd still need to merklize the last time an account was updated. This also handwaves the entire state as solely balances.
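As a sketch of what such an event might look like, under the handwaved assumption that state is solely balances (all types here are hypothetical, not Serai's):

```rust
// Hypothetical identifiers; Serai's actual types differ.
type Account = [u8; 32];
type Balance = u64;

// An event carrying sufficient data to rebuild state: the resulting
// balances, not the amount transferred. With absolute balances, the latest
// event per account suffices to reconstruct state; with deltas, a syncer
// would need the entire history and a known starting state.
enum Event {
  Transfer {
    from: Account,
    to: Account,
    from_balance: Balance,
    to_balance: Balance,
  },
}
```

Even then, a fresh syncer needs assurance it holds the latest event per account, which is exactly why the last time an account was updated would still need merklization.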

Warp sync does greatly reduce the time to sync, but part of why a full sync takes so long is state merklization itself, and once synced, a node still has to pay the overhead of merklization on every block. Without merklized state, we'd have a larger initial sync than warp sync, yet a shorter initial sync than a full sync with state merklization, and we'd achieve greater performance while running the node.

There are 'next-generation' Merkle DBs though. A great blog post on the techniques is https://sovereign.mirror.xyz/jfx_cJ_15saejG9ZuQWjnGnG-NfahbazQH98i1J3NN8. #385 considered using Avalanche's Firewood, resolving it as not eligible due to it not being FOSS. We don't have an issue for Nomad, which isn't published yet, nor for NOMT, though there's already commentary on NOMT re: Substrate.

One concern with adopting such a DB is that they'll have distinct wire formats. Modifying them to be compatible may be infeasible. Reimplementing their wire format, but without all of their optimizations, would achieve diversity only by forfeiting the performance gains.

Please note we have already discussed diversity of the current schema: #385 (Firewood, mentioned for diversity, not to solve performance constraints), #403 (reth's DB, not because it's so hyper-optimized, but because it's presumably well reviewed and rock solid), and #386 (parity-db). parity-db is notable as Substrate ships with both RocksDB and parity-db backends, so it's already diverse without us needing to do more work.

There's also relevance in this discussion to paritytech/polkadot-sdk#278. If we remove merklization, except for events, we can do the merklization for events in memory and not worry about the cleanup cost, as events would never so entangle the DB in the first place (a sketch follows).
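A sketch of that in-memory merklization, building a binary tree over the block's serialized events and committing solely to the root (SHA-256 via the `sha2` crate stands in for whatever hash the protocol would actually use):

```rust
use sha2::{Digest, Sha256};

// Commit to a block's events without touching the database: hash each
// serialized event into a leaf, then fold pairs upward until one root
// remains. Odd layers duplicate their last node; a real design would
// domain-separate leaves from branches, but this suffices as a sketch.
fn events_commitment(events: &[Vec<u8>]) -> [u8; 32] {
  if events.is_empty() {
    return [0; 32];
  }
  let mut layer: Vec<[u8; 32]> =
    events.iter().map(|event| Sha256::digest(event).into()).collect();
  while layer.len() > 1 {
    if layer.len() % 2 == 1 {
      let last = *layer.last().unwrap();
      layer.push(last);
    }
    layer = layer
      .chunks(2)
      .map(|pair| {
        let mut hasher = Sha256::new();
        hasher.update(pair[0]);
        hasher.update(pair[1]);
        hasher.finalize().into()
      })
      .collect();
  }
  layer[0]
}
```

Since the tree lives and dies with the block, there's no persistent trie entangled with the DB and accordingly no cleanup cost.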

Finally, I'll note merklized state exists for light clients. Warp sync can be viewed as the light-client case where you download every entry without actually incurring the transitions. It's also necessary for Polkadot due to their parachain design. Serai isn't a parachain/L2, and should have minimal light-client use cases? We don't offer a VM to do SPV bridging with? So long as the next set of validators is in an event, following consensus should be possible? Verifying transfers were made, if that's ever desired, is still possible with just an event stream? It's current account balances which would be inaccessible (unless we also include those in the event).

I lean towards only publishing a commitment to the events, making Serai ineligible for warp sync. I don't want to commit to an entire storage (and tree) schema at this time. The loss of warp sync is unfortunate but acceptable. We can also look at restoring it by having validators sign state roots ad hoc, for as long as a supermajority of validators run a merklized DB even though the protocol doesn't require it, and publishing those signatures out-of-band of consensus itself (a sketch follows). This scheme works even without constantly merklized state, so long as regular exports are made and signed accordingly.
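As a sketch of that out-of-band scheme (all types hypothetical):

```rust
// A validator's voluntary attestation that its (non-enshrined) merklized DB,
// or a regular signed export, reached `state_root` as of `block_hash`.
// Published out-of-band of consensus; a syncing node collecting attestations
// from a supermajority of validators could warp to that state without the
// protocol ever committing to a storage schema.
struct StateRootAttestation {
  block_hash: [u8; 32],
  state_root: [u8; 32],
  // Signature over (block_hash, state_root) under the validator's key,
  // 64 bytes assuming a Schnorr-style scheme.
  signature: [u8; 64],
}
```

The trust assumption matches warp sync's: a supermajority of the signing validators must be honest.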

@kayabaNerve (Member, Author)

Please note a decision not to merklize will only remove it from the protocol, not from the implementation, initially. I'm not accepting the scope of the latter at this time.
