Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EIP: eth/70 - Sharded Blocks Protocol #9004

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
71 changes: 71 additions & 0 deletions EIPS/eip-7801.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
---
eip: 7801
title: eth/70 - Sharded Blocks Protocol
description: Replaces block range with a bitlist representing 1-million-block spans in the handshake, with probabilistic shard retention
author: Ahmad Bitar (@smartprogrammer93) <smartprogrammer@windowslive.com>, Giulio Rebuffo (@Giulio2002)
discussions-to: https://ethereum-magicians.org/t/eip-7801-eth-70-sharded-blocks-protocol/21507
status: Draft
type: Standards Track
category: Networking
created: 2024-10-30
requires: 7642
---

## Abstract

This EIP introduces a method enabling an Ethereum node to communicate its available block spans via a bitlist, where each bit represents a 1-million-block span. Nodes use this bitlist to signal which spans of historical data they store, enabling peers to make informed decisions about data availability when requesting blocks. This aims to improve network efficiency by providing a probabilistic snapshot of data locality across nodes.

The proposal extends the Ethereum wire protocol (`eth`) with version `eth/70`, introducing a `blockBitlist` field in the handshake. Nodes probabilistically retain certain block spans to support data locality across the network.

## Motivation

With [EIP-4444](./eip-4444.md), nodes may begin pruning historical data while others continue to serve it. The current approach of connecting and requesting specific blocks to determine data availability is inefficient, consuming unnecessary bandwidth. This EIP addresses this by requiring nodes to retain at least one 1-million-block shard and retain additional shards probabilistically, enabling better-informed and targeted sync processes.
Giulio2002 marked this conversation as resolved.
Show resolved Hide resolved

By using a bitlist in place of a range, nodes provide a snapshot of data locality across historical epochs, supporting efficient data requests.

## Specification

- Advertise a new `eth` protocol capability (version) at `eth/70`.
- The existing `eth/69` protocol will continue alongside `eth/70` until sufficient adoption.
- Modify the `Status (0x00)` message for `eth/70` to add a `blockBitlist` field after the `forkid`:
- Current packet for `eth/69`: `[version: P, networkid: P, blockhash: B_32, genesis: B_32, forkid]`
- New packet for `eth/70`: `[version: P, networkid: P, blockhash: B_32, genesis: B_32, forkid, blockBitlist]`,
where `blockBitlist` is a bitlist, with each bit representing a 1-million-block span.

- Define node behavior based on the bitlist as follows:
- **Bitlist Initialization**: Upon startup, nodes **MUST** set at least one bit in the `blockBitlist` to `on` and backfill that 1-million-block span to ensure data availability for that shard.
- **Shard Retention Probability**: Nodes **SHOULD** probabilistically retain new block spans, with an `M%` probability (e.g., 5–10%) to maintain network data locality and avoid frequent pruning of new ranges.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good with some background for the probability calculation.

Could a simpler alternative to this be, "Nodes must retain the current active shard plus one/two shard(s) immediately preceding the active one".
The idea being that every shard being retained, gives a set amount of time for fresh syncing nodes to pick that range in "Bitlist Initialization". The question would then be how much time/shards would be sufficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will re-update shortly, ran some numbers and distribution is good enough

- **Active Shard Maintenance**: Nodes **MUST** retain all blocks from shards that are still actively being created to support synchronization.

Upon connecting using `eth/70`, nodes exchange the `Status` message, including the `blockBitlist`. This single handshake message, which includes the bitlist, eliminates the need for additional message types.

Nodes should maintain connections regardless of a peer’s bitlist state, except when peer slots are full. In this case, nodes may prioritize connections with peers that serve more relevant shards.

If a node is interested in querying a specific block span, it can use the bitlist to determine which peers are more likely to have the data. If the peer they are connected to does not have the data, they can disconnect and connect to another peer, until they find what they have been looking for.

### ENR variation

This EIP could be implemented without a new protocol version by adding a new field to the Ethereum Node Record (ENR) to advertise the bitlist. This would allow nodes to advertise their available block spans without requiring a handshake. However, this approach would not provide the same level of assurance as the handshake, as the ENR is not authenticated. Also, it would make the discovery process much simpler, as nodes just would need to take a look at the ENR to determine if a peer has the data they are looking for.

## Rationale

The bitlist approach provides a flexible means to represent and retain block data with support for data locality, especially as nodes probabilistically retain certain shards. This mechanism aligns with the pruning proposed in EIP-4444, while ensuring that historical data spans remain available across the network.

The bitlist approach is already used in the Consensus Layer for attestation subnets, making it a familiar and efficient method for representing data spans. Additionally, communicating shards this way is broadly used in many distributed systems, making it a natural fit for Ethereum.

## Backwards Compatibility

This EIP introduces `eth/70`, extending the handshake in a backward-incompatible manner. However, `devp2p` allows concurrent versions of the same protocol, so nodes can continue using older protocol versions (e.g., `eth/69`, `eth/68`, `eth/67`) if they lack support for `eth/70`.

This EIP does not affect the consensus engine or require a hard fork.

## Security Considerations

There are some considerations:

- The size of the network is not taken into account, so for an extended history, shard sizes may need to increase. However, this is not a major issue for Ethereum, as many nodes are active on the network at all times.
- The more blocks in the network, the more shards there will be, potentially diluting peers across multiple shards. This could make querying specific shards more challenging in the future. However, with sufficiently large shard sizes, this should not pose a significant problem. For instance, if Ethereum has 200,000,000 blocks and a shard size of 1,000,000 blocks, and a node has 32 peers, then the probability that a peer has a specific shard is around 15%, which is manageable.

## Copyright

This document is CC0-licensed; rights are waived through [CC0](../LICENSE.md).
Loading