Refactor deneb networking #4561

realbigsean · 2023-08-01T23:03:55Z

Issue Addressed

The goals here are:

Reduce code duplication between blocks and blobs - the logic here is very complicated so IMO reducing duplication is worth adding some generics/type complexity
Increase type safety - in the first iteration of single lookups I ran into bugs accessing/mutating the wrong lookup state (block or blob) for the given response. This PR makes it a lot easier to avoid those types of mistakes. It also may lay the groundwork for giving us type safety across RPC requests and responses.

Implementation

Made single block and blob lookups generic by introducing a RequestState trait. This trait has an associated type ResponseType that gives us some degree of type safety when deciding what state in the lookup to mutate for a given response.
I tried to consolidate some of the error handling, to make it easier to follow exactly where we drop requests, log, and peer score.
Updated RequestId, this is the new one:

pub enum RequestId {
    SingleBlock { id: SingleLookupReqId },
...
}

pub struct SingleLookupReqId {
    pub id: Id,
    pub req_counter: Id,
}
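For illustration, here is a minimal sketch of how a trait with an associated ResponseType can tie a response to the state it is allowed to mutate. Only the names RequestState and ResponseType come from this PR; Block, Blob, ComponentState, Lookup, BlockRequest, BlobRequest, state_mut and on_response are hypothetical stand-ins and do not reflect the actual Lighthouse types.

```rust
/// Hypothetical stand-ins for the real block and blob response types.
#[derive(Debug, Clone, PartialEq)]
pub struct Block(pub u64);
#[derive(Debug, Clone, PartialEq)]
pub struct Blob(pub u64);

/// Hypothetical per-component download state.
#[derive(Debug)]
pub struct ComponentState<T> {
    pub received: Vec<T>,
}

impl<T> Default for ComponentState<T> {
    fn default() -> Self {
        Self { received: Vec::new() }
    }
}

/// Hypothetical lookup holding independent state for blocks and blobs.
#[derive(Debug, Default)]
pub struct Lookup {
    pub block_state: ComponentState<Block>,
    pub blob_state: ComponentState<Blob>,
}

/// Sketch of a trait in the spirit of `RequestState`: the associated
/// `ResponseType` determines which part of the lookup may be mutated.
pub trait RequestState {
    type ResponseType;

    /// Select the state that matches this request kind.
    fn state_mut(lookup: &mut Lookup) -> &mut ComponentState<Self::ResponseType>;
}

/// Marker types, one per request kind (assumed names).
pub struct BlockRequest;
pub struct BlobRequest;

impl RequestState for BlockRequest {
    type ResponseType = Block;
    fn state_mut(lookup: &mut Lookup) -> &mut ComponentState<Block> {
        &mut lookup.block_state
    }
}

impl RequestState for BlobRequest {
    type ResponseType = Blob;
    fn state_mut(lookup: &mut Lookup) -> &mut ComponentState<Blob> {
        &mut lookup.blob_state
    }
}

/// Generic handler: the response type and the mutated state always agree,
/// because both are derived from the same `R`.
pub fn on_response<R: RequestState>(lookup: &mut Lookup, response: R::ResponseType) {
    R::state_mut(lookup).received.push(response);
}
```

With this shape, a call such as on_response::<BlockRequest>(&mut lookup, Blob(7)) is rejected by the compiler, which is the kind of block/blob mix-up the generic design is meant to rule out.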

A pain point I ran into previously with blobs was around keying lookups based on an Id, which is generated per-request. Since a single lookup now really contains two requests, each with independent retry logic, keying by a single Id that either request could update didn't really make sense. I found it difficult to work with and reason about.

In this PR I decided to make an Id map to an entire lookup, so it is shared across block and blobs and unchanged for the life of the lookup. This led me to understand why updating the request Id on retry was previously implemented. When we are verifying a response, we don't wait for the stream terminator to peer score and retry, so it's possible we retry the request while we are still downloading responses for the old request. This could cause us to interpret the old response as the new one if we don't update the Id or have some other way of differentiating old and new. This is why I added req_counter to SingleLookupReqId. The lookup remains keyed solely on the Id, which is shared between blocks and blobs, but the req_counter can be tracked separately for blocks and blobs and used to filter out old responses.
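The stale-response filtering described above could look roughly like the sketch below. SingleLookupReqId and its fields are taken from the PR description; the alias Id = u32, the ComponentRequest tracker, and the next_request and is_current methods are assumptions made for illustration only.

```rust
/// Assumed alias; the PR text does not show the concrete type of `Id`.
pub type Id = u32;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SingleLookupReqId {
    /// Identifies the whole lookup; never changes for the life of the lookup.
    pub id: Id,
    /// Incremented on every (re)request of one component.
    pub req_counter: Id,
}

/// Hypothetical per-component tracker; a lookup would hold one for the
/// block request and one for the blob request.
#[derive(Debug, Default)]
pub struct ComponentRequest {
    req_counter: Id,
}

impl ComponentRequest {
    /// Start a request or a retry: bump the counter and build the id that
    /// is sent along with the RPC request.
    pub fn next_request(&mut self, lookup_id: Id) -> SingleLookupReqId {
        self.req_counter += 1;
        SingleLookupReqId {
            id: lookup_id,
            req_counter: self.req_counter,
        }
    }

    /// A response is only accepted if it belongs to the latest request;
    /// anything still streaming in from an earlier attempt is ignored.
    pub fn is_current(&self, req: SingleLookupReqId) -> bool {
        req.req_counter == self.req_counter
    }
}
```

Because each component owns its own counter, retrying the block request does not invalidate in-flight blob responses, while the lookup itself can still be found through the unchanged id.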

Not included, but maybe future

I went down a small rabbit hole making RequestId into a superstruct. This would let us send per-protocol types as Ids instead of the RequestId enum when we make any RPC request. In by-range requests, this then allows us to key all the maps in SyncNetworkContext by different types for example. In by-roots requests, we could add a RequestId associated type to RequestState.

AgeManning

I mainly had a look at the new trait you have implemented.

I like it! I think it's good to try and group a bunch of the logic into one place, and this seems like a very nice way to do it.

There are a bunch of changes in the deneb branch which make it hard to review a lot of the other changes to the actual sync code.

Also, from experience, with these extensive changes to sync, it's very hard to reason about all the billion edge cases in sync during a code review. I found in the past the best way to find bugs is to battle test the #$!@ out of it.

I'd propose merging and doing all kinds of funky things on devnets/testnets and we can bombard it on attacknets, when we think it's ready.

Nice work tho, looks really good to me!

jimmygchen · 2023-08-02T07:02:47Z

I haven't finished reviewing yet, but the refactoring looks really good so far! 👏

realbigsean · 2023-08-03T15:08:39Z

Thanks for the reviews guys, think I've addressed everything so far. Here's the diff:
https://github.com/sigp/lighthouse/compare/c71e011a1cc4482c2d31b0d37481e5b4b6dc2ac2..af5e0d455c3699de979729bb6746254b8e06894b

jimmygchen · 2023-08-04T04:40:13Z

beacon_node/network/src/sync/block_lookups/mod.rs

+                self.handle_verified_response::<Current, R>(
+                    seen_timestamp,
+                    cx,
+                    BlockProcessType::SingleBlock { id: lookup.id },


Does this need to be generic, or is this always a SingleBlock process type here?

This is pretty confusing, but it should always be SingleBlock. The reason is that it's used in scenarios where we're waiting for the entire block + all blobs to arrive before sending to processing.

We could potentially make this less confusing by adding a new variant to BlockProcessType that is specific to the case where we have all blocks and blobs.

jimmygchen · 2023-08-04T05:48:30Z

Added a few more comments / questions, overall looks great to me - I've gone through most of the changes I think, and don't fully understand everything yet, but keen to test this out on devnets!

pawanjay176

I really like this new approach. I originally thought this would make the code more complex, but this is way better than before. Great work!
Like Age said, I will try to break this on testnets.

jimmygchen

LGTM 👍

realbigsean added 10 commits July 14, 2023 16:23

Revert "fix merge"

b2246c6

This reverts commit 405e95b.

refactor deneb block processing

b19883e

cargo fmt

e3ee0c6

Merge branch 'merge-unstable-deneb-jul-14' of https://github.com/real…

985bbc5

…bigsean/lighthouse into refactor-deneb-networking

make block and blob single lookups generic

8a6e8d5

Merge branch 'deneb-free-blobs' of https://github.com/sigp/lighthouse …

c2d8ac0

…into refactor-deneb-networking

Merge branch 'deneb-free-blobs' of https://github.com/sigp/lighthouse …

fd0fd3d

…into refactor-deneb-networking

get tests compiling

e27161e

clean up everything add child component, fix peer scoring and retry l…

6bcfaf4

…ogic

smol cleanup and a bugfix

4215160

realbigsean added the ready-for-review and deneb labels Aug 1, 2023

remove ParentLookupReqId

c71e011

AgeManning approved these changes Aug 2, 2023

beacon_node/network/src/sync/block_lookups/common.rs

jimmygchen reviewed Aug 2, 2023

beacon_node/network/src/router.rs

jimmygchen reviewed Aug 2, 2023

beacon_node/network/src/sync/block_lookups/tests.rs

jimmygchen reviewed Aug 2, 2023

beacon_node/network/src/sync/block_lookups/mod.rs

jimmygchen reviewed Aug 2, 2023

beacon_node/network/src/sync/manager.rs

jimmygchen reviewed Aug 2, 2023

beacon_node/network/src/sync/manager.rs

jimmygchen reviewed Aug 3, 2023

beacon_node/network/src/sync/block_lookups/single_block_lookup.rs

jimmygchen reviewed Aug 3, 2023

beacon_node/network/src/sync/block_lookups/single_block_lookup.rs

realbigsean and others added 6 commits August 3, 2023 09:43

Update beacon_node/network/src/sync/manager.rs

122def5

Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>

Update beacon_node/network/src/sync/manager.rs

5736891

Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>

update unreachables to crits

064bf64

Revert "update unreachables to crits"

460f712

This reverts commit 064bf64.

update make request/build request to make more sense

74373a9

pr feedback

af5e0d4

Merge branch 'deneb-free-blobs' of https://github.com/sigp/lighthouse …

ba1dc38

…into refactor-deneb-networking