Specify Client Versions on Engine API #517

ethDreamer · 2024-01-26T09:18:41Z

By analyzing the structure of beacon blocks on the network, we are able to obtain fairly accurate data on consensus layer client diversity. Unfortunately, because the overwhelming majority of validators use mev-boost, their execution clients do not leave any fingerprint behind in block proposals. Thus we are forced to rely on limited self-reporting data from staking pools. Many pools do not participate, and we often have outdated statistics for the pools that do. Worse yet, we have no data on client diversity for home stakers.

This PR can change that by allowing consensus clients to learn which execution client they are connected with.

Consensus clients can then embed this in their graffiti field by default when the user doesn't bother to set it. A quick survey of recent proposal graffiti reveals that:

already embed their client and version by default. It would be great to add the execution client to this. Perhaps prysm could be convinced to join as well.

An analysis of ~2000 recent blocks indicated that nearly half of all validators don't bother to change their graffiti from the default so the potential to gather data here is huge.

dapplion

While this can be faked, it's a strict improvement over status quo with no downsides, so I support

jflo

This would be easy to implement, and improve a clear and present problem on the network.

lightclient

Generally support this. I don't think we should expose under the provided name though. I would rather expose under engine_*. For one, the web3_* namespace isn't defined anywhere in this repo. Second, if we account for the possibilities of specialized engine server (thinking like a client multiplexer) then this response is extremely engine oriented.

If that sounds reasonable, can you update this PR with the name and add the schema for the method in both the openrpc spec and in engine common spec? It should look similar to engine_exchangeCapabilities I think.

garyschulte · 2024-01-26T16:47:51Z

Adding this to the engine api is definitely an improvement. In order to fit within the graffiti, we should specify a field size limit or a strategy to encode the version info.

Otherwise we might end up with responses like: Lighthouse/v4.5.0-441fc16 :besu/v24.1.2-dev-8407b9e7/linux-x86_64/openjdk-java-21
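To make the size concern concrete, here is a minimal sketch that checks whether a string fits in the 32-byte graffiti field. The helper name `fits_in_graffiti` is hypothetical and only illustrates the length check; it is not part of any client.

```python
GRAFFITI_BYTES = 32  # size of the graffiti field in a beacon block


def fits_in_graffiti(text: str) -> bool:
    """Return True if the UTF-8 encoding of text fits in the graffiti field."""
    return len(text.encode("utf-8")) <= GRAFFITI_BYTES


# The verbose example from the comment above.
verbose = (
    "Lighthouse/v4.5.0-441fc16 :"
    "besu/v24.1.2-dev-8407b9e7/linux-x86_64/openjdk-java-21"
)
print(fits_in_graffiti(verbose))  # the verbose form does not fit
```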

fjl · 2024-01-27T18:32:20Z

I'm personally more in favor of standardizing web3_clientVersion, because it already exists in clients.

ethDreamer · 2024-01-29T05:38:09Z

In light of comments received so far I've pushed an alternative specification called engine_clientVersionV1 which is more comprehensive. There are a couple things to be decided:

1. Do we reuse web3_clientVersion or choose the more comprehensive engine_clientVersionV1?
2. Should we require this method be supported instead of recommending its support? (SHOULD vs MUST)
3. Do we agree on the abbreviations for the ClientCode?
4. Do we accept using the first 4 bytes of the commit hash as a short-hand for version?

Personally I lean towards taking engine_clientVersionV1 and making it mandatory if there aren't objections. If we're not going to take the easy route and just reuse web3_clientVersion then we might as well make a method that accomplishes what we're really trying to do here (get better measurements of EL client diversity). To that end, we require a standardized shorthand for specifying both clients in the limited space of block graffiti (32 bytes).

The definitions of ClientCode specified here allow us to standardize how we encode any client pairing and specify both execution and consensus client versions inside the block graffiti within just 20 bytes. For example:

LH1be52536BU0f91a674

If desired, the space could be further reduced so that the bytes of the commit hash are embedded directly into the graffiti bytes (allowing the full consensus and execution client versions to be specified in 12 bytes).
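As a rough sketch of the two encodings described above, the following builds the 20-character form and the 12-byte form from the example values. The function names are hypothetical, and placing the consensus client before the execution client is an assumption taken from the example string.

```python
def short_version(code: str, commit_hex: str) -> str:
    """Two-letter client code plus the first 4 bytes of the commit hash as hex."""
    if len(code) != 2:
        raise ValueError("client code must be two characters")
    commit = bytes.fromhex(commit_hex.removeprefix("0x"))
    if len(commit) < 4:
        raise ValueError("need at least 4 bytes of commit hash")
    return code + commit[:4].hex()


def compact_version(code: str, commit_hex: str) -> bytes:
    """Variant with the 4 commit bytes embedded raw: 2 + 4 = 6 bytes per client."""
    if len(code) != 2:
        raise ValueError("client code must be two characters")
    commit = bytes.fromhex(commit_hex.removeprefix("0x"))
    if len(commit) < 4:
        raise ValueError("need at least 4 bytes of commit hash")
    return code.encode("ascii") + commit[:4]


# Values taken from the example above (LH = consensus client, BU = execution client).
readable = short_version("LH", "1be52536") + short_version("BU", "0f91a674")
compact = compact_version("LH", "1be52536") + compact_version("BU", "0f91a674")
print(readable, len(readable))  # 20 characters
print(len(compact))             # 12 bytes
```

The readable form stays printable ASCII, while the compact form trades readability for eight extra bytes of user graffiti.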

Standardizing the version specifications this way makes graffiti analysis easy regardless of what client pairs are used. Based on my testing, geth, nethermind, and besu already have the commit hash embedded in their binaries when they list the client version so it shouldn't be difficult for them to build it in here. I can't speak for the other clients though.

Side note: by design, none of the proposed client codes are valid hex, so they won't be confused with the commit hash.
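A minimal sketch of how graffiti analysis could use this format: a regular expression pulls out the two code/commit pairs, and a hex check rejects codes that could be mistaken for commit bytes. The names `ClientPair`, `is_hex`, and `parse_tag` are hypothetical, and the consensus-then-execution ordering is an assumption based on the example.

```python
import re
import string
from typing import NamedTuple, Optional


class ClientPair(NamedTuple):
    cl_code: str
    cl_commit: str
    el_code: str
    el_commit: str


# Two uppercase letters followed by 8 lowercase hex characters, twice.
TAG = re.compile(r"([A-Z]{2})([0-9a-f]{8})([A-Z]{2})([0-9a-f]{8})")


def is_hex(text: str) -> bool:
    """True if every character is a hexadecimal digit."""
    return all(ch in string.hexdigits for ch in text)


def parse_tag(graffiti: str) -> Optional[ClientPair]:
    """Extract the client pair from graffiti, or None if no tag is present."""
    match = TAG.search(graffiti)
    if match is None:
        return None
    pair = ClientPair(*match.groups())
    # Codes that are themselves valid hex would be ambiguous, so skip them.
    if is_hex(pair.cl_code) or is_hex(pair.el_code):
        return None
    return pair


print(parse_tag("LH1be52536BU0f91a674"))
```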

dapplion

Standardizing the returned version format is a great improvement. I would switch the name to "Client Version" (engine_clientVersion) instead of using the term identification. Same meaning but better memetics.

michaelsproul

I like the new dedicated method, and agree with Lion that it should be called engine_clientVersion.

I'm in favour of making it mandatory after an appropriate adoption period.

src/engine/identification.md

Also specify Grandine abbreviation and accomodate other versioning systems.

rkrasiuk · 2024-01-29T09:45:34Z

Very supportive of this change. Agree with previous comments around naming. Unsure if we should introduce versioning for identification. we already have unversioned engine_exchangeCapabilities, might as well drop v1 and simply have engine_clientVersion

ethDreamer · 2024-01-29T09:54:12Z

Unsure if we should introduce versioning for identification. we already have unversioned engine_exchangeCapabilities, might as well drop v1 and simply have engine_clientVersion

I believe engine_exchangeCapabilities is the only unversioned method because we won't allow it to ever change. If execution clients began supporting a new version of exchangeCapabilities, then consensus clients would have to do trial and error to determine which version to call, which defeats the purpose of having a "what methods do you support" method.

This doesn't necessarily mean that we couldn't also agree to never allow the engine_clientVersion method to change.

StefanBratanov · 2024-01-29T12:34:31Z

Supportive of this change. Maybe the method could be renamed to engine_exchangeClientVersion similar to engine_exchangeCapabilities since we are essentially exchanging the versions.

rubo

I support this proposal.

Not as a part of this one, but when it comes to versions, I'd also like to have a standardized version format for clients. Something like what browsers have for their user agent string defined in RFC 9110. That would be helpful for the network stats handling as well. Currently, we mostly have name/version/platform/lang, but it slightly varies from client to client.

lightclient · 2024-01-29T19:05:27Z

Is there an advantage of being prescriptive about the format the client returns its version in? I was imagining something much closer to web3_clientVersion where any value would be accepted by CL and incorporated into the graffiti.

I guess there isn't much downside as it is mainly just upfront cost of spec'ing things out.

garyschulte · 2024-01-29T19:38:22Z

Is there an advantage of being prescriptive about the format the client returns its version in? I was imagining something much closer to web3_clientVersion where any value would be accepted by CL and incorporated into the graffiti.

Having a conformed format will only help in identification, and in 'economy of graffiti'. Ensuring there is a predictable portion of graffiti consumed by client identification makes it more palatable IMO. The primary downside is just the gatekeeping required to maintain the list.

I like the human readable and more verbose bits, but I think that might only be useful in CL logs, since it is behind JWT secured endpoint.

rolfyone · 2024-01-30T00:45:16Z

It seems like the other missing part that would be nice in graffiti would be the builder and version if used (mostly used, but not always)
The encoding makes sense to fit inside graffiti bytes. not sure what % of blocks use default graffiti and whether this will be useful ultimately...

michaelsproul · 2024-01-30T01:26:59Z

@rolfyone We already have data on the builders because they fill the execution payload's extraData field. E.g. this block has rsync-builder.xyz in the extra data: https://beaconcha.in/slot/8312808. The relays also provide data APIs that let us map blocks & builders to relays. Often there are multiple relays that will produce/publish each builder payload. Some of these affinities are displayed on sites like https://mevboost.pics/.

The only thing the local BN could read would be a list of relays from mev-boost or its own config. This would be 1) too long to include in graffiti and 2) redundant, given the above.
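For illustration, reading a builder name out of the extraData field amounts to decoding a hex string. The sketch below assumes extraData arrives as a 0x-prefixed hex string, as in JSON-RPC responses; the sample value is constructed locally from the builder name mentioned above, not fetched from the linked block.

```python
def decode_extra_data(extra_data_hex: str) -> str:
    """Decode a 0x-prefixed extraData hex string into text, replacing invalid bytes."""
    raw = bytes.fromhex(extra_data_hex.removeprefix("0x"))
    return raw.decode("utf-8", errors="replace")


# Constructed sample, not real chain data.
sample = "0x" + "rsync-builder.xyz".encode("utf-8").hex()
print(decode_extra_data(sample))
```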

ethDreamer · 2024-01-30T02:59:32Z

Okay I just want to take the temperature of the room. Please react to this to vote:

❤️ - vote for reusing web3_clientVersion
🎉 - vote for engine_clientVersionV1
👍 - vote for adopting engine_clientVersionV1 but renaming to engine_exchangeClientVersionV1

lightclient · 2024-01-30T15:36:27Z

kasey · 2024-01-30T17:02:02Z

Prysm supports this proposal.

I filed this issue to start recording our user-agent info in graffiti by default: prysmaticlabs/prysm#13558

ethDreamer · 2024-02-02T04:17:21Z

If we sacrifice human readability we could have the first byte representing clients (4bit cl, 4bit el) + 4bytes (2 cl, 2 el) for commits, so we end up with 5 versioning + 27 user msg

That's a nice option if we want to always provide version information while taking up minimal space. But it does limit us to only 16 execution / consensus clients, and based on my conversations a lot of people seem to prefer readability.

In practice, the versioning information is really just nice to have for some debugging cases. But it is not strictly necessary. It is much less important than knowing the implementation itself for the purposes of measuring EL client diversity. That's why I like the flexible standard because it allows users 28 characters if they really want it, while preserving readability, without limiting EL/CL client implementations, and it preserves the version information for debugging in the vast majority of cases.
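For comparison, the 5-byte packing quoted at the top of this comment could look roughly like the sketch below. The numeric client ids are hypothetical (no id assignment was proposed in the thread), and putting the consensus client in the high nibble is an assumption.

```python
from typing import Tuple


def pack(cl_id: int, el_id: int, cl_commit: bytes, el_commit: bytes) -> bytes:
    """Pack two 4-bit client ids and two 2-byte commit prefixes into 5 bytes."""
    if not (0 <= cl_id < 16 and 0 <= el_id < 16):
        raise ValueError("only 16 clients per layer fit in 4 bits")
    if len(cl_commit) < 2 or len(el_commit) < 2:
        raise ValueError("need at least 2 bytes of each commit hash")
    first = (cl_id << 4) | el_id  # CL in the high nibble, EL in the low nibble
    return bytes([first]) + cl_commit[:2] + el_commit[:2]


def unpack(data: bytes) -> Tuple[int, int, bytes, bytes]:
    """Inverse of pack: returns (cl_id, el_id, cl_commit_prefix, el_commit_prefix)."""
    if len(data) < 5:
        raise ValueError("need 5 bytes")
    first = data[0]
    return first >> 4, first & 0x0F, data[1:3], data[3:5]


# Hypothetical ids 3 and 7; commit bytes from the earlier example.
packed = pack(3, 7, bytes.fromhex("1be52536"), bytes.fromhex("0f91a674"))
print(packed.hex(), len(packed))  # 5 bytes, leaving 27 for the user
```

The 16-client ceiling mentioned above falls directly out of the 4-bit ids.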

ethDreamer · 2024-02-02T05:12:14Z

I've renamed the method to getClientVersionV1 in accordance with @mkalinin's suggestion. I've also decided to accommodate multiplexers by returning an array of ClientVersionV1 objects for two reasons:

1. We require some way to indicate to the consensus client that a multiplexer is being used so that the data can be excluded from graffiti. So we may as well indicate this by receiving more than one ClientVersionV1 object.
2. Knowing each client version in a multiplexer scenario may be useful in the future for other methods of measuring client diversity which do not use graffiti.
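A minimal sketch of how a consensus client might act on the array, assuming the response has already been parsed from JSON. The field names on `ClientVersionV1` (`code`, `name`, `version`, `commit`) are assumptions for illustration since this thread does not list them, and `el_version_for_graffiti` is a hypothetical helper.

```python
from typing import List, Optional, TypedDict


class ClientVersionV1(TypedDict):
    # Field names are assumed for this sketch; check the spec for the real schema.
    code: str
    name: str
    version: str
    commit: str


def el_version_for_graffiti(result: List[ClientVersionV1]) -> Optional[ClientVersionV1]:
    """Return the single EL version, or None when a multiplexer (or nothing) answered."""
    if len(result) == 1:
        return result[0]
    # More than one entry signals a multiplexer, so keep it out of the graffiti.
    return None


besu: ClientVersionV1 = {
    "code": "BU",
    "name": "besu",
    "version": "v24.1.2",
    "commit": "0x0f91a674",
}
other: ClientVersionV1 = {
    "code": "XX",  # placeholder code, not a real client
    "name": "example",
    "version": "v0.0.0",
    "commit": "0x00000000",
}
print(el_version_for_graffiti([besu]))
print(el_version_for_graffiti([besu, other]))
```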

smartprogrammer93

Thank you 🙏

ethDreamer · 2024-02-02T05:27:20Z

LAST CALL FOR CONCERNS

I believe we've addressed all concerns that have been raised at this point. The core idea of this PR has widespread support and most disagreement is minor & related to small implementation details. There's no reason to spend weeks bikeshedding about this optional feature.

CURRENT PLAN

1. wait a few more days for small fixes
2. merge if there are no major objections
3. see if issues arise during implementation and make changes as needed
4. after widespread implementation, consider making this method mandatory

If you agree with this plan, please give a 👍, otherwise comment your objection

lightclient · 2024-02-02T13:43:03Z

Please add spell check errors to wordlist.txt.

src/engine/identification.md

kasey · 2024-02-02T15:05:40Z

How much do we want to enforce this?

If user specifies a graffiti taking all 32 bytes, client shouldn't start anymore?
Should we give an opt-out option to regain full 32 bytes?

This is absolutely not enforced. Most of the data will come from people who don't bother to set their graffiti. Users will always have the ability to choose whatever they want for their graffiti or to not provide this data at all. But some will want to set custom graffiti while also providing data for client diversity. For those users, we want to give them more bytes so they are more likely to participate.

How does everyone plan to represent this in flags? Are we going to artificially limit the size of user-specified graffiti flags? Otherwise, what would users expect the behavior to be if they specify 20-32 bytes of graffiti? For instance, do we truncate the version string from right to left? Doing that would assume precedence in importance of the information: CL impl > CL git hash > EL impl > EL git hash. It would also be hard for software parsing this field to differentiate user-specified graffiti that flows into this section and looks like an ident/hash from the real thing.

If this is required then IMO we should limit the size of user-provided graffiti to 20 bytes and be very explicit about the fact that 1) this will be a breaking config change for users and 2) there is no opting out by "overriding" the default with a flag.

lightclient · 2024-02-02T15:21:19Z

One thing you could do is continue dropping client version data up until you can no longer store the EL+CL code combo. I think version / commit is nice to have but not as critical.

kasey · 2024-02-02T15:23:55Z

One thing you could do is continue dropping client version data up until you can no longer store the EL+CL code combo. I think version / commit is nice to have but not as critical.

Yeah the EL is most important (hardest to determine through other means), then CL, then versions. So if the plan is to truncate, I think we should just order it that way: (EL|CL|el-hash|cl-hash). You could get fancy and interleave the EL/CL hash bytes one-by-one but maybe that's overkill :)

ethDreamer · 2024-02-02T15:29:21Z

One thing you could do is continue dropping client version data up until you can no longer store the EL+CL code combo. I think version / commit is nice to have but not as critical.

Exactly. This is what I've been referring to as a flexible standard. The version information is nice to have but not critical.

rolfyone · 2024-02-04T23:54:47Z

I'm not sure I'd interleave, but shortening does make sense. If the version of a client is present, it being in the same logical chunk will be easier to search for. eg. searching for LH1b from the example below..

user graffiti takes up 0 characters: LH1be52536BU0f91a674
user graffiti takes up 20 characters: LH1be5BU0f91
user graffiti takes up 24 characters: LH1bBU0f
user graffiti takes up 28 characters: LHBU

I think this flexible standard achieves all 3 goals (human readable, max space for the user, collision resistance). But anyone else is welcome to weigh in.

I think this is a sensible approach, and if the user graffiti is beyond 28 characters just not having the data...

Do we know what percentage of blocks have more than 28 bytes of graffiti? seems like we could get a fairly good estimation of how useful this would be...
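The flexible standard quoted above can be sketched as picking the longest tag that still fits beside the user's graffiti. This is only an illustration: `version_tag` is a hypothetical helper, and it assumes no separator between the user graffiti and the tag.

```python
GRAFFITI_BYTES = 32  # size of the graffiti field


def version_tag(cl_code: str, cl_commit: str, el_code: str, el_commit: str,
                user_graffiti: str) -> str:
    """Return the longest version tag that fits next to the user graffiti."""
    remaining = GRAFFITI_BYTES - len(user_graffiti.encode("utf-8"))
    # Try 4, 2, 1, then 0 bytes of commit hash per client (8, 4, 2, 0 hex chars).
    for hex_chars in (8, 4, 2, 0):
        tag = cl_code + cl_commit[:hex_chars] + el_code + el_commit[:hex_chars]
        if len(tag) <= remaining:
            return tag
    return ""  # more than 28 bytes of user graffiti: no client data at all


for used in (0, 20, 24, 28, 29):
    print(used, version_tag("LH", "1be52536", "BU", "0f91a674", "x" * used))
```

Each client's code and hash stay adjacent at every length, so a search for a prefix such as LH1b still matches the longer forms.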

ethDreamer · 2024-02-07T07:00:39Z

I've compiled the discussion around choosing a graffiti standard into a single document:

https://hackmd.io/@wmoBhF17RAOH2NZ5bNXJVg/BJX2c9gja

I welcome any comments. Also, there haven't been any requests for changes in days. Seems like we can merge this?

ethDreamer · 2024-02-09T04:45:58Z

@lightclient it's been about a week since people agreed we should merge this and there haven't been any objections. Is that enough time to merge?

dapplion approved these changes Jan 26, 2024

jflo approved these changes Jan 26, 2024

lightclient reviewed Jan 26, 2024

ethDreamer added 2 commits January 28, 2024 12:49

Expose web3_clientVersion on Engine API

3016309

Specify engine_clientIdentificationV1

70c3f0e

ethDreamer force-pushed the expose_client_version branch from 5e25013 to 70c3f0e on January 29, 2024 05:17

dapplion reviewed Jan 29, 2024

michaelsproul approved these changes Jan 29, 2024

michaelsproul reviewed Jan 29, 2024

ethDreamer added 3 commits January 29, 2024 14:28

Rename to engine_clientVersionV1

662693a

Also specify Grandine abbreviation and accomodate other versioning systems.

Fix broken TOC link

b05fae5

fix spelling

7d5718e

ethDreamer changed the title ~~Expose web3_clientVersion on Engine API~~ Specify Client Versions on Engine API Jan 29, 2024

tbenr mentioned this pull request Jan 29, 2024

Default graffiti should include EL version too Consensys/teku#7930

Closed

ethDreamer mentioned this pull request Jan 29, 2024

Execution Layer Meeting 180 ethereum/pm#943

Closed

rubo approved these changes Jan 29, 2024

kasey mentioned this pull request Jan 30, 2024

Prysm agent details in default graffiti prysmaticlabs/prysm#13558

Open

ethDreamer added 2 commits February 2, 2024 12:59

Rename & Accommodate Multiplexers

68bdbd6

spelling..

79ffb4e

smartprogrammer93 approved these changes Feb 2, 2024

lightclient approved these changes Feb 2, 2024

lightclient reviewed Feb 2, 2024

lightclient reviewed Feb 2, 2024

lightclient mentioned this pull request Feb 2, 2024

eth/catalyst,beacon/engine: implement GetClientVersionV1 ethereum/go-ethereum#28915

Merged

ethDreamer and others added 3 commits February 5, 2024 10:23

Update src/engine/identification.md

1383036

Co-authored-by: lightclient <14004106+lightclient@users.noreply.github.com>

Update src/engine/identification.md

364ad1b

Co-authored-by: lightclient <14004106+lightclient@users.noreply.github.com>

Add words to wordlist.txt

3b1e1b1

lightclient approved these changes Feb 13, 2024

lightclient merged commit d79298e into ethereum:main Feb 13, 2024
3 checks passed

ensi321 mentioned this pull request Feb 21, 2024

Include EL client info in graffiti ChainSafe/lodestar#6463

Closed

ethDreamer mentioned this pull request Feb 24, 2024

Encode Execution Engine Client Version In Graffiti sigp/lighthouse#5284

Closed

smartprogrammer93 mentioned this pull request Mar 1, 2024

Implement engine_getClientVersionV1 NethermindEth/nethermind#6801

Closed

StefanBratanov mentioned this pull request Mar 13, 2024

Append EL/CL information to user-defined graffiti Consensys/teku#8074

Closed

nflaig mentioned this pull request Jun 24, 2024

feat: include EL client info in graffiti ChainSafe/lodestar#6753

Merged

philknows mentioned this pull request Aug 26, 2024

Default graffiti for Lodestar update sigp/blockprint#36

Open

one-three-three-seven mentioned this pull request Oct 21, 2024

Feature: EL client version in the default graffiti status-im/nimbus-eth2#6668

Open
