Organize and Update Metrics #7098
Comments
We should be implementing at least the interop metrics, which it seems we already do. I would probably not add other metrics if nobody is asking for them and we don't deem them useful.
nesting seems useful to differentiate metrics and add more context imo
I think we should prefix non-spec'd metrics with
I was thinking about that too at one point, but there is really not that much code, which is why I decided to add the metric types required by a lot of packages to packages/utils. The code to run the metrics server, create the registry, etc. seemed fine to keep in beacon-node and wire up in packages/cli.
Where is this the case? The last time I looked at this (#6201) and refactored / moved the code around, I removed all duplications, and types should only be imported from the utils package.
Seems reasonable to split it up into multiple files, although I don't think the metrics grow over time because we use a single file. I think the problem is rather that we don't remove metrics unless the code is removed. We could re-evaluate some metrics added during initial implementation and remove them if the data does not seem necessary, especially if those metrics are not even part of any panel.
I tend to disagree. I think we should meet the ethereum/beacon-metrics spec just like the other specs. We cannot be sure who is expecting those metrics or why, so it is best to include them. The cost to do so is minimal and it's the right thing to do as good stewards of the protocol.
👍
I think we should use "beacon" similar to the spec'd metrics. There is no reason to differentiate "lodestar specific beacon metrics" from "beacon metrics." Those feel like one and the same. But this is not a hill I would die on and I am open to suggestions.
I think creating a separation of concerns is a nice to have here. In particular with regards to the next response about circular dependency.
At a minimum I know there are metrics in
Agreed. Seems like a nice to have so it's more maintainable than 1000+ line files... It will also make it more meaningful to maintain (i.e. remove unused metrics) if it's easier to grok what is cooking.
The problem with beacon-metrics seems to be that it is not really well maintained. We might want to clean up there first and see which metrics are actually implemented by other clients at this point.
It's good to know as a consumer, assuming all clients implement the standard beacon metrics you know which ones are interchangeable and which ones are client specific.
you mean similar to what we do in reqresp?
Correct. That is the heart of this issue/effort.
I suppose I see both sides of this. The only consumers of these metrics should be dashboards, and in an ideal world the dashboards should look similar across clients. It would be great to be able to standardize metrics methodology across all clients, and then for implementation-specific dashboards the data would just not show up. I.e. GC and event loop would not propagate on Lighthouse, but all the fork choice ones should be similar and plug-and-play.
Prefixing with
This will help to push the standardization effort that should be a thing for all of Ethereum. We could be thought leaders in this space if so... I have started to add context and am interested in working more on the
This is exactly the issue that I think should be solved. All metrics should be in one place to avoid confusion and duplication. IE we have metrics definitions in several places:
In particular the most egregious is the state-transition interface that requires duplication for type/interface and beacon-node based implementation. The other thing that stands out to me is that we have
Something I am also not sure how to handle, but think should be dealt with, is the worker metrics. Those will need a separate instance of the
Not sure I agree here, e.g. in the monitoring service it makes much more sense to define the metrics in the constructor instead of somewhere central. This makes it much easier to see where they are used and only adds them if the monitoring service is enabled. I like this pattern.
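For readers following along, the constructor-scoped pattern described in this comment might look roughly like the sketch below. Everything in it is hypothetical: `Registry`, `Gauge`, the metric name, and `recordCollect` are minimal in-memory stand-ins for illustration, not the actual monitoring service or Prometheus client code.

```typescript
// Hypothetical in-memory gauge; a stand-in for a real Prometheus client type.
class Gauge {
  private value = 0;
  constructor(readonly name: string) {}
  set(v: number): void {
    this.value = v;
  }
  get(): number {
    return this.value;
  }
}

// Hypothetical registry that creates a gauge on first request and reuses it afterwards.
class Registry {
  private readonly gauges: Record<string, Gauge> = {};
  gauge(name: string): Gauge {
    if (this.gauges[name] === undefined) {
      this.gauges[name] = new Gauge(name);
    }
    return this.gauges[name];
  }
  names(): string[] {
    return Object.keys(this.gauges);
  }
}

// Metrics are defined where they are used, so they are only registered
// when the service is actually constructed (i.e. enabled).
class MonitoringService {
  private readonly collectDataSeconds: Gauge;
  constructor(register: Registry) {
    // Assumed metric name, for illustration only.
    this.collectDataSeconds = register.gauge("example_monitoring_collect_data_seconds");
  }
  recordCollect(seconds: number): void {
    this.collectDataSeconds.set(seconds);
  }
}
```

With this shape, a registry that never sees a `MonitoringService` never carries the monitoring metrics, which is the trade-off being argued for against a central definition file.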
I suppose the monitoring service could be separate, but I am not sure why it should be. It's just as easy to have a few exports from But big picture is that we have a unified place "where metrics live" kind of like we have a unified place "that dashboards live"
I think the bigger picture though is for something like gossipsub metrics. I am trying to add the ones requested for peerDas and having a hard time with it. I need to dig through how all the metrics are added and am still not entirely sure, but I know that they need to be added here: However gossip lives both on the worker thread and on the main thread depending on what stage of the process one is attempting to get telemetry from. Adding the data collectors and then getting access to them is proving to be a digging exercise.... Having a unified place where they live will also lend itself to a README.md that could explain how/where/why to add metrics so things wire together correctly.
I think we should discuss this tomorrow as well with additional perspectives, but to summarize from what I'm seeing so far with the proposed solutions:
Creating packages/metrics Folder
Arguments For:
Arguments Against:
Questions:
Nesting Metrics
Arguments For:
Arguments Against:
Questions:
File Structure
Arguments For:
Arguments Against:
Questions:
Lodestar Metric Prefix
Arguments For:
Arguments Against:
Questions:
Problem description
There are a few goals that this issue is meant to discuss/cover.
Our collection of metrics has grown over time and there seems to be a lot of technical debt that needs to be addressed. The big thing is the organization of the metrics we currently collect. We have two giant files that are not very well organized, so they are becoming harder to maintain. I would like to propose cleaning up the organization by shuffling the metrics into a few separate files so that they are easier to maintain.
There are also a number of metrics prescribed in ethereum/beacon-metrics that we do not collect as part of our metrics program. Ideally these will also get added.
There is also another creeping issue that I think should be addressed. We are starting to collect metrics in folders other than beacon-node, and having the Metrics type there creates a circular dependency.
Solution description
Creating packages/metrics Folder
I would like to suggest that we move the metrics out of the beacon-node package and into their own package. This will make it easier to import the Metrics interface into other packages. I am also thinking we could export a singleton metrics object from the package. The package could export a getter function that obtains the instance of the singleton metrics object for addition to the chain object as it is today. But this would also open the possibility of getting the metrics object in any helper function or in the other packages without needing to pass the chain object around. Not sure if this is strictly necessary but it might make our lives easier going forward.
As an example we have metrics in reqresp and in state-transition that could easily be centralized.
Nesting Metrics
I am not sure if adding a lot of nested domains will make using the metrics object easier or more complicated. This is the first thing that should be decided. Is it better to nest similar metrics within domains under the metrics object, or should we keep a flat structure so it's easier to find what one is looking for using IntelliSense?
metrics.forkChoice.findHeadSeconds vs metrics.findHeadSeconds
My gut tells me that some minimal nesting will be nice due to the large number of metrics that we collect, but using a deeply nested structure may become a pain to deal with, as some metrics tend to cross domain boundaries and it will be challenging to decide "where" to put things.
metrics.chain.bls.multithread.aggregatedPublicKeysCount() vs metrics.bls.aggregatedPublicKeys
I actually kinda flip-flopped on this while writing this up, so it's worth discussing further as I'm not really sure which will be best myself.
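To make the comparison concrete, here is a rough sketch of the two shapes side by side. It is only an illustration: the `Gauge` class is a minimal stand-in rather than a real client type, the exported metric names are made up, and only the property paths (`forkChoice.findHeadSeconds`, `bls.aggregatedPublicKeys`) come from the examples above.

```typescript
// Hypothetical minimal gauge; stands in for whatever metric type is really used.
class Gauge {
  private value = 0;
  constructor(readonly name: string) {}
  set(v: number): void {
    this.value = v;
  }
  get(): number {
    return this.value;
  }
}

// Flat layout: every metric is a direct property of the metrics object.
function createFlatMetrics() {
  return {
    findHeadSeconds: new Gauge("example_fork_choice_find_head_seconds"),
    aggregatedPublicKeys: new Gauge("example_bls_aggregated_public_keys"),
  };
}

// Shallow nesting: exactly one level of domain grouping, nothing deeper.
function createNestedMetrics() {
  return {
    forkChoice: {
      findHeadSeconds: new Gauge("example_fork_choice_find_head_seconds"),
    },
    bls: {
      aggregatedPublicKeys: new Gauge("example_bls_aggregated_public_keys"),
    },
  };
}
```

In the flat version autocomplete lists everything at once, while in the nested version typing `metrics.bls.` narrows the list to one domain, at the cost of having to pick a domain for metrics that cross boundaries.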
File Structure
I think we really need to split up the two giant files into a number of smaller files so it's easier to maintain. Seems like an easy win.
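As a rough sketch of how the split could combine with the singleton getter suggested under "Creating packages/metrics Folder": each domain gets its own small factory (which would live in its own file), and one function composes them. The names `createForkChoiceMetrics`, `createBlsMetrics`, `createMetrics`, `getMetrics`, the `Counter` class, and the metric names are all assumptions for illustration, not existing Lodestar code.

```typescript
// Hypothetical minimal counter; a stand-in for a real metrics client type.
class Counter {
  private value = 0;
  constructor(readonly name: string) {}
  inc(by = 1): void {
    this.value += by;
  }
  get(): number {
    return this.value;
  }
}

// Would live in its own file, e.g. forkChoice.ts (assumed file name).
function createForkChoiceMetrics() {
  return {
    findHeadCalls: new Counter("example_fork_choice_find_head_calls_total"),
  };
}

// Would live in its own file, e.g. bls.ts (assumed file name).
function createBlsMetrics() {
  return {
    aggregatedPublicKeys: new Counter("example_bls_aggregated_public_keys_total"),
  };
}

// Composition point: the only place that knows about every domain.
function createMetrics() {
  return {
    forkChoice: createForkChoiceMetrics(),
    bls: createBlsMetrics(),
  };
}

type Metrics = ReturnType<typeof createMetrics>;

// Lazily created singleton, returned by a getter as proposed above.
let instance: Metrics | null = null;
function getMetrics(): Metrics {
  if (instance === null) {
    instance = createMetrics();
  }
  return instance;
}
```

Note that a module-level singleton like this is per thread, so it does not by itself answer the worker metrics question raised in the comments.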
lodestar Metric Prefix
This seems a bit overkill to add to basically every single metric. We only collect metrics on "lodestar", which is a "beacon" node, so separating the two seems silly. The only reason would be to differentiate "ethereum standard" metrics from our lodestar specifics, but I am not sure that is necessary. I do think however that we should be prefixing with the "domain" a metric represents, and having that domain match the filename seems like a nice to have so it's clear where a metric is defined without having to do a global file search.
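A minimal sketch of what domain prefixing could look like, assuming the domain string matches the file that defines the metric. The `metricName` helper is hypothetical, and the camelCase to snake_case conversion is my assumption based on the usual Prometheus naming style, not an existing utility.

```typescript
// Hypothetical helper: builds "<domain>_<name>" in snake_case.
// The domain is assumed to match the file that defines the metric,
// e.g. "forkChoice" for forkChoice.ts.
function metricName(domain: string, name: string): string {
  // Insert "_" at each lower-to-upper boundary, then lowercase everything.
  const toSnake = (s: string): string =>
    s.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
  return `${toSnake(domain)}_${toSnake(name)}`;
}

// Example usage with the property names from the nesting discussion.
const findHeadName = metricName("forkChoice", "findHeadSeconds");
const blsName = metricName("bls", "aggregatedPublicKeys");
```

Here `findHeadName` is `fork_choice_find_head_seconds` and `blsName` is `bls_aggregated_public_keys`, so the leading segment of a metric name points straight at the file to open.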
Additional context
No response