feat: add solana_node_is_active metric #84

Merged
merged 1 commit into from
Jan 15, 2025

Contributor

@andreclaro andreclaro commented Dec 30, 2024

Summary

feat: add solana_node_is_active metric

Details

Add the solana_node_is_active metric, computed from the node keys provided.

This feature will help identify the active validator node(s), serving as the first step toward replacing solana-mission-control, the service currently in use. We implemented a similar feature in solana-mission-control a few months ago: andreclaro/solana-mission-control@a512d72

Testing

  • Added unit tests
$ go test -v ./cmd/solana-exporter
...
PASS
ok      github.com/asymmetric-research/solana-exporter/cmd/solana-exporter      (cached)
$ go test ./pkg/rpc  -v  -cover  -run TestClient_GetIdentity
=== RUN   TestClient_GetIdentity
--- PASS: TestClient_GetIdentity (0.00s)
PASS
coverage: 27.1% of statements
ok      github.com/asymmetric-research/solana-exporter/pkg/rpc  0.231s  coverage: 27.1% of statements

Tested on our testnet nodes:

  • Active:
# HELP solana_node_is_active Whether the node is active and participating in consensus (using identity pubkey)
# TYPE solana_node_is_active gauge
solana_node_is_active{identity="xLabscif2DLnYg39rQThqi7A9E45L9qiysRZhmZ1ARE"} 1

  • Passive:
# HELP solana_node_is_active Whether the node is active and participating in consensus (using identity pubkey)
# TYPE solana_node_is_active gauge
solana_node_is_active{identity="25rC66cyPRLa6ghxD24eLUgorimcSAAWoPe1r6VoFUPc"} 0
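As a rough illustration of what the two samples above encode, here is a minimal Go sketch that renders one solana_node_is_active line from an identity and a boolean. The helper name renderIsActive is hypothetical, and the formatting is hand-rolled with the standard library for illustration only; it is not a claim about how the exporter registers the gauge.

```go
package main

import (
	"fmt"
)

// renderIsActive is a hypothetical helper (not taken from the PR). It formats
// one sample of the solana_node_is_active gauge in the Prometheus text
// exposition format: 1 for an active node, 0 for a passive one.
func renderIsActive(identity string, active bool) string {
	value := 0
	if active {
		value = 1
	}
	// %q quotes the label value; base58 pubkeys contain no characters
	// that would need escaping.
	return fmt.Sprintf("solana_node_is_active{identity=%q} %d", identity, value)
}

func main() {
	fmt.Println(renderIsActive("xLabscif2DLnYg39rQThqi7A9E45L9qiysRZhmZ1ARE", true))
	fmt.Println(renderIsActive("25rC66cyPRLa6ghxD24eLUgorimcSAAWoPe1r6VoFUPc", false))
}
```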

@qedgardo
Contributor

qedgardo commented Jan 8, 2025

It looks great, @andreclaro!

@Alex99y Alex99y left a comment

LGTM!

Contributor

@johnstonematt johnstonematt left a comment

I must admit, I don't quite understand this metric. getIdentity will return the identity of the node that the validator is pointed to, and then all we do is check whether that identity is one of the configured -nodekey values? I don't quite see how that matches the description of "whether or not the nodekey is active and participating in consensus".

All that being said, I could be misunderstanding completely, so feel free to harshly correct.
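For readers following along, the check described in this comment could be sketched roughly as below. This is a minimal sketch, not the PR's code: the type and function names are hypothetical, and it assumes the standard JSON-RPC reply shape of Solana's getIdentity method, where the pubkey sits under result.identity.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// identityResponse mirrors the part of a getIdentity JSON-RPC reply
// that this sketch needs (assumed shape: {"result":{"identity":"..."}}).
type identityResponse struct {
	Result struct {
		Identity string `json:"identity"`
	} `json:"result"`
}

// parseIdentity extracts the identity pubkey from a raw getIdentity reply.
func parseIdentity(raw []byte) (string, error) {
	var resp identityResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return "", err
	}
	if resp.Result.Identity == "" {
		return "", errors.New("getIdentity reply has no identity field")
	}
	return resp.Result.Identity, nil
}

// isConfigured reports whether identity is one of the configured nodekeys.
func isConfigured(identity string, nodekeys []string) bool {
	for _, key := range nodekeys {
		if key == identity {
			return true
		}
	}
	return false
}

func main() {
	raw := []byte(`{"jsonrpc":"2.0","result":{"identity":"25rC66cyPRLa6ghxD24eLUgorimcSAAWoPe1r6VoFUPc"},"id":1}`)
	identity, err := parseIdentity(raw)
	if err != nil {
		panic(err)
	}
	nodekeys := []string{"xLabscif2DLnYg39rQThqi7A9E45L9qiysRZhmZ1ARE"}
	fmt.Println("identity in configured nodekeys:", isConfigured(identity, nodekeys))
}
```

Note that a check like this only says the node's current identity matches a configured key; it does not by itself inspect voting or block production.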

@andreclaro
Contributor Author

@johnstonematt, we operate two mainnet nodes: one is active, and the other is passive.

  • Active Node: This node uses an identity key associated with a vote account or a staked identity account, enabling it to participate in the Solana consensus (voting and producing blocks).
  • Passive Node: This node operates with an unstaked identity key (not linked to a vote account) and, therefore, cannot participate in consensus.

Failover between the active and passive nodes is managed through a mechanism similar to the Identity Transition process used by Pumpkins Pool, using the set-identity feature. You can find more details on this process in their documentation: https://pumpkins-pool.gitbook.io/pumpkins-pool

Consequently, we need to know at all times which node is active and which is passive:

  • Node 01
    image

  • Node 02
    image

As I mentioned in the description, I implemented a similar feature in solana-mission-control a few months ago: andreclaro/solana-mission-control@a512d72

Aren't you using a similar mechanism? How are you tracking the active and passive nodes?

CC: @SEJeff

@johnstonematt
Contributor

I think the issue here is a misunderstanding of the intended use of the -nodekey config parameter. Previously, in the old exporter, metrics such as skip rate and active stake were tracked for ALL validators, which, as one can imagine, produced a ridiculous number of metrics. To fix this, one can now specify the nodekeys (i.e., validator identity addresses) to track these metrics for.

I like the idea of tracking whether a validator is active, but I don't think the nodekey config parameter is the way to go about this. Can we change this PR to add a new config parameter, called something like main-nodekey or active-nodekey, and have this metric compare against that instead? Or something along those lines.

What do you think? @andreclaro @SEJeff

Also, note that I merged in #83
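One possible shape for the suggestion above, sketched with Go's standard flag package. Everything here is an assumption for illustration: the flag name active-nodekey is just one of the candidates floated in this thread (the final name was still undecided), and config, parseConfig, and isActive are hypothetical names, not the exporter's actual API.

```go
package main

import (
	"flag"
	"fmt"
)

// config holds the single hypothetical setting used by this sketch.
type config struct {
	ActiveNodekey string
}

// parseConfig reads the hypothetical -active-nodekey flag from args.
func parseConfig(args []string) (config, error) {
	fs := flag.NewFlagSet("solana-exporter", flag.ContinueOnError)
	var cfg config
	fs.StringVar(&cfg.ActiveNodekey, "active-nodekey", "",
		"identity pubkey the node uses while it is the active validator")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	return cfg, nil
}

// isActive compares the identity reported by the node (for example via
// getIdentity) with the configured active identity. An unset flag never
// matches, so the metric would stay at 0 in that case.
func isActive(cfg config, rpcIdentity string) bool {
	return cfg.ActiveNodekey != "" && cfg.ActiveNodekey == rpcIdentity
}

func main() {
	cfg, err := parseConfig([]string{"-active-nodekey", "xLabscif2DLnYg39rQThqi7A9E45L9qiysRZhmZ1ARE"})
	if err != nil {
		panic(err)
	}
	fmt.Println(isActive(cfg, "xLabscif2DLnYg39rQThqi7A9E45L9qiysRZhmZ1ARE"))
	fmt.Println(isActive(cfg, "25rC66cyPRLa6ghxD24eLUgorimcSAAWoPe1r6VoFUPc"))
}
```

Keeping the comparison separate from -nodekey means the per-validator tracking list and the active/passive check can be configured independently.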

@andreclaro
Contributor Author

Yes, agreed. I can add that new config parameter and update the logic. I'm just not sure what to name the new parameter.

@andreclaro andreclaro force-pushed the node_is_active branch 7 times, most recently from 2384750 to 4273fa6 Compare January 15, 2025 13:19
@johnstonematt johnstonematt merged commit 6311a4c into asymmetric-research:master Jan 15, 2025
2 checks passed
4 participants