Switch from grpc-ecosystem/go-grpc-prometheus to grpc-ecosystem/go-grpc-middleware/providers/prometheus #19195

Open
wants to merge 3 commits into
base: main
from

Conversation

dims
Contributor

@dims dims commented Jan 14, 2025

Reviving previous effort from: #17974

xref: kubernetes/kubernetes#128583

Added a new test to make sure we are not missing any expected metrics.

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

@k8s-ci-robot

Hi @dims. Thanks for your PR.

I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@dims dims force-pushed the switch-from-grpc-ecosystem/go-grpc-prometheus-to-grpc-ecosystem/go-grpc-middleware/providers/prometheus-take-2 branch from f7123cd to f80c5b2 Compare January 14, 2025 22:09
@ivanvc
Member

ivanvc commented Jan 14, 2025

/ok-to-test

@dims
Contributor Author

dims commented Jan 14, 2025

thanks @ivanvc

codecov bot commented Jan 14, 2025

Codecov Report

Attention: Patch coverage is 88.37209% with 5 lines in your changes missing coverage. Please review.

Project coverage is 68.80%. Comparing base (8731c31) to head (db2e7b7).
Report is 2 commits behind head on main.

Files with missing lines                   Patch %   Lines
server/etcdmain/grpc_proxy.go              0.00%     4 Missing ⚠️
server/etcdserver/api/v3rpc/metrics.go     96.55%    1 Missing ⚠️

Additional details and impacted files

Files with missing lines                   Coverage Δ
server/config/config.go                    80.23% <ø> (ø)
server/embed/etcd.go                       76.79% <100.00%> (+0.30%) ⬆️
server/etcdserver/api/v3rpc/grpc.go        100.00% <100.00%> (ø)
server/etcdserver/api/v3rpc/metrics.go     97.05% <96.55%> (-2.95%) ⬇️
server/etcdmain/grpc_proxy.go              14.44% <0.00%> (-0.08%) ⬇️

... and 23 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #19195      +/-   ##
==========================================
- Coverage   68.85%   68.80%   -0.06%     
==========================================
  Files         420      420              
  Lines       35680    35716      +36     
==========================================
+ Hits        24569    24575       +6     
- Misses       9689     9719      +30     
  Partials     1422     1422              

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8731c31...db2e7b7. Read the comment docs.

@dims
Contributor Author

dims commented Jan 14, 2025

/test pull-etcd-integration-2-cpu-amd64

@dims
Contributor Author

dims commented Jan 15, 2025

/assign @ahrtr @serathius

Review thread on server/embed/etcd.go (outdated, resolved)
@ahrtr
Member

ahrtr commented Jan 15, 2025

Thanks @dims for the PR.

I ran some sanity tests on this PR and compared the results with the existing main branch.

  • Confirmed that this PR can generate the same gRPC counter metrics as the existing main branch, including
    • grpc_server_handled_total
    • grpc_server_msg_received_total
    • grpc_server_msg_sent_total
    • grpc_server_started_total
  • It couldn't generate the histogram metrics, but the existing main branch can.
    • grpc_server_handling_seconds_bucket
    • grpc_server_handling_seconds_count
    • grpc_server_handling_seconds_sum

Also references:

@serathius
Member

Do we need a test to confirm that no metric was removed?

@dims
Contributor Author

dims commented Jan 17, 2025

It couldn't generate the histogram metrics, but the existing main branch can.
grpc_server_handling_seconds_bucket
grpc_server_handling_seconds_count
grpc_server_handling_seconds_sum

@ahrtr did you run with the --metrics 'extensive' option, or did you let it default to 'basic'?

@dims
Contributor Author

dims commented Jan 17, 2025

Do we need a test to confirm that no metric was removed?

i think so for future-proofing!

@ahrtr
Member

ahrtr commented Jan 17, 2025

@ahrtr did you run with --metrics 'extensive' option?

Yes, I executed the same command on this PR and on the main branch. The main branch worked as expected, but this PR did not generate the histogram metrics.

@dims dims force-pushed the switch-from-grpc-ecosystem/go-grpc-prometheus-to-grpc-ecosystem/go-grpc-middleware/providers/prometheus-take-2 branch from f80c5b2 to 03554ca Compare January 17, 2025 19:44
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dims
Once this PR has been reviewed and has the lgtm label, please ask for approval from ahrtr. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dims
Contributor Author

dims commented Jan 17, 2025

@ahrtr found the issue and hopefully fixed it. Added a test as well. However, please check if i broke anything in the process of threading the option through to both the test suite and the main binary.

@dims dims force-pushed the switch-from-grpc-ecosystem/go-grpc-prometheus-to-grpc-ecosystem/go-grpc-middleware/providers/prometheus-take-2 branch from 03554ca to cafe302 Compare January 17, 2025 20:02
@dims dims force-pushed the switch-from-grpc-ecosystem/go-grpc-prometheus-to-grpc-ecosystem/go-grpc-middleware/providers/prometheus-take-2 branch 2 times, most recently from f1e0118 to c4bed32 Compare January 17, 2025 23:00
@dims
Contributor Author

dims commented Jan 17, 2025

/test pull-etcd-robustness-arm64

@dims
Contributor Author

dims commented Jan 17, 2025

Do we need a test to confirm that no metric was removed?

@serathius Done! see new test case. If there are other metrics we can trigger them and then add to the list.

@dims dims force-pushed the switch-from-grpc-ecosystem/go-grpc-prometheus-to-grpc-ecosystem/go-grpc-middleware/providers/prometheus-take-2 branch 4 times, most recently from efbc3fc to 742ce69 Compare January 18, 2025 03:42
@dims dims force-pushed the switch-from-grpc-ecosystem/go-grpc-prometheus-to-grpc-ecosystem/go-grpc-middleware/providers/prometheus-take-2 branch from 742ce69 to da7dd38 Compare January 21, 2025 12:07
@dims dims mentioned this pull request Jan 21, 2025
@dims
Contributor Author

dims commented Jan 21, 2025

@serathius @ahrtr the follow up PR once this merges will be https://github.com/etcd-io/etcd/pull/19242/files

@dims
Contributor Author

dims commented Jan 21, 2025

/test pull-etcd-unit-test-386

@ahrtr
Member

ahrtr commented Jan 23, 2025

@serathius @ahrtr the follow up PR once this merges will be https://github.com/etcd-io/etcd/pull/19242/files

@dims please see my comment #19242 (comment), thx

@dims dims force-pushed the switch-from-grpc-ecosystem/go-grpc-prometheus-to-grpc-ecosystem/go-grpc-middleware/providers/prometheus-take-2 branch from da7dd38 to c94b3be Compare January 23, 2025 17:11
@dims
Contributor Author

dims commented Jan 23, 2025

@ahrtr after #19242 merged, i've rebased and updated this PR as well.

"project": "github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors",
"licenses": [
{
"type": "\"Do What The F*ck You Want To Public License\"",
Member

Do we know where this is coming from? @ivanvc @jmhbnz @dims

Contributor Author

@ahrtr ./scripts/updatebom.sh is generating it

Contributor Author

Member

github.com/grpc-ecosystem/go-grpc-middleware/v2 doesn't use this license.

Looks like it's coming from https://github.com/appscodelabs/license-bill-of-materials/blob/master/assets/wtfpl.txt

Contributor Author

@ahrtr correct! look at the confidence score "confidence": 0.14814814814814814
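
A low confidence score like this one could be caught automatically before the BOM is committed. The following is a minimal sketch, not part of the PR: it assumes the generated bill of materials is a JSON array whose entries carry the `project`, `licenses`, `type`, and `confidence` fields seen in this thread. The `lowConfidence` helper and the 0.9 threshold are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bomEntry mirrors only the fields quoted in this thread; any other
// fields in the file are ignored. The array-of-entries layout is an assumption.
type bomEntry struct {
	Project  string `json:"project"`
	Licenses []struct {
		Type       string  `json:"type"`
		Confidence float64 `json:"confidence"`
	} `json:"licenses"`
}

// lowConfidence returns the projects that have at least one license
// match scored below minConfidence.
func lowConfidence(bom []byte, minConfidence float64) ([]string, error) {
	var entries []bomEntry
	if err := json.Unmarshal(bom, &entries); err != nil {
		return nil, err
	}
	var flagged []string
	for _, e := range entries {
		for _, l := range e.Licenses {
			if l.Confidence < minConfidence {
				flagged = append(flagged, e.Project)
				break // one weak match is enough to flag the project
			}
		}
	}
	return flagged, nil
}

func main() {
	bom := []byte(`[{
	  "project": "github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors",
	  "licenses": [{
	    "type": "\"Do What The F*ck You Want To Public License\"",
	    "confidence": 0.14814814814814814
	  }]
	}]`)
	// 0.9 is an arbitrary, hypothetical threshold chosen for illustration.
	flagged, err := lowConfidence(bom, 0.9)
	if err != nil {
		panic(err)
	}
	fmt.Println(flagged)
}
```

Flagged projects would then be candidates for a manual license check and, where needed, an override entry.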

Contributor Author

@ahrtr do you want me to add an entry in override.json?

Member

Looks like their LICENSE is actually Apache License 2.0 (ref: https://github.com/grpc-ecosystem/go-grpc-middleware/blob/main/LICENSE). I think it would be reasonable to add it to bill-of-materials.override.json.

This is a good example of why we should prioritize a better tool to generate the BOM (#18902).

Comment on lines +266 to +268
"etcd_disk_backend_commit_duration_seconds",
"etcd_disk_backend_defrag_duration_seconds",
"etcd_disk_backend_snapshot_duration_seconds",
Member

This looks like a breaking change? It changes the metric names.

Contributor Author

nope! the _bucket, _count, and _sum suffixes are dropped because i am now parsing the metrics with the official parser, which reports each histogram under its family name.

you can see lines 266-268 have the correct names.

"grpc_client_handled_total",
"grpc_client_msg_received_total",
"grpc_client_msg_sent_total",
"grpc_client_started_total",
"grpc_server_handled_total",
"grpc_server_handling_seconds",
Contributor Author

this is the extra metric when we turn on extensive

dims added 2 commits January 25, 2025 19:59
…pc-middleware/providers/prometheus

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
@dims dims force-pushed the switch-from-grpc-ecosystem/go-grpc-prometheus-to-grpc-ecosystem/go-grpc-middleware/providers/prometheus-take-2 branch from c94b3be to 5b835c7 Compare January 26, 2025 01:00
@dims
Contributor Author

dims commented Jan 26, 2025

@ahrtr @serathius rebased!

@ahrtr
Member

ahrtr commented Jan 26, 2025

@ahrtr @serathius rebased!

We need to add an e2e test to prevent any regression. The integration test might not be reliable to verify this as mentioned in #19242 (comment). I will do it sometime next week, but please feel free to add it if you have bandwidth.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
@dims dims force-pushed the switch-from-grpc-ecosystem/go-grpc-prometheus-to-grpc-ecosystem/go-grpc-middleware/providers/prometheus-take-2 branch from 5b835c7 to db2e7b7 Compare January 26, 2025 12:39
@dims
Contributor Author

dims commented Jan 26, 2025

/retest

@k8s-ci-robot

@dims: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name             Commit    Details   Required   Rerun command
pull-etcd-e2e-arm64   db2e7b7   link      true       /test pull-etcd-e2e-arm64
pull-etcd-e2e-386     db2e7b7   link      true       /test pull-etcd-e2e-386
pull-etcd-e2e-amd64   db2e7b7   link      true       /test pull-etcd-e2e-amd64

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

5 participants