topolvm operator fails to push monitoring metrics into Prometheus #66

GowthamShanmugam · 2021-11-15T07:57:02Z

The topolvm operator is not pushing any monitoring metrics out of the operator and node pods. As a result, there is no way to create alerts and alerting rules for topolvm in Kubernetes / OpenShift.

GowthamShanmugam · 2021-11-15T07:57:58Z

What is missing:

Services for the ServiceMonitor to fetch the metrics
Roles and role bindings

little-guy-lxr · 2021-11-15T09:03:44Z

@GowthamShanmugam The service has been added; see #63.

GowthamShanmugam · 2021-11-15T14:36:13Z

Ack, I will test with the latest master once again.

leelavg · 2021-11-22T12:05:00Z

@little-guy-lxr can you please add commit #63 to the origin-topolvm branch, or can I raise the cherry-pick PR?

little-guy-lxr · 2021-11-23T02:13:47Z

OK, I will cherry-pick the commit to origin-topolvm.

GowthamShanmugam · 2021-11-23T13:07:47Z

With the latest branch, monitoring is not working: the ServiceMonitor is not created. Do I need to create it manually? Everything works fine when I use alaudapublic/topolvm-operator:2.2.0, but it does not work with a custom build of the main branch.

GowthamShanmugam · 2021-11-23T14:20:05Z

Is there any reason we stopped calling the EnableServiceMonitor and CreateOrUpdatePrometheusRule functions?

1893632#diff-9a6acdebbd30f8b93285ecd76b832d3e4cd34cb58f06a4dbe292f1e849a3f332L263

GowthamShanmugam · 2021-11-23T18:16:10Z

This PR fixes ServiceMonitor and alerting rule creation, but metrics are still not getting populated: #87

Metrics only come through if I create a namespace-level role and role binding.

little-guy-lxr · 2021-11-24T07:25:21Z

@GowthamShanmugam How did you deploy the topolvm operator? Did you use the YAML in https://github.com/alauda/topolvm-operator/tree/main/deploy/example ?

GowthamShanmugam · 2021-11-24T08:00:37Z

Yes, I used the YAMLs.

little-guy-lxr · 2021-11-24T08:40:41Z

@GowthamShanmugam Please paste the log of the topolvm operator. Is your platform Kubernetes or OpenShift?

GowthamShanmugam · 2021-11-24T09:37:00Z

OpenShift. I saw metrics getting populated while using alaudapublic/topolvm-operator:2.2.0 on OpenShift, but it is not working with the latest main branch. I will add logs.

GowthamShanmugam · 2021-11-29T21:22:36Z

I checked with the latest master and this issue is still there. I don't find any logs related to metrics:

2021-11-29 21:21:47.947364 D | status: node ip-10-0-142-55.ec2.internal, phase: Ready
2021-11-29 21:21:47.947390 D | status: no need to update cluster status
2021-11-29 21:21:47.947399 D | op-k8sutil: creating servicemonitor topolvm-service-monitor
W1129 21:21:47.947413       1 client_config.go:615] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2021-11-29 21:21:47.960589 D | op-k8sutil: creating prometheusRule topolvm-alert
W1129 21:21:47.960618       1 client_config.go:615] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.

GowthamShanmugam · 2021-11-29T21:27:50Z

Prometheus log:

ts=2021-11-29T21:09:55.282Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:447: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"topolvm-system\""

ts=2021-11-29T21:09:55.283Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:449: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"topolvm-system\""

ts=2021-11-29T21:09:55.283Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:448: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"topolvm-system\""
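
Errors like the three above can be summarized mechanically. The following is a minimal sketch, not part of the operator or of Prometheus, that extracts which resource, verb, and namespace each "forbidden" message refers to. The function name `missing_permissions` and the shortened sample lines are assumptions made for illustration; only the message wording is taken from the log above.

```python
import re

# Matches the "User ... cannot <verb> resource ... in the namespace ..." part of
# a Kubernetes "forbidden" error. Quotes may or may not be backslash-escaped,
# depending on how the log line was captured.
FORBIDDEN = re.compile(
    r'User \\?"(?P<user>[^"\\]+)\\?" cannot (?P<verb>\w+) resource '
    r'\\?"(?P<resource>[^"\\]+)\\?" in API group \\?"(?P<group>[^"\\]*)\\?" '
    r'in the namespace \\?"(?P<namespace>[^"\\]+)\\?"'
)


def missing_permissions(log_text):
    """Return sorted, de-duplicated (namespace, resource, verb) tuples."""
    found = set()
    for match in FORBIDDEN.finditer(log_text):
        found.add((match["namespace"], match["resource"], match["verb"]))
    return sorted(found)


# Shortened copies of the messages from the Prometheus log in this thread.
SAMPLE = r'''
msg="Failed to watch *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"topolvm-system\""
msg="Failed to watch *v1.Pod: pods is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"topolvm-system\""
msg="Failed to watch *v1.Service: services is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"topolvm-system\""
'''

if __name__ == "__main__":
    for namespace, resource, verb in missing_permissions(SAMPLE):
        print(f"{namespace}: cannot {verb} {resource}")
```

For this log, the summary points at `endpoints`, `pods`, and `services` in `topolvm-system`, which are the resources Prometheus service discovery was trying to list.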

little-guy-lxr · 2021-11-30T02:41:30Z

@GowthamShanmugam Check whether the ServiceMonitor was created or not.

little-guy-lxr · 2021-11-30T02:54:51Z

The topolvm operator creates the ServiceMonitor in its own namespace (in this case topolvm-system), but your Prometheus may have no permission to access this namespace. Maybe the OCP platform requires that the ServiceMonitor be created in the namespace that Prometheus owns. Please check.

GowthamShanmugam · 2021-11-30T07:46:49Z

You are right, OpenShift Prometheus needs permission to access the topolvm-system namespace. Once I created a role and role binding with all the required permissions, it started working fine.
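
For readers hitting the same errors, here is a minimal sketch of what such a role and role binding might look like. The thread does not show the exact manifests that were used, so the details are assumptions: the role name `prometheus-k8s`, the helper name `prometheus_rbac`, and the `get`/`list`/`watch` verbs on `services`, `endpoints`, and `pods` (the three resources named in the Prometheus log). The service account and namespaces come from the log messages. The script prints JSON, which can be saved to a file and applied with `oc apply -f` or `kubectl apply -f`.

```python
import json


def prometheus_rbac(namespace, sa_name, sa_namespace, role_name="prometheus-k8s"):
    """Build a Role and RoleBinding that let a Prometheus service account
    discover scrape targets in `namespace`.

    The role name and the verb list are assumptions for illustration.
    """
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # core API group, as in the log messages
                "resources": ["services", "endpoints", "pods"],
                "verbs": ["get", "list", "watch"],
            }
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": role_name, "namespace": namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": sa_name, "namespace": sa_namespace}
        ],
    }
    # A v1 List lets both objects be applied from a single file.
    return {"apiVersion": "v1", "kind": "List", "items": [role, binding]}


if __name__ == "__main__":
    manifest = prometheus_rbac(
        namespace="topolvm-system",
        sa_name="prometheus-k8s",
        sa_namespace="openshift-monitoring",
    )
    print(json.dumps(manifest, indent=2))
```

A namespaced Role is used here rather than a ClusterRole so that the grant stays limited to topolvm-system, which matches the "namespace-level role and role binding" described earlier in the thread.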

little-guy-lxr linked a pull request Nov 23, 2021 that will close this issue

add service metric && fix chart #82

Merged
