
Metrics for GitRepo status #247

Closed
pcallewaert opened this issue Jan 28, 2021 · 9 comments

Comments

@pcallewaert

I'm setting up a PoC internally with Fleet and k3s to deploy Helm charts to edge devices. It works pretty well, but unfortunately I'm currently missing feedback on whether the sync is successful. For example, one of the devices had a firewall issue and could not connect to GitHub.
I was wondering how this is usually monitored? I was hoping to find a Prometheus endpoint that exposes the status of a GitRepo config.
Also, I noticed while browsing through some issues that you can visualize some of this information with Rancher, but I haven't added that to our stack yet.

Thanks!

@ADustyOldMuffin

I would also like this as a feature, and would be interested in taking this on as we currently need it.

@atsai1220

Having Prometheus metrics would be incredibly useful when managing multiple downstream clusters. I would like Alertmanager to send Slack alerts when a specific bundle has been in a Modified state for over X minutes, including the specific cluster name.

It's true that downstream clusters with Prometheus and Alertmanager should be able to alert on unhealthy pods, but there is no good way to determine whether a bundle is missing or has modified resources unless that is determined upstream (from the Rancher console cluster).
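To illustrate the kind of alerting asked for above, here is a rough sketch of a Prometheus Operator `PrometheusRule` for the "bundle Modified for too long" case. Note this is purely hypothetical: at the time of this thread Fleet did not expose such metrics, and the metric name `fleet_bundle_modified` and its labels are assumptions, not a real Fleet API.

```shell
# Hypothetical sketch only: Fleet did not expose bundle metrics at the time,
# so the metric name "fleet_bundle_modified" and its labels are invented here
# to show the shape of the desired alert. Requires the Prometheus Operator.
kubectl apply -f - <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: fleet-bundle-alerts          # placeholder name
  namespace: cattle-fleet-system
spec:
  groups:
    - name: fleet
      rules:
        - alert: FleetBundleModified
          # fire when a bundle reports modified resources for more than 10m
          expr: fleet_bundle_modified > 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Bundle {{ $labels.name }} modified on cluster {{ $labels.cluster }}"
EOF
```

Routing the resulting alert to Slack would then be ordinary Alertmanager receiver configuration.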

@strowi

strowi commented Dec 28, 2021

Same here. Previously we've always used some CI/CD solution to actively deploy to Kubernetes; an error in, e.g., a GitLab pipeline will notify the user.
This doesn't seem to be the case with Fleet: an error in a deployment YAML would not surface anywhere.
Maybe it would also be a good idea to have Fleet notify directly via mail or webhook, since Prometheus will always have some delay?

@ADustyOldMuffin

@strowi you can use kubectl wait to see if it was successful, which is what we use.

We wait for X amount of time, then use kubectl wait to check for the Ready status with a timeout.
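The approach described above might look roughly like this, assuming a GitRepo named `my-repo` in the `fleet-default` namespace (both names are placeholders), and that the GitRepo resource reports a `Ready` condition:

```shell
# Sketch of "wait, then check Ready with a timeout" against a cluster.
# Give Fleet a moment to notice the new commit before polling.
sleep 30

# Block until the GitRepo's Ready condition is true, or fail after 5 minutes.
kubectl -n fleet-default wait gitrepo/my-repo \
  --for=condition=Ready --timeout=300s
```

A nonzero exit code from `kubectl wait` can then fail the surrounding CI job, which restores the "pipeline notifies the user" behavior mentioned earlier in the thread.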

@strowi

strowi commented Dec 28, 2021

@ADustyOldMuffin Thank you for the suggestion, I will check that too. To be sure, you're talking about the fleet CLI, right?
I came here initially from using the continuous-delivery part of the Rancher UI, which just connects a repo with a cluster and monitors it passively (much like Argo CD, I guess).

Ah, just found #317, which sounds a lot like what I would like. ;)

@ADustyOldMuffin

ADustyOldMuffin commented Dec 28, 2021

Nope. If you're using GitOps, you could have your pipeline trigger on commit and then wait for the GitRepo to be in a Ready state, or for the bundles if you know their names.

That being said, I am also working on this myself, as we would like to use Alertmanager together with Prometheus for this.
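For the bundle-level variant mentioned above, a sketch along these lines could work, assuming a Bundle named `my-bundle` in `fleet-default` (a placeholder) and that Bundles expose a `Ready` condition and a `status.display.state` field:

```shell
# Inspect the bundle's summarized state (e.g. Ready, Modified, ErrApplied).
kubectl -n fleet-default get bundle my-bundle \
  -o jsonpath='{.status.display.state}{"\n"}'

# Or block in the pipeline until the bundle reports Ready, failing after 5m.
kubectl -n fleet-default wait bundle/my-bundle \
  --for=condition=Ready --timeout=300s
```

Field names on the Bundle status are an assumption here; check the CRD schema of your Fleet version with `kubectl explain bundle.status` before relying on them.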

@strowi

strowi commented Dec 28, 2021

Right, that might be possible. I need to test and check that out a little more in the context of multi-cluster setups and Kubernetes contexts.
I don't want to expose the credentials to a CI/CD system outside of Rancher (as opposed to, possibly, Rancher pipelines).

@marthydavid

Ping, we would also like to get metrics from Fleet.

@manno
Member

manno commented Mar 18, 2024

Closing in favor of #1408
