Add Mimir Alertmanager alerts #1472
Conversation
This PR is missing a changelog entry as well as tests for the alerts. On top of that, could you add a comment explaining that those alerts come from upstream, like so? Does the dashboard linked in the alert exist in our Grafanas? Stopping here, but for the Mimir alerts we put them all into the Mimir alerts file using another rule group: prometheus-rules/helm/prometheus-rules/templates/platform/atlas/alerting-rules/mimir.rules.yml, line 209 in 61773b2.
I would advocate that we do the same :)
Added a changelog entry; will add unit tests.
Added a comment with a link to upstream.
The dashboard uid is valid and points to our Mimir Alertmanager dashboard.
I thought having those alerts in another file would separate things and keep mimir.rules from getting too long, but I moved them in there.
if ! isInArray "$opsrecipe" "${opsRecipes[@]}"; then
# or is a valid URL starting with http
if ! isInArray "$opsrecipe" "${opsRecipes[@]}" && [[ "$opsrecipe" != http* ]]; then
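The changed check above can be sketched as a small self-contained script. `isInArray` is assumed to be the repo's helper that tests whether its first argument appears among the remaining arguments; the `checkOpsRecipe` wrapper and the sample `opsRecipes` values are hypothetical, added here only to show the combined condition.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the repo's isInArray helper:
# returns 0 if $1 equals any of the remaining arguments.
isInArray() {
  local needle="$1"; shift
  local item
  for item in "$@"; do
    [[ "$item" == "$needle" ]] && return 0
  done
  return 1
}

# Illustrative list of known ops-recipe names (not the real list).
opsRecipes=("mimir" "alertmanager")

checkOpsRecipe() {
  local opsrecipe="$1"
  # Accept either a known recipe name, or a URL starting with http,
  # mirroring the condition in the diff above.
  if ! isInArray "$opsrecipe" "${opsRecipes[@]}" && [[ "$opsrecipe" != http* ]]; then
    echo "invalid"
  else
    echo "valid"
  fi
}

checkOpsRecipe "mimir"                        # prints "valid" (known recipe)
checkOpsRecipe "https://example.com/recipe"   # prints "valid" (http* URL)
checkOpsRecipe "unknown-recipe"               # prints "invalid"
```

This keeps the existing name-based validation while letting upstream alerts point at external runbook URLs.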
What do we think about this?
Why not.
I'm fine with splitting the alerts into multiple files, but then I would rather have them in a folder, and do it for all alerts :)
Towards: giantswarm/roadmap#3752
Adding some Mimir Alertmanager alerts to detect failures.
This is a bit of a random pick, but we need something to start with so we can evaluate whether things are running smoothly.
These alerts are taken from upstream, so they use the upstream ops recipes until we have better ones.
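For illustration, an upstream-derived alert placed in its own rule group (as discussed in the review) could look roughly like the sketch below. The group name, alert name, expression, threshold, and URLs are all illustrative placeholders, not the actual PR content.

```yaml
# Hypothetical sketch of a separate rule group for upstream Alertmanager alerts.
groups:
  - name: mimir-alertmanager
    rules:
      - alert: MimirAlertmanagerExampleFailure
        # Comment linking back to the upstream source, as requested in review:
        # taken from the upstream Mimir mixin (see upstream repo for the original).
        expr: |
          rate(some_alertmanager_failure_metric_total[5m]) > 0
        for: 30m
        labels:
          severity: page
        annotations:
          description: 'Mimir Alertmanager {{ $labels.pod }} is reporting failures.'
          opsrecipe: https://example.com/upstream-runbook
```

Keeping such groups separate makes it easy to see which rules are vendored from upstream versus authored in-house.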