Changed unit_unavailable interval for prometheus
As stated in issue canonical/bundle-kubeflow#564, the `for` duration
of the argo alerts is set to 0m, which is too low for production
environments. Change it to at least 5m to prevent flapping.

Partial-Bug: canonical/bundle-kubeflow#564
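
To illustrate why a `for` of 0m tends to flap, here is a minimal sketch of the pending/firing logic, not Prometheus's actual implementation. It assumes one rule evaluation per minute and that the alert fires only once `up < 1` has held continuously for at least the `for` duration. The function name `count_firings` and the sample series are hypothetical.

```python
def count_firings(up_samples, for_seconds, eval_interval=60):
    """Count transitions into the firing state for a simplified alert rule.

    up_samples: value of `up` at each rule evaluation (assumed one per minute).
    for_seconds: how long `up < 1` must hold before the alert fires.
    """
    firings = 0
    active_since = None  # time at which the condition most recently became true
    firing = False
    for i, up in enumerate(up_samples):
        now = i * eval_interval
        if up < 1:
            if active_since is None:
                active_since = now
            if not firing and now - active_since >= for_seconds:
                firing = True
                firings += 1
        else:
            # Condition cleared: the alert resolves and the timer resets.
            active_since = None
            firing = False
    return firings


# Hypothetical series: two short blips, then a sustained 7-minute outage.
samples = [1, 0, 1, 1, 0, 0, 1] + [0] * 7 + [1]

print(count_firings(samples, for_seconds=0))    # for: 0m
print(count_firings(samples, for_seconds=300))  # for: 5m
```

Under these assumptions, the short blips each trigger the alert with `for: 0m`, while `for: 5m` fires only for the sustained outage.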
dparv committed Mar 28, 2023
1 parent 2149ca3 commit 37822ab
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/prometheus_alert_rules/unit_unavailable.rule
@@ -1,6 +1,6 @@
 alert: TrainingOperatorUnitIsUnavailable
 expr: up < 1
-for: 0m
+for: 5m
 labels:
   severity: critical
 annotations: