Commit

Changed unit_unavailable interval for prometheus (#100)
As stated in issue canonical/bundle-kubeflow#564,
the duration for alerts for argo is set to 0m, which is too low for production
environments. We need to change it to at least 5m to prevent flapping.

Partial-Bug: canonical/bundle-kubeflow#564

Co-authored-by: Diko Parvanov <diko.parvanov@canonical.com>
dparv and dparv authored Mar 29, 2023
1 parent 2149ca3 commit 72fd827
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/prometheus_alert_rules/unit_unavailable.rule
@@ -1,6 +1,6 @@
 alert: TrainingOperatorUnitIsUnavailable
 expr: up < 1
-for: 0m
+for: 5m
 labels:
   severity: critical
 annotations:
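
To illustrate why raising `for` from 0m to 5m reduces flapping, here is a minimal Python sketch of how a `for` clause gates an alert. It is a simplified model and not Prometheus's implementation: the function name `firing_states`, the sample series, and the 1-minute evaluation interval are all assumptions made for the example.

```python
from datetime import timedelta


def firing_states(samples, for_duration, interval):
    """Return, for each evaluation, whether the alert would be firing.

    samples:      values of `up`, one per rule evaluation (hypothetical data)
    for_duration: how long `up < 1` must hold continuously before firing
    interval:     time between evaluations (assumed constant)
    """
    states = []
    active_for = None  # time the condition has held so far; None when inactive
    for up in samples:
        if up < 1:
            # Condition is true: start or extend the continuous active period.
            active_for = timedelta(0) if active_for is None else active_for + interval
            states.append(active_for >= for_duration)
        else:
            # Condition is false: the pending period resets.
            active_for = None
            states.append(False)
    return states


# A unit that blips once, then stays down for several minutes (assumed data).
samples = [1, 0, 1, 0, 0, 0, 0, 0, 0, 1]
one_minute = timedelta(minutes=1)

with_0m = firing_states(samples, timedelta(minutes=0), one_minute)
with_5m = firing_states(samples, timedelta(minutes=5), one_minute)
```

In this model, `for: 0m` fires on the single-sample blip as well as the longer outage, while `for: 5m` ignores the blip and fires only after the condition has held for five consecutive minutes.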