From 37822ab8813e91c2e029278961d258f088cd72e2 Mon Sep 17 00:00:00 2001
From: Diko Parvanov <diko.parvanov@canonical.com>
Date: Tue, 28 Mar 2023 16:13:51 +0300
Subject: [PATCH] Change unit_unavailable alert `for` duration for Prometheus

As stated in issue https://github.com/canonical/bundle-kubeflow/issues/564,
the `for` duration of the alerts for argo is set to 0m, which is too low for
production environments. Change it to at least 5m to prevent the alert from
flapping.

Partial-Bug: https://github.com/canonical/bundle-kubeflow/issues/564
---
 src/prometheus_alert_rules/unit_unavailable.rule | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
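
Reviewer note (not part of the commit): the sketch below is a simplified
model of how a `for` duration holds an alert in the pending state until the
expression has been true continuously for that long. It is not Prometheus
code. The function name, the 60s evaluation interval, and the sample series
are all hypothetical, chosen only to show why 0m flaps on a short blip while
5m does not.

```python
def firing_states(samples, eval_interval_s, for_s):
    """Return one boolean per evaluation: True where the alert would fire.

    Simplified model: the alert fires once `up < 1` has held continuously
    for at least `for_s` seconds; any healthy sample resets the timer.
    """
    states = []
    active_since = None  # time at which the expression first became true
    for i, up in enumerate(samples):
        now = i * eval_interval_s
        if up < 1:
            if active_since is None:
                active_since = now
            states.append(now - active_since >= for_s)
        else:
            active_since = None
            states.append(False)
    return states


# Hypothetical `up` series, one sample per 60s evaluation:
# a one-sample blip, then an outage lasting six samples.
samples = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1]

print(firing_states(samples, 60, 0))    # for: 0m
print(firing_states(samples, 60, 300))  # for: 5m
```

In this model, `for: 0m` fires on the one-sample blip as well as the outage,
while `for: 5m` fires only after the outage has lasted five minutes. The
trade-off is that a real outage is also reported five minutes later.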

diff --git a/src/prometheus_alert_rules/unit_unavailable.rule b/src/prometheus_alert_rules/unit_unavailable.rule
index 06b7464..93a89e8 100644
--- a/src/prometheus_alert_rules/unit_unavailable.rule
+++ b/src/prometheus_alert_rules/unit_unavailable.rule
@@ -1,6 +1,6 @@
 alert: TrainingOperatorUnitIsUnavailable
 expr: up < 1
-for: 0m
+for: 5m
 labels:
   severity: critical
 annotations: