IngesterInstanceHasNoTenants alert: exclude read only replicas (#9843)
* IngesterInstanceHasNoTenants alert: exclude read only replicas

Read-only ingesters are expected to sometimes have no tenants, so there's no need to alert on them.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* make build-helm-tests

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

---------

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
colega authored Nov 6, 2024
1 parent 009e2f1 commit 6bf0b93
Showing 5 changed files with 21 additions and 4 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -77,6 +77,7 @@
* [BUGFIX] Dashboards: Fix autoscaling metrics joins when series churn. #9412 #9450 #9432
* [BUGFIX] Alerts: Fix autoscaling metrics joins in `MimirAutoscalerNotActive` when series churn. #9412
* [BUGFIX] Alerts: Exclude failed cache "add" operations from alerting since failures are expected in normal operation. #9658
+* [BUGFIX] Alerts: Exclude read-only replicas from `IngesterInstanceHasNoTenants` alert. #9843

### Jsonnet

@@ -176,7 +176,11 @@ spec:
message: Mimir ingester {{ $labels.pod }} in {{ $labels.cluster }}/{{ $labels.namespace }} has no tenants assigned.
runbook_url: https://grafana.com/docs/mimir/latest/operators-guide/mimir-runbooks/#mimiringesterinstancehasnotenants
expr: |
-(min by(cluster, namespace, pod) (cortex_ingester_memory_users) == 0)
+(
+  (min by(cluster, namespace, pod) (cortex_ingester_memory_users) == 0)
+  unless
+  (max by(cluster, namespace, pod) (cortex_lifecycler_read_only) > 0)
+)
and on (cluster, namespace)
# Only if there are more timeseries than would be expected due to continuous testing load
(
6 changes: 5 additions & 1 deletion operations/mimir-mixin-compiled-baremetal/alerts.yaml
@@ -164,7 +164,11 @@ groups:
message: Mimir ingester {{ $labels.instance }} in {{ $labels.cluster }}/{{ $labels.namespace }} has no tenants assigned.
runbook_url: https://grafana.com/docs/mimir/latest/operators-guide/mimir-runbooks/#mimiringesterinstancehasnotenants
expr: |
-(min by(cluster, namespace, instance) (cortex_ingester_memory_users) == 0)
+(
+  (min by(cluster, namespace, instance) (cortex_ingester_memory_users) == 0)
+  unless
+  (max by(cluster, namespace, instance) (cortex_lifecycler_read_only) > 0)
+)
and on (cluster, namespace)
# Only if there are more timeseries than would be expected due to continuous testing load
(
6 changes: 5 additions & 1 deletion operations/mimir-mixin-compiled/alerts.yaml
@@ -164,7 +164,11 @@ groups:
message: Mimir ingester {{ $labels.pod }} in {{ $labels.cluster }}/{{ $labels.namespace }} has no tenants assigned.
runbook_url: https://grafana.com/docs/mimir/latest/operators-guide/mimir-runbooks/#mimiringesterinstancehasnotenants
expr: |
-(min by(cluster, namespace, pod) (cortex_ingester_memory_users) == 0)
+(
+  (min by(cluster, namespace, pod) (cortex_ingester_memory_users) == 0)
+  unless
+  (max by(cluster, namespace, pod) (cortex_lifecycler_read_only) > 0)
+)
and on (cluster, namespace)
# Only if there are more timeseries than would be expected due to continuous testing load
(
6 changes: 5 additions & 1 deletion operations/mimir-mixin/alerts/alerts.libsonnet
@@ -296,7 +296,11 @@ local utils = import 'mixin-utils/utils.libsonnet';
alert: $.alertName('IngesterInstanceHasNoTenants'),
'for': '1h',
expr: |||
-(min by(%(alert_aggregation_labels)s, %(per_instance_label)s) (cortex_ingester_memory_users) == 0)
+(
+  (min by(%(alert_aggregation_labels)s, %(per_instance_label)s) (cortex_ingester_memory_users) == 0)
+  unless
+  (max by(%(alert_aggregation_labels)s, %(per_instance_label)s) (cortex_lifecycler_read_only) > 0)
+)
and on (%(alert_aggregation_labels)s)
# Only if there are more timeseries than would be expected due to continuous testing load
(

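For readers less familiar with PromQL's `unless` operator, the new left-hand side of the alert expression can be sketched in Python as a set difference over label sets. This is only an illustration, not part of the commit. The function name `no_tenant_candidates`, the `Labels` alias, and the sample ingester names are hypothetical. It assumes both inputs are already aggregated to one value per `(cluster, namespace, pod)` label set, and it leaves out the trailing `and on (cluster, namespace)` clause of the real rule.

```python
from typing import Dict, Set, Tuple

# Hypothetical alias for one aggregated label set: (cluster, namespace, pod).
Labels = Tuple[str, str, str]


def no_tenant_candidates(
    memory_users: Dict[Labels, float],
    read_only: Dict[Labels, float],
) -> Set[Labels]:
    """Model `(A == 0) unless (B > 0)` for pre-aggregated series."""
    # Left operand: (min by(...) (cortex_ingester_memory_users) == 0)
    without_tenants = {labels for labels, value in memory_users.items() if value == 0}
    # Right operand: (max by(...) (cortex_lifecycler_read_only) > 0)
    in_read_only = {labels for labels, value in read_only.items() if value > 0}
    # `unless` drops every left-hand element whose label set also
    # appears on the right-hand side.
    return without_tenants - in_read_only


# Made-up sample data: three ingesters in one cluster/namespace.
memory_users = {
    ("c1", "mimir", "ingester-0"): 0,  # no tenants, read-only
    ("c1", "mimir", "ingester-1"): 0,  # no tenants, not read-only
    ("c1", "mimir", "ingester-2"): 5,  # has tenants
}
read_only = {
    ("c1", "mimir", "ingester-0"): 1,
    ("c1", "mimir", "ingester-1"): 0,
}

candidates = no_tenant_candidates(memory_users, read_only)
print(candidates)  # {('c1', 'mimir', 'ingester-1')}
```

In this model, an ingester that reports no `cortex_lifecycler_read_only` series at all stays a candidate, because `unless` only removes label sets present on the right-hand side.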