Describe the bug
I am running Alertmanager with sharding enabled across 3 pods, and I'm experiencing long delays (~15min) between creating a silence and that silence being reflected in the alert UI/API. The same delay occurs when expiring a silence.
For example,
I create a silence from the UI/API
The affected alerts still appear in the UI and API response for quite a long time.
Eventually the affected alerts are removed from the list of active alerts.
Whereas I expected the changes to be reflected almost instantly.
To Reproduce
Steps to reproduce the behavior:
Start Cortex v1.17.1 (with the config defined in 'Additional Context').
Navigate to the alertmanager UI
Send a test alert (can be anything)
Create a silence, matching one of the labels in the test alert.
Navigate back to the Alerts page
Confirm that the alert is still showing, even after creating the silence.
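The steps above can also be driven through the API to put a number on the delay. The following is a minimal sketch, not a verified reproduction script: it assumes the Alertmanager v2 API is reachable under a base URL such as `http://localhost:8080/alertmanager` and that a tenant ID is sent via the `X-Scope-OrgID` header. `BASE_URL`, `TENANT`, the label values, and all helper names are hypothetical and need adjusting to the deployment.

```python
import json
import time
import urllib.request
from datetime import datetime, timedelta, timezone

# Assumptions (not taken from the report): base URL, tenant ID, label values.
BASE_URL = "http://localhost:8080/alertmanager"
TENANT = "tenant-1"


def build_alert(labels):
    """Payload for POST /api/v2/alerts (a list containing one alert)."""
    return [{"labels": labels, "annotations": {"summary": "silence latency test"}}]


def build_silence(name, value, minutes=30, now=None):
    """Payload for POST /api/v2/silences with a single equality matcher."""
    now = now or datetime.now(timezone.utc)
    return {
        "matchers": [{"name": name, "value": value, "isRegex": False}],
        "startsAt": now.isoformat(),
        "endsAt": (now + timedelta(minutes=minutes)).isoformat(),
        "createdBy": "latency-test",
        "comment": "measuring silence propagation delay",
    }


def call(method, path, body=None):
    """Send a JSON request to the API and decode the response, if any."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json", "X-Scope-OrgID": TENANT},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        raw = resp.read()
        return json.loads(raw) if raw else None


def is_silenced(alerts, labels):
    """True if at least one alert carries `labels` and all such alerts are suppressed by a silence."""
    matched = [a for a in alerts if labels.items() <= a["labels"].items()]
    return bool(matched) and all(
        a["status"]["state"] == "suppressed" and a["status"]["silencedBy"]
        for a in matched
    )


def wait_until_silenced(fetch_alerts, labels, timeout_s=1200, interval_s=5,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the alert shows as silenced; return elapsed seconds, or None on timeout."""
    start = clock()
    while clock() - start < timeout_s:
        if is_silenced(fetch_alerts(), labels):
            return clock() - start
        sleep(interval_s)
    return None


def main():
    labels = {"alertname": "SilenceLatencyTest", "severity": "test"}
    call("POST", "/api/v2/alerts", build_alert(labels))
    call("POST", "/api/v2/silences", build_silence("alertname", labels["alertname"]))
    delay = wait_until_silenced(lambda: call("GET", "/api/v2/alerts"), labels)
    print("not silenced within timeout" if delay is None else f"silenced after {delay:.0f}s")
```

Calling `main()` against a running cluster sends the test alert, creates the silence, and prints how long it took for the alert to be reported as suppressed. The 20-minute default timeout is chosen only so that it exceeds the ~15min delay described in this report.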
Expected behavior
When creating a silence, I expect the matched alerts to be silenced almost immediately. Instead, it takes a few minutes for the alerts to change from active to silenced.
I've attached a video of the test scenario.
https://github.com/user-attachments/assets/cba15fa8-a2fd-4ed5-ad71-01207a035727
Environment:
Additional Context
Config Used:
Deployed as a StatefulSet on Kubernetes, running 3+ replicas.
While looking at the logs, I noticed that the silences only take effect on the alerts after the silences maintenance has run on all replicas. Looking at the code, the maintenance period is hardcoded to 15min.
I have also tried changing various config options such as poll_interval, push_pull_interval, persist_interval, grpc_compression, and gc_interval, without much luck. I have tried Consul as the kvstore as well. It seemed to make no difference.