Have a managed OpenSearch domain that sporadically doesn't report the FreeStorageSpace metric. According to AWS Support, this can occur because of an update to the internal agent responsible for publishing the CloudWatch metrics for the domain.
Notice that the alarm triggers because of the missing data point, even though the domain's disk usage is below the threshold. This is because the metric math used for this alarm, 100 * (used/(used+free)), becomes 100 * (used/(used+0)) = 100 * (used/used) = 100 when the free-space data point is missing.
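To make the arithmetic concrete, here is a minimal Python sketch of the expression above. It is not the library's implementation: the function names and the sample numbers are hypothetical, and treating a missing FreeStorageSpace data point as 0 is simply the behaviour described in this issue.

```python
from typing import Optional


def disk_usage_percent(used: float, free: Optional[float]) -> float:
    """Mirror the alarm expression 100 * (used / (used + free)).

    A missing FreeStorageSpace data point (None) is treated as 0 here,
    which is the behaviour described in this issue.
    """
    free_value = 0.0 if free is None else free
    return 100 * (used / (used + free_value))


def disk_usage_percent_or_none(used: float, free: Optional[float]) -> Optional[float]:
    """Hypothetical alternative: report no value when free space is unknown."""
    if free is None:
        return None
    return 100 * (used / (used + free))


# Illustrative numbers only, not taken from the affected domain.
normal = disk_usage_percent(used=40_000, free=60_000)  # healthy: well below a typical threshold
gap = disk_usage_percent(used=40_000, free=None)       # missing free space: used / used
print(normal, gap)
```

With the free-space value missing, the result is 100 regardless of how full the disk actually is, which is why a single gap can breach any threshold below 100. The second function shows one possible alternative, where the gap is propagated as "no data" instead.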
Expected behavior
The alarm should not trigger, as the domain is not actually exceeding the disk usage threshold.
Actual behavior
The alarm triggers.
Other details
According to AWS Support, the FreeStorageSpace metric can sporadically miss data points because of an update to the internal agent responsible for publishing the CloudWatch metrics for the domain.
AWS Support's response: "I examined the domain and observed that the data point was missing because of an update to the internal agent responsible for publishing the CloudWatch metrics for the domain, as a result of which the FreeStorageSpace metric was not reported for 18:51 and 18:52 UTC. Nothing can be done from your end to avoid this missing data point. However, if in future you observe the metric missing data points for an extended duration, please let us know and we can look into this further."
As a workaround, the number of data points to alarm on can be increased: in the two occurrences we've seen, only one or two data points were missing at a time.
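The effect of this workaround can be sketched with a simplified model of an "M out of N" alarm. This is an illustration only, not CloudWatch's actual evaluation logic; the function name, the threshold of 80, and the usage series are assumptions made for the example.

```python
from typing import List, Sequence


def alarm_states(values: Sequence[float], threshold: float,
                 evaluation_periods: int, datapoints_to_alarm: int) -> List[bool]:
    """Return, per data point, whether a simplified M-out-of-N alarm is in ALARM.

    The alarm is considered in ALARM when at least `datapoints_to_alarm`
    of the last `evaluation_periods` values are above the threshold.
    """
    states = []
    for i in range(len(values)):
        window = values[max(0, i - evaluation_periods + 1): i + 1]
        breaching = sum(1 for v in window if v > threshold)
        states.append(breaching >= datapoints_to_alarm)
    return states


# Illustrative usage series (percent): two spurious 100s caused by missing
# FreeStorageSpace data points in an otherwise healthy domain.
usage = [40, 40, 100, 100, 40, 40, 40]

fires_on_one = any(alarm_states(usage, threshold=80,
                                evaluation_periods=1, datapoints_to_alarm=1))
fires_on_three = any(alarm_states(usage, threshold=80,
                                  evaluation_periods=3, datapoints_to_alarm=3))
print(fires_on_one, fires_on_three)
```

In this model, alarming on a single data point fires on the gap, while requiring three breaching data points out of three does not. The trade-off is that a genuine breach is reported a few periods later.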
It may also be worth considering adding alarms for continued missing metrics.
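One way such an alarm could be set up is sketched below. The helper only builds keyword arguments for CloudWatch's `put_metric_alarm` API and does not call AWS. The helper name, the alarm name, and the domain and client ID values are hypothetical; using the `SampleCount` statistic with `TreatMissingData` set to `breaching` is one possible approach, not something confirmed in this issue.

```python
from typing import Any, Dict


def missing_metric_alarm_params(domain_name: str, client_id: str,
                                tolerated_missing_minutes: int) -> Dict[str, Any]:
    """Build put_metric_alarm keyword arguments for a 'metric stopped reporting' alarm.

    The alarm only fires once more consecutive one-minute periods are
    missing than `tolerated_missing_minutes`, so short gaps are ignored.
    """
    periods = tolerated_missing_minutes + 1
    return {
        "AlarmName": f"{domain_name}-FreeStorageSpace-missing",  # hypothetical name
        "Namespace": "AWS/ES",
        "MetricName": "FreeStorageSpace",
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": client_id},
        ],
        "Statistic": "SampleCount",
        "Period": 60,
        "EvaluationPeriods": periods,
        "DatapointsToAlarm": periods,
        "Threshold": 1,
        "ComparisonOperator": "LessThanThreshold",
        # A period with no data counts as breaching, i.e. as "missing".
        "TreatMissingData": "breaching",
    }


# Placeholder identifiers; tolerate the one or two missing minutes seen so far.
params = missing_metric_alarm_params("my-domain", "123456789012",
                                     tolerated_missing_minutes=2)
print(params["EvaluationPeriods"], params["TreatMissingData"])
```

The resulting dictionary could then be passed to `boto3.client("cloudwatch").put_metric_alarm(**params)`. Tolerating two missing minutes matches the gaps observed so far, so only a longer outage of the metric would raise this alarm.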
echeung-amzn changed the title from "Disk usage alarm can trigger for managed OpenSearch domain due to missing data points" to "[opensearch] Disk usage alarm can trigger for managed OpenSearch domain due to missing data points" on Dec 6, 2024.
Version
8.3.3