fix(outputs.azure_monitor): Prevent infinite send loop for outdated metrics #16448
Summary
The Azure Monitor service limits the time range of the metrics it accepts. Metrics older than 30 minutes or more than 4 minutes in the future (20 minutes in the past and 5 minutes in the future according to the documentation) are rejected, and the service responds with a `400` HTTP error code. The current code propagates this error up into the model, and the model code then assumes the error is retryable and reschedules the batch. Metrics are usually filtered according to those limits, but the error can still be triggered, for example by latency during transmission.

This PR handles the `400` HTTP error code as a non-retryable error and fixes other non-retryable error paths along the way. It also introduces two new configuration settings to better control the accepted metric timeframe, allowing a more robust limit for filtering the metrics.

Please note: The default values of the parameters (30 minutes into the past until 1 minute into the past) represent the previous behavior. The upper limit is in the past (rather than 4 minutes into the future) so that metrics can be aggregated before the aggregates are sent.
Checklist
Related issues
resolves #15908