diff --git a/documentation/meta/monitoring/runbooks/nuxt_avg_response_time_above_threshold.md b/documentation/meta/monitoring/runbooks/nuxt_avg_response_time_above_threshold.md
index 2fdd8de929a..c326ca45469 100644
--- a/documentation/meta/monitoring/runbooks/nuxt_avg_response_time_above_threshold.md
+++ b/documentation/meta/monitoring/runbooks/nuxt_avg_response_time_above_threshold.md
@@ -20,12 +20,34 @@ previous version. Otherwise, check the following, in order:
 2. Check if dependencies like the API or Plausible analytics are constrained. If
    stable, move on.
 
+To gather more information, check the [log group][log_group] and use the "Logs
+Insights" view to query for requests that are taking longer than expected. A
+CloudWatch query similar to the following can give more hints about which
+routes are causing increased response times. Occasionally the `/api/event`
+endpoint will take longer to respond (due to upstream issues with Plausible),
+and these cases will increase our average response time while not actually
+affecting frontend performance for users. The following query shows the top 10
+routes where the request took longer than 0.5 seconds, ranked by the number of
+such requests made to each route.
+
+```
+fields request, request_time, @timestamp, @message
+| filter request_time > 0.5
+| stats count(*) as request_count by request
+| sort request_count desc
+| limit 10
+```
+
 [traffic_runbook]:
   /meta/monitoring/traffic/runbooks/identifying-and-blocking-traffic-anomalies.md
+[log_group]:
+  https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:log-groups/log-group/$252Fecs$252Fproduction$252Ffrontend
 
 ## Historical false positives
 
-Nothing registered to date.
+- _2024-04-10, 22:00 UTC_: Requests to the `/api/event` endpoint were taking
+  longer than expected and impacted average response time, but not normal
+  frontend traffic response time.
 
 ## Related incident reports
 
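
Review note: to make the aggregation in the added Logs Insights query easier to reason about, here is a minimal local sketch of the same filter, count, sort, and limit steps in Python. It is not part of the runbook and does not talk to CloudWatch. The sample records, the shape of the `request` values, and the `slowest_routes` helper are all hypothetical and exist only for illustration; real log entries will have more fields and may format `request` differently.

```python
from collections import Counter

# Hypothetical sample records (assumption: each log entry exposes a `request`
# string and a `request_time` in seconds, as the query in the diff implies).
SAMPLE_REQUESTS = [
    {"request": "POST /api/event HTTP/1.1", "request_time": 1.8},
    {"request": "POST /api/event HTTP/1.1", "request_time": 2.3},
    {"request": "GET /search?q=cat HTTP/1.1", "request_time": 0.7},
    {"request": "GET / HTTP/1.1", "request_time": 0.1},
    {"request": "POST /api/event HTTP/1.1", "request_time": 0.2},
]


def slowest_routes(records, threshold=0.5, limit=10):
    """Mirror the query: keep requests slower than `threshold` seconds,
    count them per `request` value, and return the `limit` most frequent."""
    counts = Counter(
        record["request"]
        for record in records
        if record["request_time"] > threshold  # | filter request_time > 0.5
    )
    # most_common sorts by count, descending, and truncates to `limit`,
    # matching `| sort request_count desc | limit 10`.
    return counts.most_common(limit)


if __name__ == "__main__":
    for request, request_count in slowest_routes(SAMPLE_REQUESTS):
        print(request_count, request)
```

With this sample data, `/api/event` leads the list even though the page routes are mostly fast, which is the pattern the runbook describes as a false positive for frontend performance.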