Merge pull request #343 from DFE-Digital/api-metric-alerts
Additional API metric alerts
ethax-ross authored Nov 2, 2020
2 parents 69f6c87 + f60b49e commit 067e660
Showing 1 changed file with 44 additions and 0 deletions.
44 changes: 44 additions & 0 deletions monitoring/prometheus/alert.rules
@@ -37,3 +37,47 @@ groups:
    severity: high
  annotations:
    summary: Alert when any client hits a rate limit.
- alert: FailedJobs
  expr: 'sum(increase(api_hangfire_jobs{state="failed"}[1m])) > 0'
  labels:
    severity: high
  annotations:
    summary: Alert when any job fails.
- alert: HighGoogleApiCalls
  expr: 'sum(increase(api_google_api_calls[10m])) > 5'
  labels:
    severity: high
  annotations:
    summary: Alert when a high number of Google API calls are made (>5 in a 10m period).
- alert: GoogleApiErrors
  expr: 'sum(rate(api_google_api_calls{result != "success"}[1m])) > 0'
  labels:
    severity: medium
  annotations:
    summary: Alert when the Google API returns a non-success response.
- alert: ClientApproachingRateLimit
  expr: 'sum(increase(http_request_duration_seconds_sum{controller=~"Candidates|MailingList|TeachingEvents",action=~"CreateAccessToken|AddMember|AddAttendee|SignUp",code=~".+"}[1m])) by (controller, action) > 15'
  labels:
    severity: medium
  annotations:
    summary: Alert when a client is approaching the rate limit threshold (15rpm out of an available 30rpm).
- alert: HighCpu
  expr: 'max(cpu_percent) > 70'
  labels:
    severity: medium
  annotations:
    summary: Alert when max CPU utilization is over 70%.
- alert: HighMemory
  expr: 'dotnet_total_memory_bytes > 256000000'
  labels:
    severity: medium
  annotations:
    summary: Alert when max memory utilization is over 256MB.
- alert: HighDatabaseConnections
  expr: 'max(connections) > 75'
  labels:
    severity: medium
  annotations:
    summary: Alert when max database connections exceeds 75 (out of an available 100).

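One possible way to exercise a rule added here is a promtool unit test. The sketch below is not part of the commit. It assumes the rules file is reachable as `alert.rules` next to a hypothetical test file (say `alert.rules.test.yml`), that `HighDatabaseConnections` has no `for:` clause (none appears in this diff), and it uses a made-up `instance` label and made-up sample values.

```yaml
# Hypothetical test file; run with: promtool test rules alert.rules.test.yml
rule_files:
  - alert.rules            # assumed path, relative to this test file

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Made-up series: samples at 0m, 1m, 2m, 3m.
      - series: 'connections{instance="db-0"}'
        values: '50 60 80 80'
    alert_rule_test:
      # At 1m the value is 60, below the threshold of 75: no alert expected.
      - eval_time: 1m
        alertname: HighDatabaseConnections
        exp_alerts: []
      # At 3m the value is 80, above the threshold: the alert should fire.
      # max() drops the series labels, so only the rule's own label remains.
      - eval_time: 3m
        alertname: HighDatabaseConnections
        exp_alerts:
          - exp_labels:
              severity: medium
            exp_annotations:
              summary: Alert when max database connections exceeds 75 (out of an available 100).
```

The same pattern should carry over to the other threshold rules (`HighCpu`, `HighMemory`) by swapping the series name, sample values, and expected annotation.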
