About OpsGenie #44
jlangy announced in Operations
OpsGenie is our tool for calling on-call staff when our system needs attention.
Scheduling
Under the Who is on call tab at the top, you can view who is currently set up to be on shift. Any raised alert is sent to this person first and escalates to the rest of the team if the issue is not closed (see the alerts discussion for escalation times).
Routing And Escalation
By going into Teams -> (team-name) you can view the routing and escalation policies connected to that team.
Escalation
An escalation policy determines how long OpsGenie will leave an alert open before calling different team members. The example below for critical events will notify on-call users immediately, and escalate to the entire team after 5 minutes if the issue is not closed or acknowledged.
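The timing of the critical policy can be modelled in a few lines of Python. This is only an illustrative sketch, not OpsGenie's API or configuration format: `EscalationStep`, `CRITICAL_POLICY`, and `notified_so_far` are hypothetical names, and the model assumes that acknowledging or closing an alert stops any later escalation step.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EscalationStep:
    delay_minutes: int  # minutes after the alert is raised
    notify: str         # who is notified at this step


# Critical policy as described above: on-call users immediately,
# the entire team after 5 minutes.
CRITICAL_POLICY = [
    EscalationStep(delay_minutes=0, notify="on-call"),
    EscalationStep(delay_minutes=5, notify="entire-team"),
]


def notified_so_far(
    policy: List[EscalationStep],
    minutes_open: int,
    acknowledged_at: Optional[int] = None,
) -> List[str]:
    """Return who has been notified for an alert open for `minutes_open` minutes.

    `acknowledged_at` is the minute at which the alert was acknowledged or
    closed (None if it is still open). Steps due at or after that minute
    are assumed not to fire.
    """
    notified = []
    for step in policy:
        if step.delay_minutes > minutes_open:
            continue  # step is not due yet
        if acknowledged_at is not None and step.delay_minutes >= acknowledged_at:
            continue  # alert was handled before this step fired
        notified.append(step.notify)
    return notified
```

For example, `notified_so_far(CRITICAL_POLICY, 2)` returns only the on-call step, while an alert left open for 5 minutes or more without acknowledgement also reaches the entire team.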
Routing
Routing rules connect different alert types to a given escalation policy. In the example below, any alert that comes in as a "P1" will route to the critical escalation policy (see above).
IMPORTANT: To use these P1-P4 rules, ensure you have set up your integration to map to them. See Sysdig Integration below.
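Conceptually, a routing rule is a lookup from alert priority to escalation policy. The sketch below illustrates that idea in Python; it is not how OpsGenie stores its rules. Only the P1 to critical mapping comes from this discussion. The other policy names, along with `ROUTING_RULES` and `route_alert`, are hypothetical placeholders.

```python
# Priority -> escalation policy name.
ROUTING_RULES = {
    "P1": "critical",
    "P2": "high",      # hypothetical policy name
    "P3": "moderate",  # hypothetical policy name
    "P4": "low",       # hypothetical policy name
}


def route_alert(priority: str) -> str:
    """Return the escalation policy name for an alert priority."""
    try:
        return ROUTING_RULES[priority]
    except KeyError:
        # An alert whose priority is not mapped has no route in this model.
        raise ValueError(f"No routing rule for priority {priority!r}") from None
```

An alert arriving without one of the mapped priorities has no matching rule in this model, which is why the integration has to set the priority correctly.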
We have four sets of routes and escalations. Each has its own escalation times and can be adjusted separately.
Sysdig Integration
To integrate with Sysdig, the Sysdig High, Medium and Low alerts can be mapped to different routes (P1, P3, P4). To see the setup for these, go to Settings -> Configured Integrations -> Sysdig. Note: there is one for production and sandbox namespaces.
These alert filters are used to map the Severity level from Sysdig to the P1 - P4 levels used by our escalations and routes.
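The severity mapping can be sketched as a simple lookup. This assumes High, Medium and Low correspond to P1, P3 and P4 in the order listed above. `SEVERITY_TO_PRIORITY` and `priority_for` are hypothetical names for illustration; the real mapping lives in the integration's alert filters, not in code.

```python
from typing import Optional

# Sysdig severity -> OpsGenie priority (assumed to follow the order listed above).
SEVERITY_TO_PRIORITY = {
    "high": "P1",
    "medium": "P3",
    "low": "P4",
}


def priority_for(severity: str) -> Optional[str]:
    """Return the priority for a Sysdig severity, or None if it is unmapped."""
    return SEVERITY_TO_PRIORITY.get(severity.strip().lower())
```

Returning `None` for an unmapped severity makes it explicit that such an alert would not match any of the P1-P4 routes in this model.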