Many DevOps teams have incident alerting systems, but responding to an incident and getting a team together can take a while, and incident discussion can clutter existing Mattermost channels. This simple recipe helps notify people that an incident occurred and keeps discussion of the incident organized.
This is a very specific solution: it creates a channel and invites some people when an incident is triggered, updates the channel header when the incident status changes, and closes the channel and posts resolution statistics to Town Hall when the incident is resolved.
Note: This code is mainly used to illustrate how to access the Mattermost API and connect it to a webhook, and should be considered a guide more than a production-ready application.
Instructions are available here
The code for this is open source and available here. It includes a webhook configuration, a Ruby script called from the webhook, and a sample config file as well as a small Mattermost API library to handle talking to your Mattermost server.
To create your own config files, make a copy of sample.hooks.json and rename it to hooks.json. Then edit the execute-command and command-working-directory settings to use the correct paths.
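The copy-and-edit step can also be scripted. The following is a minimal sketch, assuming sample.hooks.json is a JSON array of hook objects (the usual layout for the webhook tool); the helper name localize_hooks and the example paths are hypothetical and not part of the recipe.

```ruby
require "json"

# Return a copy of the parsed hook definitions with the two path
# settings replaced. The input array is not modified.
def localize_hooks(hooks, script_path, working_dir)
  hooks.map do |hook|
    hook.merge(
      "execute-command" => script_path,
      "command-working-directory" => working_dir
    )
  end
end

# Example usage (placeholder paths, adjust to your checkout):
#   hooks = JSON.parse(File.read("sample.hooks.json"))
#   local = localize_hooks(hooks, "/opt/recipe/script.rb", "/opt/recipe")
#   File.write("hooks.json", JSON.pretty_generate(local))
```

Every other key in the sample file, such as the hook id, is carried over unchanged.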
Next, make a copy of sample.conf.yaml and rename it to conf.yaml. Then edit the configuration to authenticate to your Mattermost server, and specify the team name and the users who should be notified.
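Before starting the listener, it can help to check that the configuration is complete. This is a rough sketch only: the key names url, token, team, and notify_users are assumptions made for illustration, so check sample.conf.yaml for the names the recipe actually uses.

```ruby
require "yaml"

# Hypothetical key names; the real ones are defined in sample.conf.yaml.
REQUIRED_KEYS = %w[url token team notify_users].freeze

# Return the required keys that are missing or empty in the parsed config.
def missing_config_keys(config)
  REQUIRED_KEYS.select do |key|
    value = config[key]
    value.nil? || (value.respond_to?(:empty?) && value.empty?)
  end
end

# Example usage:
#   config = YAML.safe_load(File.read("conf.yaml"))
#   missing = missing_config_keys(config)
#   abort "conf.yaml is missing: #{missing.join(', ')}" unless missing.empty?
```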
Finally, make sure you have webhook installed and run webhook -verbose to start listening for the notifications.
To test that the webhook is working correctly, run this command from inside the PagerDuty Recipe directory:
$ curl -vX POST http://127.0.0.1:9000/hooks/pagerduty_hook -d @./test_data/incident.trigger.js --header "Content-Type: application/json"
Change trigger in the file name to either acknowledge or resolve to test those incident states.
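The same test can be run from Ruby, which makes it easy to loop over all three states. This sketch assumes the listener address from the curl command and test files named after each state; the helper names are hypothetical.

```ruby
require "net/http"
require "uri"

STATES = %w[trigger acknowledge resolve].freeze
HOOK_URL = "http://127.0.0.1:9000/hooks/pagerduty_hook"

# Path of the sample payload for a given incident state.
def payload_path(state)
  raise ArgumentError, "unknown state: #{state}" unless STATES.include?(state)
  "./test_data/incident.#{state}.js"
end

# Build (but do not send) the POST request that curl would make.
def build_hook_request(uri, body)
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request.body = body
  request
end

# Send the sample payload for one state to the local listener.
def send_test_event(state, url: HOOK_URL)
  uri = URI(url)
  request = build_hook_request(uri, File.read(payload_path(state)))
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end

# Example usage, run from inside the recipe directory:
#   STATES.each { |state| puts send_test_event(state).code }
```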
PagerDuty is a widely used alerting system, but any system that can send an outgoing webhook when an event is triggered could be used as a replacement.
Outgoing webhooks in PagerDuty are linked to monitoring services, so you first need to add the webhook as an extension to the service you want to monitor. Go to Configuration > Services, click the service name, open the Integrations tab, and click New Extension. Enter a name for your webhook and the URL to call, which should end with pagerduty_hook to match the hooks.json file.
Next, trigger an alert in PagerDuty, then acknowledge and resolve it. When the issue is created you’ll see a private channel created for the incident, with a header that shows the status of the ticket.
When the ticket status is updated, for example by an acknowledgement, the channel header is updated to show the new status.
When the issue is resolved, users are removed from the channel and the resolution is posted in Town Hall with some information, to let the whole team know the problem is fixed.
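The channel lifecycle described above maps onto a handful of Mattermost REST API v4 calls. The sketch below is not the recipe's own API library: the class name MattermostRequests is hypothetical, and it only builds the requests so that you can see the endpoints and payloads involved. It assumes authentication with a bearer token.

```ruby
require "json"
require "net/http"
require "uri"

# Builds Mattermost API v4 requests for the incident channel lifecycle.
class MattermostRequests
  def initialize(base_url, token)
    @base = base_url.chomp("/")
    @token = token
  end

  # Channel type "P" requests a private channel.
  def create_channel(team_id, name, display_name)
    build(Net::HTTP::Post, "/channels",
          { team_id: team_id, name: name, display_name: display_name, type: "P" })
  end

  def add_member(channel_id, user_id)
    build(Net::HTTP::Post, "/channels/#{channel_id}/members", { user_id: user_id })
  end

  # Patching the channel changes only the fields that are sent.
  def update_header(channel_id, header)
    build(Net::HTTP::Put, "/channels/#{channel_id}/patch", { header: header })
  end

  def post_message(channel_id, message)
    build(Net::HTTP::Post, "/posts", { channel_id: channel_id, message: message })
  end

  private

  def build(verb, path, payload)
    request = verb.new(URI("#{@base}/api/v4#{path}"))
    request["Authorization"] = "Bearer #{@token}"
    request["Content-Type"] = "application/json"
    request.body = JSON.generate(payload)
    request
  end
end
```

Each request can then be sent with Net::HTTP.start against your server, with use_ssl enabled for https URLs.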
This recipe just shows a couple of ways you can use a webhook and Mattermost to improve incident notification and organization. For example, when a ticket is resolved, a script could collect all the posts in the incident channel, along with any files that were uploaded, and put them in an archive that’s attached to the incident resolution.
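As a starting point for such an archive, here is a minimal sketch that turns the channel's posts into a plain-text transcript. It assumes the input is the parsed JSON returned by the Mattermost endpoint that lists a channel's posts, with a "posts" map keyed by post ID and create_at given in milliseconds; the function name transcript is hypothetical, and downloading the attached files is left out.

```ruby
require "json"

# Format a parsed Mattermost post list as one line per post, oldest first.
def transcript(post_list)
  posts = post_list.fetch("posts").values.sort_by { |post| post["create_at"] }
  posts.map do |post|
    # create_at is in milliseconds since the Unix epoch.
    time = Time.at(post["create_at"] / 1000).utc.strftime("%Y-%m-%d %H:%M:%S")
    files = Array(post["file_ids"])
    suffix = files.empty? ? "" : " [#{files.size} file(s)]"
    "#{time} #{post['user_id']}: #{post['message']}#{suffix}"
  end.join("\n")
end
```

The transcript uses user IDs as they appear in the posts; resolving them to usernames would take an extra lookup per user.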
Because Mattermost supports interactive message buttons and slash commands, you can also send hooks out of Mattermost. PagerDuty incidents could be acknowledged, or a Jira ticket with the relevant incident information could be created, without leaving your Mattermost client.
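To illustrate the button idea, the sketch below builds the body of a Mattermost post that carries an Acknowledge button as a message attachment action. The function name and the callback URL are hypothetical: you would need your own small service at that URL to receive the click and call PagerDuty.

```ruby
require "json"

# Build a post payload with one interactive button. When the button is
# clicked, Mattermost sends the context to the integration URL.
def acknowledge_button_post(channel_id, incident_id, callback_url)
  {
    channel_id: channel_id,
    message: "Incident #{incident_id} triggered",
    props: {
      attachments: [
        {
          text: "Respond without leaving Mattermost:",
          actions: [
            {
              id: "acknowledge",
              name: "Acknowledge",
              integration: {
                url: callback_url,
                context: { action: "acknowledge", incident_id: incident_id }
              }
            }
          ]
        }
      ]
    }
  }
end

# Example usage (placeholder IDs and URL):
#   body = JSON.generate(acknowledge_button_post("c1", "PT4KHLK", "https://hooks.example.com/ack"))
```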
PagerDuty also supports other event types that you may want to handle differently, such as adding users to the incident channel when an incident is assigned to them.
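One way to handle additional event types is a small dispatch table keyed by event name. This sketch assumes the PagerDuty v2 webhook layout, where the payload has a "messages" array and each message has an "event" field such as incident.trigger; the action names are hypothetical placeholders for your own handlers.

```ruby
require "json"

# Map PagerDuty event names to the action the script should take.
HANDLERS = {
  "incident.trigger"     => :open_channel,
  "incident.acknowledge" => :update_header,
  "incident.resolve"     => :close_channel,
  "incident.assign"      => :invite_assignees
}.freeze

# Return the list of actions for a raw webhook payload.
# Events without a handler map to :ignore.
def actions_for(payload_json)
  JSON.parse(payload_json).fetch("messages", []).map do |message|
    HANDLERS.fetch(message["event"], :ignore)
  end
end
```

Supporting another event type is then a matter of adding one entry to the table and writing its handler.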
Here’s where you can find everything you need to write your own Mattermost incident management system, based on PagerDuty: