Incident Reports - a template
Bad things happen. What matters is how we deal with them when they do, and the actions we take so they don’t happen again.
Today I would like to talk about incident reports: a way for the technical / engineering staff to show the business that we care about our platform, understand the problems caused, and are willing to move ahead and do better next time. A report shows the lessons learned from the mistake, so we avoid getting bitten by it again.
I have had to write two of these in the last 16 months, and they are always welcomed by the business. They help the business understand the issues we have, the impact on customers and the business, and how we are working to correct our wrongs.
The template below is circulated to the appropriate people in a Google Doc or Word document prior to retrospectives or postmortems. Hope it's useful to you 🤓
The Template
Description | An overview of the problem |
Date | Date of problem |
Duration | How long it occurred for: minutes, hours, or days |
Identified | How did you find it |
Manifestation | When doing X in the website the users experience Y |
Financial impact | Outage caused 10 dollars worth of revenue loss |
Customer impact | Bad customer experience because… |
Next steps | How will we stop this from happening again? |
Details
Business language explanation of how the problem manifested and why.
Impact
What was the impact on the business / customers?
Root cause of failure
What made this bad thing happen?
Moving forwards
What will we do in the future to stop this?
- Improve dev process?
- More tests?
- Better alerting? etc
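If you would rather generate the report than copy and paste the template each time, the fields above map naturally onto a small data structure. The following is only a rough sketch: the `IncidentReport` class, its field names, and the `render` method are my own hypothetical names, not part of any tool, and the output simply mirrors the plain-text layout of the template.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IncidentReport:
    """Hypothetical container for the template fields above."""

    # Summary rows, one per row of the template table.
    description: str
    date: str
    duration: str
    identified: str
    manifestation: str
    financial_impact: str
    customer_impact: str
    next_steps: str
    # Long-form sections that follow the table.
    details: str = ""
    impact: str = ""
    root_cause: str = ""
    moving_forwards: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the report as plain text in the template's layout."""
        rows = [
            ("Description", self.description),
            ("Date", self.date),
            ("Duration", self.duration),
            ("Identified", self.identified),
            ("Manifestation", self.manifestation),
            ("Financial impact", self.financial_impact),
            ("Customer impact", self.customer_impact),
            ("Next steps", self.next_steps),
        ]
        lines = [f"{label} | {value} |" for label, value in rows]
        lines += [
            "Details", self.details,
            "Impact", self.impact,
            "Root cause of failure", self.root_cause,
            "Moving forwards",
        ]
        # One bullet per follow-up action.
        lines += [f"- {step}" for step in self.moving_forwards]
        return "\n".join(lines)
```

Fill in the fields, call `render()`, and paste the result into the Google Doc or Word document before the retrospective. Making the summary fields required is a deliberate choice in this sketch: it stops a report going out with a row of the table left blank.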