Getting SRE Buy-in from a Manager or Lead for Incident Response

The situation

The incentives

  • Incident management best practices restore your systems as fast as possible when an incident occurs.
  • A playbook gives everyone a sense of control amidst the chaos. It defines a set of repeatable practices to drive consistency while helping everyone to be thorough with their problem-solving.
  • Measuring time to resolution (TTR) and time to detection (TTD) allows the manager to quantify the team’s improvement on TTR and TTD moving forward.
  • Integration with alerting and ticketing systems reduces context switching between different apps. This lowers the stress from mentally keeping track of many systems.

The resistance

The emotional appeal

The logical appeal

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Blameless

Blameless

Giving you all you need to know about Site Reliability Engineering. https://www.blameless.com/blog/