Software Reliability Engineering
Resources
- Sloth - Prometheus SLO (service level objective) generator
- Google SRE
- How SRE teams are organized, and how to get started
- Ask HN: Best “I brought down production” story?
- DevOps, SRE, and Platform Engineering
- Eliminating Toil
- Why Twitter Didn’t Go Down: From a Real Twitter SRE
- How to Build Software like an SRE
- Move past incident response to reliability
- Incident categories I’d like to see
- Counting Forest Fires - on measuring success for incidents