Post-incident Reviews

Lectures 96 • 40 slides

Session 96 Slide 1: Building an On-Call System

Phase 7: Troubleshooting and Incident Management

mindmap
  root((On-Call System))
    Rotation Design
    Tool Introduction
      PagerDuty
      Opsgenie
    Escalation
    24/7 Support

Course Overview

  • Building a system to support 24/7 service operation
  • Designing a sustainable on-call system

Learning Objectives for Session 96

  • What is on-call?
  • How to think about rotation design
  • Introduction to PagerDuty and Opsgenie
  • Escalation flow
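
To make rotation design and escalation flow concrete, here is a minimal sketch in Python. It assumes a weekly handover and a three-level escalation chain. The roster names, the start date, and the 15- and 30-minute timeouts are hypothetical values chosen for illustration. They are not taken from this course or from the defaults of PagerDuty or Opsgenie, and the code does not call either tool's API.

```python
from datetime import date

# Hypothetical roster and policy: all values below are illustrative assumptions.
ENGINEERS = ["alice", "bob", "carol"]
ROTATION_START = date(2024, 1, 1)  # a Monday; shifts hand over weekly

# Minutes an alert may stay unacknowledged before each level is paged.
ESCALATION_DELAYS_MIN = [0, 15, 30]


def on_call_for(day: date, level: int = 0) -> str:
    """Return who is on call at the given escalation level on `day`.

    Level 0 is the primary. Each higher level is the next engineer in the
    roster, so the secondary is never the same person as the primary.
    """
    weeks_elapsed = (day - ROTATION_START).days // 7
    return ENGINEERS[(weeks_elapsed + level) % len(ENGINEERS)]


def escalation_level(minutes_unacknowledged: int) -> int:
    """Return the highest escalation level reached after the given wait."""
    level = 0
    for i, delay in enumerate(ESCALATION_DELAYS_MIN):
        if minutes_unacknowledged >= delay:
            level = i
    return level


def who_to_page(day: date, minutes_unacknowledged: int) -> str:
    """Combine the rotation and the escalation chain into one lookup."""
    return on_call_for(day, escalation_level(minutes_unacknowledged))
```

For example, who_to_page(date(2024, 1, 1), 20) returns the secondary for the first week, because the alert has gone unacknowledged past the assumed 15-minute threshold. Offsetting the roster index by the escalation level is one simple design choice; real tools typically let each level have its own schedule.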

Why is an on-call system necessary?

  • Failures can happen at any time
  • A quick initial response minimizes damage
  • A reliable response maintains user trust