Post-mortems turn incidents into improvements. Blameless culture is non-negotiable.
Post-mortem structure:
- Summary: What happened, impact, duration
- Timeline: Detailed sequence of events
- Root cause: Why it happened (not who)
- Impact: Users affected, revenue impact
- Action items: Specific, assigned, with deadlines
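The structure above can be sketched as a small data model, which makes it easy to check that no section was skipped and that every action item has an owner and a deadline. This is a minimal sketch, not a standard schema: the class names, field names, and helper methods (`PostMortem`, `ActionItem`, `missing_sections`, `unassigned_action_items`) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import List, Optional, Tuple


@dataclass
class ActionItem:
    """One follow-up task (hypothetical shape, for illustration only)."""
    description: str
    owner: Optional[str] = None      # person or team accountable for the fix
    deadline: Optional[date] = None  # date the fix is due


@dataclass
class PostMortem:
    """The five sections from the list above."""
    summary: str
    timeline: List[Tuple[datetime, str]]
    root_cause: str
    impact: str
    action_items: List[ActionItem] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of sections that are still empty."""
        names = ("summary", "timeline", "root_cause", "impact", "action_items")
        return [name for name in names if not getattr(self, name)]

    def unassigned_action_items(self) -> List[ActionItem]:
        """Return action items that lack an owner or a deadline."""
        return [
            item for item in self.action_items
            if not item.owner or item.deadline is None
        ]
```

A draft can then be checked before review: `missing_sections()` lists the empty sections, and `unassigned_action_items()` flags follow-ups that are not yet specific enough to track.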
Blameless principles:
- People make mistakes because systems allow those mistakes to happen
- Focus on "what" and "why," never "who"
- Assume good intentions
Interview tip: Walk through the process in order. Gather the timeline from logs, identify the root cause, create action items with owners and deadlines, and share the post-mortem so others can learn from it.
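Gathering a timeline from logs mostly means parsing timestamps and sorting events chronologically. The sketch below assumes a hypothetical log format in which each line starts with a timestamp like `2024-01-01T10:00:00` followed by a space and a message. Real log formats vary, so the parsing step would need to be adapted. The function name `build_timeline` is also an assumption.

```python
from datetime import datetime
from typing import List, Tuple


def build_timeline(log_lines: List[str]) -> List[Tuple[datetime, str]]:
    """Turn raw log lines into a chronologically sorted list of events.

    Assumes each useful line looks like '<YYYY-MM-DDTHH:MM:SS> <message>'.
    Lines that do not start with such a timestamp are skipped.
    """
    events = []
    for line in log_lines:
        stamp, _, message = line.strip().partition(" ")
        try:
            when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
        except ValueError:
            continue  # not a timestamped line in the assumed format
        events.append((when, message))
    # Logs from several sources are rarely in order, so sort by time.
    events.sort(key=lambda event: event[0])
    return events


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "2024-01-01T10:05:00 rollback started",
        "not a log line",
        "2024-01-01T10:00:00 alert fired",
    ]
    for when, message in build_timeline(sample):
        print(when.isoformat(), message)
```

Sorting by parsed timestamp rather than by line order matters when entries are merged from several services, since their logs rarely interleave cleanly.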