Scenario: Users report the site is completely down.
Systematic approach:
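1. Verify: Can you reproduce the outage yourself? Does it affect all users or only some (for example one region, network, or client type)?
2. Recent changes: Were there any deployments or config changes shortly before the outage began?
3. Dependencies: Are the database, cache, and external APIs reachable and healthy?
4. Infrastructure: Check DNS, load balancers, and the network path to the site.
5. Application: Check error logs and look for resource exhaustion (memory, CPU, disk, file descriptors, connection pools).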
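As a rough illustration, the infrastructure part of the approach above can be scripted so that checks always run in the same order and the first failing layer is easy to spot. This is a minimal sketch, not a full runbook: `check_dns`, `check_tcp`, `run_triage`, and `first_failure` are hypothetical helper names, and `example.com` on port 443 is a placeholder for the real site.

```python
import socket
from typing import Callable, List, Optional, Tuple

# Every check returns (ok, detail).
CheckResult = Tuple[bool, str]


def check_dns(host: str) -> CheckResult:
    """Infrastructure: does the hostname resolve at all?"""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror as exc:
        return False, f"DNS lookup failed: {exc}"
    addresses = sorted({info[4][0] for info in infos})
    return True, "resolved to " + ", ".join(addresses)


def check_tcp(host: str, port: int = 443, timeout: float = 3.0) -> CheckResult:
    """Infrastructure: can we open a TCP connection to the service port?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, f"connected to {host}:{port}"
    except OSError as exc:  # covers refused connections and timeouts
        return False, f"connection failed: {exc}"


def run_triage(
    checks: List[Tuple[str, Callable[[], CheckResult]]]
) -> List[Tuple[str, bool, str]]:
    """Run every check in the given order and record (name, ok, detail)."""
    results = []
    for name, check in checks:
        ok, detail = check()
        results.append((name, ok, detail))
    return results


def first_failure(results: List[Tuple[str, bool, str]]) -> Optional[str]:
    """Name of the first failing check, or None if everything passed."""
    for name, ok, _detail in results:
        if not ok:
            return name
    return None


if __name__ == "__main__":
    host = "example.com"  # placeholder: replace with the affected site
    results = run_triage([
        ("dns", lambda: check_dns(host)),
        ("tcp:443", lambda: check_tcp(host, 443)),
    ])
    for name, ok, detail in results:
        print(f"{'OK  ' if ok else 'FAIL'} {name}: {detail}")
    print("first failing layer:", first_failure(results))
```

Passing the checks in as a list keeps the order explicit, so dependency or application checks can be appended without changing `run_triage`.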
Common causes:
- Bad deployment (roll back first, then investigate)
- Database connection exhaustion
- DNS misconfiguration
- Certificate expiration
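Of the causes above, certificate expiration is one of the easiest to check directly. The sketch below uses only Python's standard `ssl` and `socket` modules; `days_until_expiry` and `fetch_not_after` are hypothetical helper names, and `example.com` is a placeholder host.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter time (negative if expired).

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    for example "Jan 15 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read notAfter from the certificate the server presents.

    The default context verifies the certificate, so an already expired
    certificate raises ssl.SSLCertVerificationError instead of returning.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


if __name__ == "__main__":
    host = "example.com"  # placeholder: replace with the affected site
    try:
        remaining = days_until_expiry(fetch_not_after(host))
        print(f"{host}: certificate valid for {remaining:.1f} more days")
    except ssl.SSLCertVerificationError as exc:
        print(f"{host}: certificate failed verification: {exc}")
```

A verification error here is itself a useful signal during an outage: it means clients with strict TLS settings are being rejected before any request reaches the application.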
Interview tip: Always ask "What changed recently?" Most outages trace back to a recent change.
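One way to make the "what changed recently?" question concrete is to filter a change log for entries in the window just before the outage started. This is a sketch under stated assumptions: the `Change` record, the `changes_before` helper, the two-hour default window, and the sample entries are all hypothetical, and a real change log would come from your deploy or config tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Change:
    when: datetime
    kind: str          # for example "deploy" or "config"
    description: str


def changes_before(
    changes: List[Change],
    outage_start: datetime,
    window: timedelta = timedelta(hours=2),
) -> List[Change]:
    """Changes made inside the window before the outage, most recent first."""
    earliest = outage_start - window
    recent = [c for c in changes if earliest <= c.when <= outage_start]
    return sorted(recent, key=lambda c: c.when, reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    log = [
        Change(datetime(2024, 1, 10, 9, 0), "deploy", "release A"),
        Change(datetime(2024, 1, 10, 13, 30), "config", "pool size change"),
        Change(datetime(2024, 1, 10, 13, 50), "deploy", "release B"),
    ]
    outage_start = datetime(2024, 1, 10, 14, 0)
    for change in changes_before(log, outage_start):
        print(change.when.isoformat(), change.kind, change.description)
```

Listing the most recent change first matches the usual rollback order: the latest change before the outage is the first suspect.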