Scenario: Service latency increased from ms to ms.
Resource debugging:
- CPU: top, htop. Look for high user or system time
- Memory: free -m, check swap usage. OOM killer in logs?
- Disk: iostat, iotop. High await time = disk bottleneck
- Network: netstat, ss. Connection states, queue depths
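
To make the "user or system time" check concrete, here is a minimal sketch of how the percentages shown by top can be derived from two samples of the aggregate `cpu` line in Linux's /proc/stat. The function names and the sample counter values are made up for illustration; on a real host you would read the line twice, a few seconds apart.

```python
# Field order of the aggregate "cpu" line in /proc/stat (Linux).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a 'cpu ...' line into a dict of cumulative tick counters."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Percentage of time spent in each state between two samples."""
    a = parse_cpu_line(before)
    b = parse_cpu_line(after)
    deltas = {k: b[k] - a[k] for k in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {k: 100.0 * d / total for k, d in deltas.items()}


# Hypothetical samples, not taken from a real machine.
sample_before = "cpu  1000 0 500 8000 500 0 0 0 0 0"
sample_after = "cpu  1600 0 700 8100 600 0 0 0 0 0"

breakdown = cpu_breakdown(sample_before, sample_after)
for state in ("user", "system", "iowait", "idle"):
    print(f"{state:>7}: {breakdown[state]:.1f}%")
```

With these invented samples, user time dominates, which would point toward application code rather than the kernel or the disk. A high iowait share would instead suggest moving on to the disk checks.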
Application debugging:
- Check slow queries (database)
- Check external API latencies
- Profile the application
- Check garbage collection pauses
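
Before reaching for a full profiler, a rough first pass at the checks above is to time each suspect section and rank them. The sketch below assumes a Python service; `timed`, `slowest`, and the labels `database_query` and `external_api` are hypothetical names, and the `time.sleep` calls stand in for real work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# label -> list of observed durations in seconds
timings = defaultdict(list)


@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label].append(time.perf_counter() - start)


def slowest(timings_by_label):
    """Return (label, total_seconds) pairs, largest total first."""
    totals = {label: sum(durations) for label, durations in timings_by_label.items()}
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Stand-ins for a database call and an external API call.
with timed("database_query"):
    time.sleep(0.05)
with timed("external_api"):
    time.sleep(0.01)

ranking = slowest(timings)
for label, total in ranking:
    print(f"{label}: {total * 1000:.1f} ms")
```

This only measures wall-clock time per section, so it shows where the latency goes, not why. A profiler or the database's slow query log is still needed for the next step.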
Interview tip: Identify the bottleneck before optimizing. Is it CPU-bound, memory-bound, I/O-bound, or network-bound?