Average latency hides problems. Use percentiles instead:
P50 (median): Half of requests are faster than this.
P95: % of requests are faster. % are slower.
P99: % of requests are faster. % are slower.
P99.9: The slowest % of requests.
Why this matters: If your P50 is ms but P99 is seconds, % of your users have a terrible experience. At million requests/day, that's frustrated users.
In interviews, always ask: "What's the latency requirement?" Then clarify: "Is that P50 or P99?"