Protect services from overload by limiting request rates. Implement at multiple levels: API gateway, load balancer, application.
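Strategies: reject excess requests (HTTP 429 Too Many Requests response), queue requests, or degrade gracefully (for example, return cached data). Communicate limits via response headers such as X-RateLimit-Remaining and Retry-After. Required for multi-tenant systems, where one tenant's traffic must not starve the others.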
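
The reject strategy and the two headers can be sketched at the application level as follows. This is a minimal illustration, not a production implementation. It assumes a token bucket, which is one common algorithm and is not named above; the names TokenBucket, try_acquire and handle_request, the capacity and refill values, and the use of status 429 for rejected requests are all assumptions made for the example.

```python
import math
import time


class TokenBucket:
    """Hypothetical token bucket: up to `capacity` tokens, refilled at `refill_rate` per second."""

    def __init__(self, capacity, refill_rate, now=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.now = now                    # injectable clock, useful for testing
        self.tokens = float(capacity)     # start with a full bucket
        self.updated = now()

    def _refill(self):
        # Add tokens for the time elapsed since the last call, capped at capacity.
        current = self.now()
        elapsed = current - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated = current

    def try_acquire(self):
        """Return (allowed, remaining_tokens, retry_after_seconds)."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True, int(self.tokens), 0
        # Seconds until one whole token is available, rounded up, at least 1.
        wait = (1 - self.tokens) / self.refill_rate
        return False, 0, max(1, math.ceil(wait))


def handle_request(bucket):
    """Return (status_code, headers) for one incoming request."""
    allowed, remaining, retry_after = bucket.try_acquire()
    headers = {"X-RateLimit-Remaining": str(remaining)}
    if allowed:
        return 200, headers
    # Excess request: reject and tell the client when to retry.
    headers["Retry-After"] = str(retry_after)
    return 429, headers
```

For a multi-tenant system, one plausible arrangement is a dictionary of buckets keyed by tenant ID, so each tenant draws from its own budget. Queueing or returning cached data would replace the 429 branch in handle_request.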