Load balancers can protect backends from excessive requests.
Rate limiting caps the number of requests a client may send in a given period, for example a fixed number of requests per minute per IP address. Requests over the limit are rejected with HTTP 429 (Too Many Requests).
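As an illustration of per-client rate limiting, here is a minimal sketch of a fixed-window counter keyed by IP address. The class name FixedWindowRateLimiter, its parameters, and the choice of a fixed window are assumptions made for this example. Real load balancers often use other algorithms, such as token bucket or sliding window, and expose them through configuration rather than code.

```python
import time


class FixedWindowRateLimiter:
    """Hypothetical sketch: allow at most `limit` requests per window per client."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._windows = {}  # client_ip -> (window_start, request_count)

    def check(self, client_ip):
        """Return the HTTP status the load balancer would answer with."""
        now = self.clock()
        start, count = self._windows.get(client_ip, (now, 0))

        # Start a fresh window once the current one has expired.
        if now - start >= self.window_seconds:
            start, count = now, 0

        if count >= self.limit:
            self._windows[client_ip] = (start, count)
            return 429  # Too Many Requests

        self._windows[client_ip] = (start, count + 1)
        return 200
```

A fixed window is the simplest scheme to reason about, but it lets a client send up to twice the limit in a short burst that straddles a window boundary. This sketch also never evicts idle clients and is not thread-safe, so treat it as a model of the idea rather than production code.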
Connection limiting caps the number of concurrent connections per client, for example simultaneous connections per IP address. This prevents a single client from monopolizing resources.
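Connection limiting differs from rate limiting in that it tracks what is open right now rather than what arrived over a period. A minimal sketch, assuming a hypothetical ConnectionLimiter class that the proxy calls when a connection opens and again when it closes:

```python
class ConnectionLimiter:
    """Hypothetical sketch: cap simultaneous connections per client IP."""

    def __init__(self, max_per_client):
        self.max_per_client = max_per_client
        self._active = {}  # client_ip -> number of currently open connections

    def try_acquire(self, client_ip):
        """Call when a connection opens. Returns False if it should be refused."""
        current = self._active.get(client_ip, 0)
        if current >= self.max_per_client:
            return False
        self._active[client_ip] = current + 1
        return True

    def release(self, client_ip):
        """Call when a connection closes, so the slot becomes available again."""
        current = self._active.get(client_ip, 0)
        if current <= 1:
            # Drop the entry entirely so idle clients do not accumulate.
            self._active.pop(client_ip, None)
        else:
            self._active[client_ip] = current - 1
```

Every successful try_acquire must be paired with a release, including on error paths, or the client's count only ever grows. The sketch assumes single-threaded use; a concurrent proxy would need a lock or atomic counters around these updates.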
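Implement rate limiting at the load balancer, not in each backend. A backend sees only its own share of a client's traffic, so per-backend limits are enforced inconsistently. Enforcing the limit at the load balancer applies it uniformly and stops excess traffic before it ever reaches a backend.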