Rate limiting controls how many requests a client can make to a service in a given time period. It prevents abuse and ensures fair resource allocation. A public API might allow 100 requests per minute per client; exceed the limit and further requests are rejected. This protects against denial-of-service attacks, where someone floods your service with requests.
It also ensures that one client using the API excessively doesn't degrade the experience for others. Token bucket is a common rate-limiting algorithm. Each client gets a bucket that fills with tokens at a fixed rate, and each request consumes a token. If the bucket is empty, the request is rejected. This allows bursts (a client can spend several tokens at once) while still enforcing an average rate. Sliding window is another approach.
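A minimal sketch of the token bucket idea, assuming an in-memory limiter with hypothetical `rate` and `capacity` parameters (a production version would need per-client buckets and thread safety):

```python
import time

class TokenBucket:
    """Token bucket: refills at `rate` tokens/sec, holds at most `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate            # tokens added per second (average rate)
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity      # start with a full bucket
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(rate=1.0, capacity=10)
# A burst of 10 requests drains the full bucket at once;
# after that, requests are admitted at roughly 1 per second.
```

The capacity sets the burst size and the refill rate sets the long-run average, which is why the two are tuned independently.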
Count requests within a sliding time window; if the count exceeds the limit, reject new requests until older ones age out of the window. Rate limiting is essential for public APIs. Without it, abuse is trivial. With it, you protect your service while enabling legitimate use. Different APIs set different limits: AWS API Gateway might allow 1000 requests per second.
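The sliding-window approach can be sketched as a log of request timestamps; this is a simplified in-memory illustration (the `limit` and `window` parameters are illustrative, and a real service would store the log per client, typically in something like Redis):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests in any trailing `window`-second span."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.timestamps: deque[float] = deque()

    def allow(self) -> bool:
        now = time.monotonic()
        # Evict timestamps that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False

limiter = SlidingWindowLimiter(limit=3, window=1.0)
# The first 3 requests within a second are admitted; the 4th is rejected
# until one of the earlier timestamps falls out of the window.
```

Compared with a token bucket, the sliding-window log enforces the limit exactly over any trailing interval, at the cost of storing one timestamp per recent request.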
The GitHub REST API allows 60 requests per hour unauthenticated and 5000 per hour authenticated. Rate limits are part of API design.