Comprehensive Guide to Rate Limiting: Concepts, Algorithms, and Implementations
This article explains the principles and practical implementations of rate limiting in backend systems, covering real‑world scenarios, strategies such as circuit breaking, service degradation, delayed and privileged handling, common algorithms like counter, leaky‑bucket and token‑bucket, and code examples using Guava and Nginx + Lua.
Rate Limiting Overview
Rate limiting is used to control traffic flow in both physical venues (e.g., tourist attractions) and online services, ensuring system availability by restricting the number of concurrent users or requests.
Limiting Strategies
Circuit Breaker
When a system encounters unrecoverable errors, a circuit breaker automatically rejects incoming traffic to prevent overload. Tools such as Hystrix and Alibaba Sentinel provide implementations.
Service Degradation
Non‑critical functionalities are temporarily disabled during traffic spikes, freeing resources for core services. Examples include disabling comments or points in e‑commerce platforms.
Delay Handling
Requests are buffered in a queue (leaky‑bucket concept) and processed sequentially, reducing immediate pressure on backend services.
Privilege Handling
Requests are classified, giving priority to high‑value users while delaying or rejecting others.
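The classification idea above can be sketched with a priority queue that serves higher-tier users first. This is an illustrative sketch; the `PriorityDispatch` class, the `Request` record, and the `tier` field are our own names, not from the original article.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch of privileged handling: requests carry a tier, and the dispatcher
// always hands out the highest-tier pending request first.
public class PriorityDispatch {
    // user id plus a priority tier (higher = more privileged); names are illustrative
    public record Request(String user, int tier) {}

    private final PriorityQueue<Request> queue =
        new PriorityQueue<>(Comparator.comparingInt(Request::tier).reversed());

    public void submit(Request r) { queue.add(r); }   // enqueue an incoming request
    public Request next()         { return queue.poll(); } // highest tier dequeues first
}
```

Lower-tier requests are not rejected outright; they simply wait behind privileged ones and can be dropped if the queue grows too long.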
Differences Between Cache, Degradation, and Rate Limiting
Cache improves throughput, degradation shields the system when components fail, and rate limiting caps request rates when cache and degradation are insufficient.
Rate Limiting Algorithms
Counter Algorithm
A simple approach that limits the number of requests within a fixed time window (e.g., 100 requests per minute). When the count exceeds the limit, further requests are rejected.
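The fixed-window counter above can be sketched in a few lines. This is a minimal illustration, not production code; the class and method names are our own, and the simple reset-on-expiry logic has the well-known boundary weakness (a burst straddling two windows can briefly see up to twice the limit).

```java
// Fixed-window counter limiter: allow at most `limit` requests per window,
// resetting the count when a new window starts.
public class FixedWindowLimiter {
    private final int limit;          // max requests per window (e.g., 100)
    private final long windowMillis;  // window length (e.g., 60_000 for one minute)
    private int count = 0;
    private long windowStart = System.currentTimeMillis();

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) { // window expired: start a new one
            windowStart = now;
            count = 0;
        }
        count++;
        return count <= limit;                   // reject once over the limit
    }
}
```

Usage: `new FixedWindowLimiter(100, 60_000).tryAcquire()` returns `false` for the 101st request within the same minute.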
The Guava examples later in this article use the following Maven dependency:
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>28.1-jre</version>
</dependency>
Leaky Bucket Algorithm
Requests enter a bucket and are released at a constant rate; excess requests overflow, providing smooth traffic shaping.
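The constant-rate drain can be sketched as follows. This is an illustrative sketch with made-up names (`LeakyBucket`, `tryAcquire`); it models the bucket's fill level as "water" that leaks continuously, rejecting requests when the bucket would overflow.

```java
// Leaky-bucket limiter: requests fill the bucket, which drains at a fixed rate;
// a request that would overflow the bucket is rejected.
public class LeakyBucket {
    private final long capacity;        // bucket size (max queued requests)
    private final double leakPerMilli;  // drain rate, converted from per-second
    private double water = 0;           // current fill level
    private long lastLeak = System.currentTimeMillis();

    public LeakyBucket(long capacity, double leakPerSecond) {
        this.capacity = capacity;
        this.leakPerMilli = leakPerSecond / 1000.0;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // drain whatever has leaked out since the last call
        water = Math.max(0, water - (now - lastLeak) * leakPerMilli);
        lastLeak = now;
        if (water + 1 > capacity) return false; // bucket full: overflow, reject
        water += 1;
        return true;
    }
}
```

Because the drain rate is constant, downstream services see a smooth stream regardless of how bursty the incoming traffic is.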
Token Bucket Algorithm
Tokens are added to a bucket at a steady rate; a request proceeds only if a token is available, allowing bursts while maintaining an average rate.
Concurrency Limiting
System‑wide limits are placed on how many requests are handled at once (e.g., Tomcat’s maxThreads, maxConnections, and acceptCount settings) to protect the system against sudden spikes.
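In application code, the same idea can be expressed with a counting semaphore that caps in-flight requests, analogous to Tomcat's thread and connection limits. The class name here is our own; the `Semaphore` API is standard JDK.

```java
import java.util.concurrent.Semaphore;

// Cap the number of requests being processed concurrently; once all permits
// are taken, further requests are rejected immediately instead of queuing.
public class ConcurrencyLimiter {
    private final Semaphore permits;

    public ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    public boolean tryEnter() { return permits.tryAcquire(); } // false when saturated
    public void exit()        { permits.release(); }           // call when the request completes
}
```

Unlike rate-based limiters, this bounds concurrency rather than throughput: a permit is held for the whole duration of a request, so slow requests reduce effective QPS automatically.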
Interface Limiting
Limits can be applied per API using fixed windows or sliding windows for more precise control.
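A sliding window avoids the fixed window's boundary burst by tracking individual request timestamps. The sketch below (names are our own) keeps a log of recent timestamps and evicts those older than the window before each decision.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sliding-window log limiter: remember when each recent request arrived and
// count only those that fall inside the trailing window.
public class SlidingWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // evict timestamps that have slid out of the window
        while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() >= limit) return false; // window is full
        timestamps.addLast(now);
        return true;
    }
}
```

The timestamp log costs O(limit) memory per key; for high limits, a sliding window of sub-window counters is the usual cheaper approximation.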
Implementation Examples
Counter with Guava LoadingCache
A per-second counter can be built with Guava's LoadingCache, keyed by the current second, with old entries expiring automatically:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

// One AtomicLong counter per one-second window; entries expire after the window passes.
LoadingCache<Long, AtomicLong> counter = CacheBuilder.newBuilder()
        .expireAfterWrite(2, TimeUnit.SECONDS)
        .build(new CacheLoader<Long, AtomicLong>() {
            @Override
            public AtomicLong load(Long second) throws Exception {
                return new AtomicLong(0);
            }
        });

long second = System.currentTimeMillis() / 1000;
counter.get(second).incrementAndGet(); // compare the result against the per-second limit

Token Bucket with Guava
public static void main(String[] args) throws InterruptedException {
    RateLimiter limiter = RateLimiter.create(2); // 2 tokens per second
    System.out.println(limiter.acquire());       // prints the seconds spent waiting for a token
    Thread.sleep(2000);
    System.out.println(limiter.acquire());       // bucket refilled during sleep, so no wait
    // ... additional acquire calls
}

Distributed Limiting with Nginx + Lua
local locks = require "resty.lock"

function acquire()
    -- serialize check-and-increment across Nginx workers via a shared-dict lock
    local lock = locks:new("locks")
    local elapsed, err = lock:lock("limit_key")
    local limit_counter = ngx.shared.limit_counter
    local key = "ip:" .. os.time()  -- one counter per second
    local limit = 5                 -- max requests per second
    local current = limit_counter:get(key)
    if current ~= nil and current + 1 > limit then
        lock:unlock()
        return 0                    -- over the limit: reject
    end
    if current == nil then
        limit_counter:set(key, 1, 1)  -- first hit this second; expire after 1s
    else
        limit_counter:incr(key, 1)
    end
    lock:unlock()
    return 1                        -- allowed
end

ngx.print(acquire())

These examples demonstrate how to apply rate limiting in single‑node and distributed environments, balancing availability and performance.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, as well as architecture evolution with internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.