Backend Development · 9 min read

Understanding Rate Limiting: Counter, Sliding Window, Leaky Bucket, and Token Bucket Algorithms

The article introduces the concept of rate limiting, explains why it is needed both offline (e.g., crowded theme parks) and online (e.g., flash sales), and details four common algorithms—counter, sliding window, leaky bucket, and token bucket—along with their implementation considerations and trade‑offs.

Full-Stack Internet Architecture

During the National Day holiday the author visited Shanghai Disneyland with his girlfriend, only to encounter massive crowds that made the experience chaotic, illustrating the real‑world problem of uncontrolled traffic.

Just as a theme park needs to limit the number of visitors, online systems must enforce flow control to protect core services during traffic spikes such as Spring Festival train ticket rushes, Double‑11 shopping festivals, or flash sales.

🌴 What Is Rate Limiting

Definition: Rate limiting caps the number of requests a system accepts within a given time frame, serving traffic up to that cap and rejecting the excess so that overall availability is preserved for the users it does serve.

🌴 Counter Rate Limiting

This simple algorithm increments a counter for each incoming request within a fixed time window and rejects requests once a threshold is reached; the counter resets when a new window begins. It suffers from a "critical point" problem at window boundaries: a burst straddling the end of one window and the start of the next can briefly pass up to twice the threshold.
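A minimal sketch of the fixed-window counter in Java (class and method names are illustrative, not from the article):

```java
// Fixed-window counter limiter: allow at most `limit` requests per window.
class CounterRateLimiter {
    private final int limit;          // max requests allowed per window
    private final long windowMillis;  // window length in milliseconds
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    CounterRateLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            // A new window has begun: reset the counter.
            windowStart = now;
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;   // under the threshold: accept
        }
        return false;      // over the threshold: reject
    }
}
```

The abrupt reset is exactly where the critical-point issue comes from: `limit` requests at the end of one window plus `limit` more at the start of the next all pass.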

🌴 Sliding Window Rate Limiting

To solve the counter’s precision problem, the sliding window divides the time interval into smaller slots (e.g., 1‑second windows within a 5‑second period) and sums the counts of the most recent slots, providing smoother and more accurate throttling.
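The slot-based sliding window described above can be sketched as follows; the circular-array layout and names are an assumed implementation, not taken from the article:

```java
// Sliding-window limiter: the window is split into `slotCount` slots,
// and a request is allowed only if the sum over all slots is under `limit`.
class SlidingWindowRateLimiter {
    private final int limit;        // max requests across the whole window
    private final long slotMillis;  // length of one slot in milliseconds
    private final int slotCount;    // number of slots making up the window
    private final int[] counts;     // per-slot request counts (circular)
    private long lastSlotIndex;     // absolute index of the most recent slot

    SlidingWindowRateLimiter(int limit, long slotMillis, int slotCount) {
        this.limit = limit;
        this.slotMillis = slotMillis;
        this.slotCount = slotCount;
        this.counts = new int[slotCount];
        this.lastSlotIndex = System.currentTimeMillis() / slotMillis;
    }

    synchronized boolean tryAcquire() {
        long slotIndex = System.currentTimeMillis() / slotMillis;
        // Zero out any slots that expired since the last request
        // (at most slotCount of them need clearing).
        for (long i = Math.max(lastSlotIndex + 1, slotIndex - slotCount + 1);
             i <= slotIndex; i++) {
            counts[(int) (i % slotCount)] = 0;
        }
        lastSlotIndex = slotIndex;
        int total = 0;
        for (int c : counts) total += c;
        if (total < limit) {
            counts[(int) (slotIndex % slotCount)]++;
            return true;
        }
        return false;
    }
}
```

As the window slides slot by slot, old counts fall out gradually instead of all at once, which is what removes the counter's boundary burst.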

🌴 Leaky Bucket Rate Limiting

A leaky bucket placed between the traffic producer and consumer buffers incoming requests; excess bursts are dropped, and the bucket releases requests at a fixed rate, smoothing traffic and preventing sudden overloads.
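A compact sketch of the leaky bucket, modeling queued requests as "water" that drains at a fixed rate (names and the water metaphor in code are illustrative):

```java
// Leaky bucket: requests pour in, the bucket drains at a constant rate,
// and anything that would overflow the bucket's capacity is dropped.
class LeakyBucketRateLimiter {
    private final int capacity;          // max requests the bucket can hold
    private final double leakPerMillis;  // fixed outflow rate
    private double water = 0;            // current amount queued
    private long lastLeak = System.currentTimeMillis();

    LeakyBucketRateLimiter(int capacity, double leaksPerSecond) {
        this.capacity = capacity;
        this.leakPerMillis = leaksPerSecond / 1000.0;
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Water leaks out at a constant rate regardless of inflow.
        water = Math.max(0, water - (now - lastLeak) * leakPerMillis);
        lastLeak = now;
        if (water + 1 <= capacity) {
            water += 1;    // the request fits in the bucket
            return true;
        }
        return false;      // bucket full: the burst overflows and is dropped
    }
}
```

Because the outflow rate is fixed, downstream consumers see a smooth stream no matter how bursty the inflow is; the trade-off is that legitimate bursts beyond the bucket's capacity are simply lost.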

🌴 Token Bucket Rate Limiting

Tokens are added to a bucket at a constant rate, and each request must consume a token before proceeding; if the bucket is empty, the request is delayed or rejected. Because unused tokens accumulate up to the bucket's capacity, this algorithm tolerates short bursts while still enforcing an average rate, and it can be implemented with in-memory counters, Redis, or libraries such as Guava's RateLimiter.
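A hand-rolled sketch of the token bucket (a stand-in for what Guava's RateLimiter does internally; the class and field names here are assumptions):

```java
// Token bucket: tokens refill at a constant rate up to `capacity`;
// each request spends one token, so saved-up tokens permit short bursts.
class TokenBucketRateLimiter {
    private final double capacity;        // max tokens the bucket holds
    private final double refillPerMillis; // token refill rate
    private double tokens;
    private long lastRefill = System.currentTimeMillis();

    TokenBucketRateLimiter(double capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerMillis = tokensPerSecond / 1000.0;
        this.tokens = capacity;  // start full so an initial burst is allowed
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Refill lazily based on elapsed time, capped at capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMillis);
        lastRefill = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;   // no token available: delay or reject
    }
}
```

The contrast with the leaky bucket is the outflow: a token bucket lets a burst drain the saved tokens all at once, while a leaky bucket releases work at a strictly fixed pace.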

In practice, choosing the right algorithm and configuring limits (static or dynamic) requires careful testing and monitoring to balance protection and user experience.

Tags: backend, rate limiting, sliding window, token bucket, leaky bucket, counter algorithm
Written by

Full-Stack Internet Architecture

Introducing full-stack Internet architecture technologies centered on Java
