
Fundamentals and Common Practices of Rate Limiting in Distributed Systems

This article explains the basic concepts, dimensions, and typical algorithms of rate limiting, discusses various implementation strategies such as token bucket, leaky bucket, and sliding window, and reviews practical solutions using Nginx, Guava, Redis, Sentinel, and Tomcat for both single‑node and distributed environments.

Architect's Guide

Rate limiting generally involves two dimensions: time (a time window such as per second or per minute) and resources (maximum number of requests or concurrent connections). Combining these dimensions, a rule may limit, for example, 100 requests per second.

QPS and Connection Control – Limits can be set per IP or per server, often combining multiple rules (e.g., each IP < 10 QPS, each server < 1000 QPS, total connections < 200). High‑level rules may apply to a whole server group or data center.
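As an illustration of combining rules, the following is a minimal single-process sketch that enforces a per-IP and a per-server limit together, using fixed one-second windows. The class names, the `clock` parameter, and the choice of a fixed window are assumptions made for the example, not something the article prescribes.

```python
import time


class FixedWindowCounter:
    """Counts events per key within fixed one-second windows."""

    def __init__(self, limit, clock=time.monotonic):
        self.limit = limit          # maximum events per key per window
        self.clock = clock
        self.window = None          # index of the current one-second window
        self.counts = {}

    def _roll(self):
        current = int(self.clock())
        if current != self.window:  # a new second started: reset all counts
            self.window = current
            self.counts = {}

    def would_allow(self, key):
        self._roll()
        return self.counts.get(key, 0) < self.limit

    def record(self, key):
        self._roll()
        self.counts[key] = self.counts.get(key, 0) + 1


class CombinedLimiter:
    """Admits a request only if both the per-IP and per-server rules pass."""

    def __init__(self, per_ip_limit=10, per_server_limit=1000, clock=time.monotonic):
        self.per_ip = FixedWindowCounter(per_ip_limit, clock)
        self.per_server = FixedWindowCounter(per_server_limit, clock)

    def allow(self, ip):
        # Check both rules before recording, so a rejected request uses no quota.
        if not (self.per_ip.would_allow(ip) and self.per_server.would_allow("server")):
            return False
        self.per_ip.record(ip)
        self.per_server.record("server")
        return True
```

Checking both rules before recording is a design choice: it keeps a request that fails the server-wide rule from eating into the caller's per-IP quota.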

Transmission Rate – Rate limiting can be based on user groups, such as normal users limited to 100 KB/s and premium users to 10 MB/s.

Blacklist / Whitelist – Dynamic blacklists block IPs that exceed thresholds (e.g., bots or crawlers), while whitelists grant unrestricted access to trusted accounts.
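A dynamic blacklist with a whitelist override could look roughly like the sketch below. The names (`AccessLists`, `report_violation`), the violation threshold, and the idea of a timed ban are assumptions chosen for illustration; the article only states that offenders are blocked and trusted clients are exempt.

```python
import time


class AccessLists:
    def __init__(self, whitelist=(), threshold=100, ban_seconds=600, clock=time.monotonic):
        self.whitelist = set(whitelist)
        self.threshold = threshold      # violations tolerated before a ban
        self.ban_seconds = ban_seconds  # how long a ban lasts (assumed policy)
        self.clock = clock
        self.violations = {}
        self.banned_until = {}

    def is_blocked(self, client):
        if client in self.whitelist:
            return False                # whitelisted clients are never blocked
        until = self.banned_until.get(client)
        if until is None:
            return False
        if self.clock() >= until:
            del self.banned_until[client]   # ban expired
            return False
        return True

    def report_violation(self, client):
        """Call whenever a limiter rejects a request from `client`."""
        if client in self.whitelist:
            return
        count = self.violations.get(client, 0) + 1
        if count >= self.threshold:
            self.banned_until[client] = self.clock() + self.ban_seconds
            count = 0
        self.violations[client] = count
```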

Distributed Environment – In a cluster, rate‑limit data should be stored centrally so that all nodes share the same counters. Common approaches include gateway‑level limiting, middleware storing counters in Redis, and using Sentinel for distributed flow control.

Common Rate‑Limiting Algorithms

Token Bucket – Consists of tokens and a bucket. Tokens are generated at a fixed rate and accumulate up to the bucket's capacity, so a bucket that has been idle can absorb a short burst. A request proceeds only if it can acquire a token; otherwise it is either dropped or held in an optional buffering queue until new tokens appear.
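The token bucket described above can be sketched in a few lines. This is a minimal single-process version that refills lazily on each call instead of running a background timer; the class name, the `clock` parameter, and the decision to start with a full bucket are assumptions for the example. It drops requests that find no token and does not implement the optional queue.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.clock = clock
        self.tokens = float(capacity)   # start full, so an initial burst is allowed
        self.last = clock()

    def _refill(self):
        # Add the tokens earned since the last call, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self, n=1):
        self._refill()
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False                    # no token available: caller drops or queues
```

With `rate=2` and `capacity=4`, four requests can pass back to back, after which admission settles at two per second.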

Leaky Bucket – Requests are placed in a bucket and leak out at a constant rate, ensuring a steady output regardless of bursty input. When the bucket is full, incoming requests are discarded.
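A queue-based leaky bucket along these lines might be sketched as follows. The `offer`/`drain` split, the class name, and the `clock` parameter are assumptions made for the example: `offer` places a request in the bucket (or discards it when full), and `drain` hands back the requests whose turn to leave has come.

```python
import time
from collections import deque


class LeakyBucket:
    def __init__(self, leak_rate, capacity, clock=time.monotonic):
        self.interval = 1.0 / leak_rate   # seconds between two leaked requests
        self.capacity = capacity          # maximum requests waiting in the bucket
        self.clock = clock
        self.queue = deque()
        self.next_leak = clock()          # earliest time the next request may leave

    def offer(self, request):
        if len(self.queue) >= self.capacity:
            return False                  # bucket full: discard the request
        if not self.queue:
            # After an idle period, restart the schedule from now so that
            # unused time cannot turn into a later burst.
            self.next_leak = max(self.next_leak, self.clock())
        self.queue.append(request)
        return True

    def drain(self):
        """Return the requests that are due to leave the bucket by now."""
        now = self.clock()
        released = []
        while self.queue and self.next_leak <= now:
            released.append(self.queue.popleft())
            self.next_leak += self.interval
        return released
```

Unlike the token bucket, idle time here earns no credit: however bursty the input, requests leave at most one per `interval`.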

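Sliding Window – Counts the requests that fall within a time window ending at the current moment. Because the window moves forward continuously instead of resetting at fixed boundaries, it avoids the burst that a fixed window permits around a boundary and gives smoother throttling.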
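One common way to realize a sliding window is to keep a log of recent request timestamps. The sketch below assumes that approach; the class name and the `clock` parameter are illustrative. Memory grows with the limit, so very high limits are often served by bucketed approximations instead.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.timestamps = deque()   # arrival times of accepted requests, oldest first

    def allow(self):
        now = self.clock()
        # Evict entries that have slid out of the window (now - window, now].
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

For example, with a limit of 3 per second, three requests at t=0.75 cause a request at t=1.25 to be rejected, whereas a fixed window that reset at t=1 would have admitted it.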

Typical Rate‑Limiting Solutions

Legitimacy‑Verification Limiting – Captchas, IP blacklists, and similar checks that block malicious traffic.

Guava Limiting – Uses RateLimiter for single‑node throttling; not suitable for distributed scenarios.

Gateway‑Level Limiting – Applies rules at the entry point (e.g., Nginx, Spring Cloud Gateway, Zuul). Nginx provides two main methods:

Rate control using limit_req_zone and limit_req (e.g., rate=2r/s on the zone, with burst=4 on limit_req to absorb short bursts).

Concurrent connection control using limit_conn_zone and limit_conn (e.g., limit_conn perip 10, limit_conn perserver 100).

Middleware Limiting – Stores counters in a distributed store such as Redis. The time‑window algorithm can be implemented with Redis sorted sets, while the leaky‑bucket algorithm can be realized with the Redis‑Cell module (its CL.THROTTLE command).
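The sorted-set approach can be sketched as below. The function name and key layout are hypothetical, and `client` is assumed to expose redis-py style methods (`zremrangebyscore`, `zcard`, `zadd`, `expire`). The calls are issued one by one for readability; in a real deployment they would typically be wrapped in a MULTI/EXEC transaction or a Lua script so that concurrent nodes cannot interleave between the count and the add.

```python
import time
import uuid


def allow_request(client, key, limit, window_seconds, now=None):
    """Sliding-window check backed by a Redis sorted set.

    Each accepted request is stored as a member whose score is its
    arrival time, so the set's cardinality is the count in the window.
    """
    now = time.time() if now is None else now
    # Drop members whose score (timestamp) has fallen out of the window.
    client.zremrangebyscore(key, 0, now - window_seconds)
    if client.zcard(key) >= limit:
        return False
    # Unique member per request; the score is the arrival time.
    client.zadd(key, {uuid.uuid4().hex: now})
    # Let idle keys expire so abandoned counters do not pile up.
    client.expire(key, int(window_seconds) + 1)
    return True
```

Because every node talks to the same key (for instance one key per client IP), all nodes in the cluster share one counter.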

Rate‑Limiting Components – Open‑source solutions like Sentinel (Alibaba) provide rich APIs and a visual console for managing limits.

Architectural Considerations

In real projects, several limiting mechanisms are combined into a layered strategy: coarse limits at the gateway, then finer‑grained limits in middleware and in individual components. Excess traffic is shed as early as possible, which protects availability and keeps downstream resources well utilized.
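The layering idea can be illustrated with a small sketch in which a request passes through an ordered list of checks and stops at the first one that rejects it. Everything here is hypothetical: the function name, the layer names, the request fields, and the lambda checks merely stand in for real limiters at each level.

```python
def first_rejecting_layer(layers, request):
    """Run `request` through each (name, check) pair in order.

    Returns the name of the first layer whose check returns False,
    or None if every layer admits the request.
    """
    for name, check in layers:
        if not check(request):
            return name
    return None


# Hypothetical checks standing in for real limiters at each layer.
BLOCKED_IPS = {"203.0.113.7"}

layers = [
    ("gateway", lambda req: req["ip"] not in BLOCKED_IPS),
    ("middleware", lambda req: req["requests_this_second"] <= 10),
    ("component", lambda req: req["path"] != "/export" or req["premium"]),
]
```

Ordering the cheapest and coarsest checks first means most rejected traffic never reaches the more expensive inner layers.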

Implementation Techniques

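1) Tomcat – Set maxThreads on the Connector in conf/server.xml to cap the number of request‑processing threads; excess requests are queued, and the connector's maxConnections and acceptCount attributes bound how many can wait before new connections are refused.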

2) Nginx – Use limit_req_zone and limit_req (with burst) for rate limiting, and limit_conn_zone / limit_conn for connection limiting.

3) Redis – Implement time‑window counters with sorted sets, leaky‑bucket with the Redis‑Cell module, and token‑bucket via Lua scripts.

4) Guava – Apply RateLimiter for single‑machine throttling.

Note that Redis‑based limiting works across a distributed system, whereas Guava's RateLimiter applies only within a single JVM.
