Fundamentals and Common Practices of Rate Limiting in Distributed Systems

This article explains the basic concepts, dimensions, and typical algorithms of rate limiting, discusses various implementation strategies such as token bucket, leaky bucket, and sliding window, and reviews practical solutions using Nginx, Guava, Redis, Sentinel, and Tomcat for both single‑node and distributed environments.

Architect's Guide

Mar 21, 2023

Rate limiting generally involves two dimensions: time (a time window such as per second or per minute) and resources (maximum number of requests or concurrent connections). Combining these dimensions, a rule may limit, for example, 100 requests per second.

QPS and Connection Control – Limits can be set per IP or per server, often combining multiple rules (e.g., each IP < 10 QPS, each server < 1000 QPS, total connections < 200). High‑level rules may apply to a whole server group or data center.
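As a rough illustration of combining rules, the sketch below checks a per‑IP limit and a per‑server limit using simple one‑second fixed‑window counters. It is not taken from the original article: the names FixedWindowCounter and allow_request, the thresholds, and the explicit now argument (used instead of a real clock so the behavior is easy to follow) are all illustrative assumptions.

```python
class FixedWindowCounter:
    """Counts events per key inside fixed windows of `window` seconds."""

    def __init__(self, limit, window=1.0):
        self.limit = limit
        self.window = window
        self.state = {}  # key -> (window index, count in that window)

    def _current(self, key, now):
        index = int(now // self.window)
        stored_index, count = self.state.get(key, (index, 0))
        # A count from an older window no longer applies.
        return index, (count if stored_index == index else 0)

    def would_allow(self, key, now):
        _, count = self._current(key, now)
        return count < self.limit

    def record(self, key, now):
        index, count = self._current(key, now)
        self.state[key] = (index, count + 1)


# Hypothetical thresholds, loosely following the example in the text.
per_ip = FixedWindowCounter(limit=10)        # each IP: at most 10 requests per second
per_server = FixedWindowCounter(limit=1000)  # whole server: at most 1000 per second


def allow_request(ip, now):
    """Admit a request only if every rule still has room, then count it."""
    if not (per_ip.would_allow(ip, now) and per_server.would_allow("server", now)):
        return False
    per_ip.record(ip, now)
    per_server.record("server", now)
    return True
```

Checking all rules before recording anything keeps a request that is rejected by one rule from consuming quota under another.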

Transmission Rate – Rate limiting can be based on user groups, such as normal users limited to 100 KB/s and premium users to 10 MB/s.

Blacklist / Whitelist – Dynamic blacklists block IPs that exceed thresholds (e.g., bots or crawlers), while whitelists grant unrestricted access to trusted accounts.

Distributed Environment – In a cluster, rate‑limit data should be stored centrally so that all nodes share the same counters. Common approaches include gateway‑level limiting, middleware storing counters in Redis, and using Sentinel for distributed flow control.

Common Rate‑Limiting Algorithms

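Token Bucket – Tokens are generated at a fixed rate and accumulate in a bucket up to its capacity; once the bucket is full, new tokens are discarded. A request proceeds only if it can acquire a token, so short bursts up to the bucket capacity are allowed while the long‑run rate stays bounded by the token rate. Requests that cannot get a token are either dropped or held in an optional buffer queue until new tokens appear.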
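The token bucket can be sketched in a few lines of Python. This is a minimal single‑process illustration under stated assumptions, not a production implementation: the class name TokenBucket, the method try_acquire, and the explicit now argument (passed in rather than read from a clock) are hypothetical choices made for clarity.

```python
class TokenBucket:
    """Tokens refill at `rate` per second and are capped at `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full, so an initial burst is allowed
        self.last = now

    def try_acquire(self, now, tokens=1):
        # Add the tokens generated since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False  # the caller decides whether to queue or drop the request
```

With rate=2 and capacity=4, four requests arriving at the same instant pass, the fifth is rejected, and one more is admitted half a second later when a new token has accumulated.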

Leaky Bucket – Requests are placed in a bucket and leak out at a constant rate, ensuring a steady output regardless of bursty input. When the bucket is full, incoming requests are discarded.
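A queue‑based leaky bucket might look like the following sketch. It assumes a worker calls leak periodically to release the requests that are due; the names LeakyBucket, offer, and leak, and the explicit now argument, are illustrative assumptions rather than anything defined in the article.

```python
from collections import deque


class LeakyBucket:
    """Requests queue up to `capacity` and leave at `leak_rate` per second."""

    def __init__(self, leak_rate, capacity, now=0.0):
        self.leak_rate = leak_rate
        self.capacity = capacity
        self.queue = deque()
        self.last_leak = now

    def offer(self, request, now):
        """Try to enqueue a request; returns False when the bucket is full."""
        if len(self.queue) >= self.capacity:
            return False
        if not self.queue:
            # An empty bucket earns no leak credit: draining starts from now.
            self.last_leak = now
        self.queue.append(request)
        return True

    def leak(self, now):
        """Return the requests due to leave by `now`, oldest first."""
        due = int((now - self.last_leak) * self.leak_rate)
        if due <= 0:
            return []
        # Advance by whole leak intervals only, so fractional time is kept.
        self.last_leak += due / self.leak_rate
        return [self.queue.popleft() for _ in range(min(due, len(self.queue)))]
```

However bursty the arrivals, requests leave no faster than leak_rate per second, which is the property that distinguishes this from the token bucket.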

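Sliding Window – Counts requests within a time window that moves with the current time instead of resetting at fixed boundaries. This avoids the spike a fixed window permits around a boundary, where up to twice the limit can pass in a short span, and therefore throttles more smoothly.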
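One common way to realize a sliding window is to keep a log of request timestamps and count only those still inside the window. The sketch below assumes a single process and passes the current time in explicitly; SlidingWindowLimiter and allow are hypothetical names.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allows at most `limit` requests in any interval of `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.timestamps = deque()  # admitted request times, oldest first

    def allow(self, now):
        # Drop timestamps that have slid out of the window (now - window, now].
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

The log costs memory proportional to the limit per key; counter‑based approximations trade some accuracy for a smaller footprint.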

Typical Rate‑Limiting Solutions

Legitimacy‑Verification Limiting – Captchas, IP blacklists, and similar checks that block malicious traffic.

Guava Limiting – Uses RateLimiter for single‑node throttling; not suitable for distributed scenarios.

Gateway‑Level Limiting – Applies rules at the entry point (e.g., Nginx, Spring Cloud Gateway, Zuul). Nginx provides two main methods:

Rate control using limit_req_zone together with limit_req (e.g., rate=2r/s, with burst=4 on limit_req to absorb short bursts).

Concurrent connection control using limit_conn_zone and limit_conn (e.g., limit_conn perip 10, limit_conn perserver 100).
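The idea behind concurrent connection control can be illustrated independently of Nginx. The Python sketch below is not how Nginx implements limit_conn; it only shows the bookkeeping for a per‑IP cap plus a per‑server cap. ConnectionLimiter, open, and close are hypothetical names, and the caps in the usage note are chosen for illustration.

```python
import threading
from collections import defaultdict


class ConnectionLimiter:
    """Caps concurrently open connections per IP and for the whole server."""

    def __init__(self, per_ip, per_server):
        self.per_ip = per_ip
        self.per_server = per_server
        self.active_by_ip = defaultdict(int)
        self.active_total = 0
        self.lock = threading.Lock()

    def open(self, ip):
        """Register a new connection; returns False if either cap is reached."""
        with self.lock:
            if self.active_total >= self.per_server or self.active_by_ip[ip] >= self.per_ip:
                return False
            self.active_by_ip[ip] += 1
            self.active_total += 1
            return True

    def close(self, ip):
        """Release a connection previously admitted by open()."""
        with self.lock:
            if self.active_by_ip[ip] > 0:
                self.active_by_ip[ip] -= 1
                self.active_total -= 1
```

Unlike rate limits, these counters have no time dimension: a slot is freed only when a connection closes, so ConnectionLimiter(per_ip=10, per_server=100) bounds how many connections are open at once, not how many arrive per second.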

Middleware Limiting – Stores counters in a distributed store such as Redis. The time‑window algorithm can be implemented with Redis sorted sets, while a leaky‑bucket‑style limiter is available through the Redis‑Cell module, which adds the CL.THROTTLE command.
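The sorted‑set approach can be sketched as follows: each request is stored as a member scored by its timestamp, expired members are trimmed, and the remaining cardinality is compared with the limit. The function assumes a client exposing redis‑py style methods (zremrangebyscore, zcard, zadd, expire); the function name, the key format, and the in‑memory FakeSortedSets stand‑in are assumptions added here so the logic can be followed without a Redis server. The four commands are not atomic as written; a real deployment would typically group them in a transaction or a Lua script.

```python
import time
import uuid


def sliding_window_allow(client, key, limit, window_seconds, now=None):
    """Sliding-window check backed by a sorted set of request timestamps."""
    now = time.time() if now is None else now
    # 1. Remove members whose score (timestamp) has left the window.
    client.zremrangebyscore(key, 0, now - window_seconds)
    # 2. Reject if the window is already full.
    if client.zcard(key) >= limit:
        return False
    # 3. Record this request; a unique member keeps same-time requests distinct.
    client.zadd(key, {uuid.uuid4().hex: now})
    # 4. Let idle keys expire so they do not accumulate.
    client.expire(key, int(window_seconds) + 1)
    return True


class FakeSortedSets:
    """In-memory stand-in for the four commands used above (demo only)."""

    def __init__(self):
        self.data = {}  # key -> {member: score}

    def zremrangebyscore(self, key, lo, hi):
        zset = self.data.get(key, {})
        for member in [m for m, s in zset.items() if lo <= s <= hi]:
            del zset[member]

    def zcard(self, key):
        return len(self.data.get(key, {}))

    def zadd(self, key, mapping):
        self.data.setdefault(key, {}).update(mapping)

    def expire(self, key, seconds):
        pass  # expiry is not simulated in this stand-in
```

Because every node talks to the same key, the count is shared across the cluster, which is what makes this variant suitable for distributed deployments.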

Rate‑Limiting Components – Open‑source solutions like Sentinel (Alibaba) provide rich APIs and a visual console for managing limits.

Architectural Considerations

In real projects, multiple limiting mechanisms are combined into a layered strategy, from coarse gateway limits down to fine‑grained middleware and component limits, so that overload is shed as early as possible and the capacity of each layer is protected.

Implementation Techniques

1) Tomcat – Set maxThreads on the Connector in conf/server.xml to cap the number of request‑processing threads; when all threads are busy, additional requests wait in a queue.

2) Nginx – Use limit_req_zone and limit_req (with its burst parameter) for rate limiting, and limit_conn_zone / limit_conn for connection limiting.

3) Redis – Implement time‑window counters with sorted sets, leaky‑bucket with Redis‑Cell, and token‑bucket via Lua scripts.

4) Guava – Apply RateLimiter for single‑machine throttling.

Note that Redis‑based limits are shared across nodes and therefore work in distributed systems, whereas Guava's RateLimiter applies only within a single JVM.
