Understanding Service Rate Limiting: Algorithms and Distributed Implementation
This article explains why service rate limiting is essential for high‑concurrency systems, compares it with service degradation, introduces common limiting algorithms such as counter, leaky‑bucket and token‑bucket, and shows how to implement distributed rate limiting using Redis for real‑world backend services.
When building high‑concurrency systems, stability is usually ensured through three mechanisms: service degradation, rate limiting, and caching. Among them, rate limiting is the focus of this article.
Difference Between Service Degradation and Rate Limiting
Service degradation shuts down certain interfaces or pages under sudden load to free resources for core tasks.
Rate limiting reduces incoming traffic while keeping the service’s processing capacity unchanged; degradation reduces the service’s capacity and reallocates resources.
Note that the rate limiting discussed here is implemented in business code, not at the Nginx layer.
Why Do We Need Rate Limiting?
From the caller’s perspective, services can be user‑facing (web, external APIs) or internal RPC services. Factors that can cause traffic spikes include rapid user growth, hot events, malicious attacks or scraping, and sudden, unpredictable surges that leave no time to scale out.
Common Rate‑Limiting Algorithms
Three widely used algorithms are reviewed below.
Counter Algorithm
The simplest method counts requests within a fixed time window (e.g., 100 requests per second) and rejects anything over the limit. It has two drawbacks: a burst early in the window exhausts the quota and starves later requests, and a burst straddling a window boundary can briefly pass up to twice the limit, which attackers can exploit.
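A minimal single-process sketch of the fixed-window counter (class and parameter names are illustrative, not from the original article; the clock is injectable only to make the behavior easy to test):

```python
import time

class FixedWindowCounter:
    """Reject requests once the count in the current window exceeds the limit."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock              # injectable clock for deterministic tests
        self.window_start = clock()
        self.count = 0

    def allow(self):
        now = self.clock()
        if now - self.window_start >= self.window:
            # A new window has started: reset the counter.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

Because the counter resets abruptly at the boundary, traffic concentrated just before and just after the reset can briefly double the intended rate, which motivates the sliding-window refinement below.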
A classic improvement is the sliding‑window technique, which divides a larger window into smaller sub‑windows to smooth out bursts.
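A runnable sketch of that sliding-window idea, again with illustrative names and an injectable clock: the window is split into sub-windows, and a request is allowed only if the sum over the most recent sub-windows is under the limit.

```python
import time
from collections import deque

class SlidingWindowCounter:
    """Split the window into sub-windows and rate-limit on their rolling sum."""

    def __init__(self, limit, window_seconds, buckets=10, clock=time.monotonic):
        self.limit = limit
        self.bucket_span = window_seconds / buckets   # width of one sub-window
        self.buckets = buckets
        self.clock = clock
        self.counts = deque()                         # (bucket_index, count) pairs

    def allow(self):
        idx = int(self.clock() / self.bucket_span)
        # Evict sub-windows that have slid entirely out of the window.
        while self.counts and self.counts[0][0] <= idx - self.buckets:
            self.counts.popleft()
        if sum(c for _, c in self.counts) >= self.limit:
            return False
        if self.counts and self.counts[-1][0] == idx:
            self.counts[-1] = (idx, self.counts[-1][1] + 1)
        else:
            self.counts.append((idx, 1))
        return True
```

With more sub-windows the boundary effect shrinks, at the cost of a little extra bookkeeping per request.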
Leaky‑Bucket Algorithm
This algorithm models a bucket that drains at a constant rate; excess requests overflow and are dropped. It guarantees a steady processing rate but cannot handle short‑term traffic spikes.
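A sketch of the leaky bucket under the same conventions (illustrative names, injected clock): each request adds one unit of "water", the bucket drains continuously at a fixed rate, and requests that would overflow are dropped.

```python
import time

class LeakyBucket:
    """Requests fill the bucket; it drains at a constant rate; overflow is dropped."""

    def __init__(self, capacity, leak_rate, clock=time.monotonic):
        self.capacity = capacity      # maximum amount of queued "water"
        self.leak_rate = leak_rate    # units drained per second
        self.clock = clock
        self.water = 0.0
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Drain in proportion to the time elapsed since the last request.
        self.water = max(0.0, self.water - (now - self.last) * self.leak_rate)
        self.last = now
        if self.water + 1 <= self.capacity:
            self.water += 1
            return True
        return False
```

Note how the outflow rate is constant regardless of how requests arrive, which is exactly why a legitimate short burst gets rejected once the bucket is full.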
Token‑Bucket Algorithm
A bucket holds a limited number of tokens that are replenished at a fixed rate. A request proceeds only after acquiring a token, allowing short bursts when tokens have accumulated, thus addressing the leaky‑bucket’s limitation.
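The token bucket differs from the leaky bucket in one line of accounting: capacity accumulates when the service is idle, so a burst can spend saved-up tokens. A sketch in the same style (names illustrative, clock injectable):

```python
import time

class TokenBucket:
    """Tokens refill at a fixed rate; each request consumes one token."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate   # tokens added per second
        self.tokens = float(capacity)    # start full, so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The capacity bounds the burst size while the refill rate bounds the sustained average, which is the property the leaky bucket lacks.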
From Single‑Machine to Distributed Rate Limiting
Single‑machine limiting often cannot satisfy complex business requirements such as limiting a user to three accesses per 10 seconds or 100 accesses per day. Distributed limiting solves this by using a shared counter, typically backed by Redis, which offers native support for distributed operations.
Implementation sketch: for each request, build a Redis key from the user ID and API name, issue an INCR command, and set an expiration on the key the first time it is created to enforce the time window. In production the INCR and EXPIRE should be made atomic (for example, via a Lua script), so a crash between the two commands cannot leave a counter key that never expires.
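That sketch can be expressed as follows. To keep the example self-contained and runnable without a server, a tiny in-memory stand-in replaces the real Redis client; with a real client (e.g., redis-py) the `incr`/`expire` calls would be the same, but should be wrapped in a Lua script or pipeline for atomicity. All names here are illustrative.

```python
import time

class FakeRedis:
    """Minimal in-memory stand-in for a Redis client (incr/expire only),
    used so this sketch runs without a Redis server."""

    def __init__(self, clock=time.monotonic):
        self.store = {}                  # key -> (count, expires_at or None)
        self.clock = clock

    def incr(self, key):
        count, expires_at = self.store.get(key, (0, None))
        if expires_at is not None and self.clock() >= expires_at:
            count, expires_at = 0, None  # the key has expired: start over
        count += 1
        self.store[key] = (count, expires_at)
        return count

    def expire(self, key, seconds):
        count, _ = self.store.get(key, (0, None))
        self.store[key] = (count, self.clock() + seconds)

def allow_request(client, user_id, api, limit, window_seconds):
    """Distributed fixed-window limit: INCR a per-user, per-API key and
    set the expiry only on the first increment, anchoring the window."""
    key = f"ratelimit:{user_id}:{api}"
    count = client.incr(key)
    if count == 1:
        client.expire(key, window_seconds)
    return count <= limit
```

Because every application instance increments the same shared key, the limit holds across the whole cluster, which is what single-machine counters cannot provide.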