
An Introduction to Rate Limiting: Concepts, Classifications, and Go Implementation

This article explains the fundamentals of rate limiting, its importance for high‑availability services, various classification dimensions, common algorithms such as fixed‑window, sliding‑window, leaky‑bucket and token‑bucket, and demonstrates practical usage with Go's golang.org/x/time/rate library including code examples and configuration tips.


Rate Limiting Overview

Rate limiting (or flow control) restricts the number of events that can enter a system within a given time window, protecting services from overload, keeping response times stable, and preventing cascading failures. It is commonly contrasted with circuit breaking: a circuit breaker is typically implemented on the caller (client) side, while rate limiting is enforced on the server side.

Classification of Rate Limiting

Granularity

Single‑node (per‑service) rate limiting

Distributed rate limiting (e.g., NGINX + Redis, gateway clusters)

Single‑node limits traffic at an individual service instance, while distributed approaches coordinate limits across multiple nodes using a shared store, usually Redis, to achieve global consistency.

Target Types

Request‑based limiting (e.g., QPS, total request count)

Resource‑based limiting (e.g., TCP connections, threads, memory usage)

Request‑based limits control the number of incoming calls, whereas resource‑based limits protect critical system resources.

Algorithm Types

Fixed‑window counter

Sliding‑window counter

Leaky‑bucket

Token‑bucket

Fixed‑Window Counter

The simplest algorithm maintains a counter for a fixed time interval; when the interval expires the counter resets.

package limit

import (
    "sync"
    "time"
)

// Counter implements a fixed-window rate limiter.
type Counter struct {
    mu          sync.Mutex
    count       uint64 // requests seen in the current window
    limit       uint64 // max requests per window
    interval    int64  // window length in milliseconds
    refreshTime int64  // start of the current window, in milliseconds
}

func NewCounter(limit uint64, interval int64) *Counter {
    return &Counter{
        limit:       limit,
        interval:    interval,
        refreshTime: time.Now().UnixMilli(),
    }
}

// RateLimit reports whether the current request is allowed.
func (c *Counter) RateLimit() bool {
    c.mu.Lock()
    defer c.mu.Unlock()
    now := time.Now().UnixMilli()
    if now >= c.refreshTime+c.interval {
        // The window has expired: start a new one.
        c.refreshTime = now
        c.count = 0
    }
    c.count++
    return c.count <= c.limit
}

This method is simple to implement, but it suffers at window boundaries: a burst straddling two adjacent windows can admit up to twice the limit in a short span, and traffic is unevenly distributed within each window.

Sliding‑Window Counter

The sliding window divides the interval into many small slots, each with its own counter, and slides the window forward, aggregating counts across slots to provide smoother limiting.

Leaky‑Bucket

Requests enter a fixed‑size queue (the bucket) and are released at a constant rate; excess requests are dropped, smoothing traffic but potentially increasing latency for bursts.
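A common way to sketch a leaky bucket without an actual queue is to track a "water level" that drains at a constant rate; the structure and names below are illustrative assumptions, not a reference implementation:

```go
package main

import (
	"sync"
	"time"
)

// LeakyBucket models the queue as a water level that leaks at a fixed rate.
type LeakyBucket struct {
	mu       sync.Mutex
	capacity float64   // bucket size: max queued requests
	ratePerS float64   // leak rate, requests per second
	water    float64   // current level
	last     time.Time // time of the last update
}

func NewLeakyBucket(capacity, ratePerS float64) *LeakyBucket {
	return &LeakyBucket{capacity: capacity, ratePerS: ratePerS, last: time.Now()}
}

// Allow leaks water proportional to elapsed time, then tries to add one unit.
func (b *LeakyBucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.water -= now.Sub(b.last).Seconds() * b.ratePerS
	if b.water < 0 {
		b.water = 0
	}
	b.last = now
	if b.water+1 > b.capacity {
		return false // bucket full: drop the request
	}
	b.water++
	return true
}
```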

Token‑Bucket

Tokens are added to a bucket at a steady rate; each request consumes a token. If the bucket is empty, the request is rejected. This algorithm allows bursts up to the bucket capacity while enforcing an average rate.
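Before turning to the library, here is a hand-rolled sketch of the algorithm that refills tokens lazily on each call (field names and the float-based bookkeeping are illustrative assumptions):

```go
package main

import (
	"sync"
	"time"
)

// TokenBucket refills tokens in proportion to elapsed time, capped at capacity.
type TokenBucket struct {
	mu       sync.Mutex
	capacity float64   // max tokens (burst size)
	ratePerS float64   // refill rate, tokens per second
	tokens   float64   // current token count
	last     time.Time // time of the last refill
}

func NewTokenBucket(capacity, ratePerS float64) *TokenBucket {
	// Start full so an initial burst up to capacity is allowed.
	return &TokenBucket{capacity: capacity, ratePerS: ratePerS, tokens: capacity, last: time.Now()}
}

// Allow refills the bucket, then consumes one token if available.
func (b *TokenBucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.ratePerS
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}
```

Note the symmetry with the leaky bucket: one drains a level that requests fill, the other fills a balance that requests drain.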

Go Rate‑Limiting Library (golang.org/x/time/rate)

The golang.org/x/time/rate package (an official Go sub-repository, not part of the standard library proper) provides a token-bucket implementation via rate.NewLimiter. It accepts a Limit (events per second) and a burst size (the maximum number of tokens the bucket can hold).

func NewLimiter(r Limit, b int) *Limiter

Example:

limiter := rate.NewLimiter(10, 5) // 10 events/sec, burst up to 5

The library offers three families of methods:

Allow / AllowN: non-blocking checks; they return false if the request would exceed the limit.

Wait / WaitN: block until enough tokens are available or the provided context is cancelled or its deadline expires.

Reserve / ReserveN: reserve tokens for future use, returning a Reservation that can be inspected, delayed, or cancelled.

Sample usage of Allow:

func AllowDemo() {
    // One token every 200ms (5 events/sec), with a burst of 5.
    limiter := rate.NewLimiter(rate.Every(200*time.Millisecond), 5)
    for i := 1; i <= 15; i++ {
        if limiter.Allow() {
            fmt.Println(i, "====Allow====", time.Now())
        } else {
            fmt.Println(i, "====Disallow====", time.Now())
        }
        time.Sleep(80 * time.Millisecond)
    }
}

Sample usage of WaitN with a timeout context:

func WaitNDemo() {
    limiter := rate.NewLimiter(10, 5) // 10 tokens/sec, burst of 5
    for i := 1; i <= 10; i++ {
        // Each iteration needs 4 tokens; at 10 tokens/sec that takes ~400ms,
        // so the context deadline is tight and later iterations may time out.
        ctx, cancel := context.WithTimeout(context.Background(), 400*time.Millisecond)
        if err := limiter.WaitN(ctx, 4); err != nil {
            fmt.Println("error:", err)
            cancel()
            continue
        }
        fmt.Println(i, "executed at", time.Now())
        cancel()
    }
}

Sample usage of ReserveN to obtain a delay before execution:

func ReserveNDemo() {
    limiter := rate.NewLimiter(10, 5)
    for i := 1; i <= 10; i++ {
        r := limiter.ReserveN(time.Now(), 4)
        if !r.OK() {
            // n exceeds the limiter's burst size; the reservation can never succeed.
            return
        }
        // Delay reports how long to wait before acting on the reservation.
        time.Sleep(r.Delay())
        fmt.Println("executed:", time.Now())
    }
}

The limiter also supports dynamic adjustments via SetBurst, SetBurstAt, SetLimit, and SetLimitAt, allowing services to adapt limits based on real-time metrics such as QPS, CPU usage, or latency.

Choosing the Right Strategy

Fixed‑window: simple, suitable for emergency stop‑gap measures.

Sliding‑window: smooths the boundary bursts of fixed windows at modest implementation cost.

Leaky‑bucket: enforces smooth output, good for uniform traffic requirements.

Token‑bucket: best for systems expecting occasional spikes while maintaining high throughput.

Conclusion

Rate limiting is a crucial component of service governance. Understanding its classifications, algorithms, and practical Go implementations helps developers design resilient, high‑performance back‑end systems that can gracefully handle traffic surges and protect shared resources.

Tags: backend, distributed systems, algorithm, golang, rate limiting, token bucket
Written by

Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large-scale distributed, and high-availability architectures, as well as architecture evolution with internet technologies. Architects who enjoy thinking and sharing are welcome to exchange ideas and learn together.
