
Optimizing High‑Concurrency Menu Services with Go sync.Pool Object Pool

This article explains the principles of object pools, details the internal implementation of Go's sync.Pool, and demonstrates through benchmarks how using an object pool can dramatically reduce memory allocation and latency for high‑traffic menu services in a restaurant ordering application.

Yum! Tech Team

1. Introduction

In modern digital platforms, online services must handle ever-growing real-time traffic, making high concurrency and performance essential. In a restaurant ordering app, every menu request creates a new object, and as traffic grows this constant allocation imposes heavy memory and CPU overhead.

2. Object Pool Basics

An object pool pre-allocates a set of instances; when one is needed, a goroutine checks it out, and after use the object is returned to the pool, avoiding repeated allocation and garbage collection. This article focuses on Go's sync.Pool, whose core is a linked list (poolChain) of ring buffers (poolDequeue).

2.1 Implementation Details

type poolChain struct {
    // head is the poolDequeue to push to. This is only accessed
    // by the producer, so doesn't need to be synchronized.
    head *poolChainElt
    // tail is the poolDequeue to popTail from. This is accessed
    // by consumers, so reads and writes must be atomic.
    tail *poolChainElt
}

type poolChainElt struct {
    poolDequeue
    // next and prev link to the adjacent poolChainElts in this
    // poolChain.
    //
    // next is written atomically by the producer and read
    // atomically by the consumer. It only transitions from nil to
    // non‑nil.
    //
    // prev is written atomically by the consumer and read
    // atomically by the producer. It only transitions from
    // non‑nil to nil.
    next, prev *poolChainElt
}

type poolDequeue struct {
    // headTail packs together a 32‑bit head index and a 32‑bit tail index.
    // Both are indexes into vals modulo len(vals)-1.
    //
    // tail = index of oldest data in queue
    // head = index of next slot to fill
    //
    // Slots in the range [tail, head) are owned by consumers.
    // A consumer continues to own a slot outside this range until
    // it nils the slot, at which point ownership passes to the
    // producer.
    //
    // The head index is stored in the most-significant bits so
    // that we can atomically add to it and the overflow is
    // harmless.
    headTail uint64
    // vals is a ring buffer of interface{} values stored in this dequeue.
    // The size of this must be a power of 2.
    //
    // vals[i].typ is nil if the slot is empty and non‑nil otherwise.
    // A slot is still in use until *both* the tail index has moved
    // beyond it and typ has been set to nil. This is set to nil
    // atomically by the consumer and read atomically by the producer.
    vals []eface
}

The core of sync.Pool is the poolChain linked list plus a ring buffer, providing lock‑free access via atomic operations on a combined headTail 64‑bit value.

2.2 Lock-Free Operations

The pool updates headTail with atomic operations such as atomic.CompareAndSwapUint64. Because the head index occupies the upper 32 bits, advancing it by one is a single left-shifted atomic add:

const dequeueBits = 32
atomic.AddUint64(&d.headTail, 1<<dequeueBits)

When several goroutines race on the same slot, the CAS guarantees that exactly one of them succeeds in updating the shared headTail; the losers observe the changed value and retry, so no mutex is required.

2.3 Benchmark

Two benchmarks compare object creation with and without a pool. The pooled version reuses NInfo objects, resetting each one via an init method before returning it to the pool.

package tmp

import (
	"sync"
	"testing"
	"time"
)

// pool hands out *NInfo values, allocating a new one only when empty.
var pool = sync.Pool{New: func() interface{} { return new(NInfo) }}

type NInfo struct {
	name, local, content string
}

// init resets all fields so a reused object carries no stale data.
func (y *NInfo) init() {
	y.name, y.local, y.content = "", "", ""
}

// newNObj simulates an expensive construction path (100 ms per object).
func newNObj() *NInfo {
	time.Sleep(100 * time.Millisecond)
	return new(NInfo)
}

func BenchmarkWrite(b *testing.B) {
	for n := 0; n < b.N; n++ {
		for j := 0; j < 20000; j++ {
			p := newNObj()
			p.name = "旅游景区"    // "scenic area"
			p.local = "南京市区"   // "Nanjing city"
			p.content = "夫子庙、紫荆山、钟山陵" // local attractions
		}
	}
}

func BenchmarkWriteWithPool(b *testing.B) {
	for n := 0; n < b.N; n++ {
		for j := 0; j < 20000; j++ {
			p := pool.Get().(*NInfo)
			p.name = "旅游景区"
			p.local = "南京市区"
			p.content = "夫子庙、紫荆山、钟山陵"
			p.init() // reset before returning to the pool
			pool.Put(p)
		}
	}
}

In the reported results, memory usage drops from ~480 KB per operation to near zero, and latency improves from ~108 ms/op to ~14 ns/op (the unpooled path pays the simulated 100 ms construction cost on every object), confirming the efficiency of the pool.

2.4 Application Scenarios

Object pools suit connection pools, thread pools, and any high-frequency object-creation pattern where reuse can cut allocation overhead.

2.5 Practical Use in Menu Service

In the menu API, shared (common) data is stored locally while store-specific data resides in Redis. Under high traffic, large per-request allocations caused frequent GC pauses. Two mitigation strategies were considered: refactoring the data structures or employing an object pool; the pool approach was chosen and integrated.

After the pool was applied, GC pressure and CPU usage dropped dramatically, as confirmed by flame-graph analysis; before the change, the clone operation had dominated CPU time.

3. Conclusion

Reducing memory allocations, shortening object lifetimes, and using lock-free structures such as sync.Pool are key techniques for building high-performance, low-latency services in Go.

Tags: performance, memory management, Go, high concurrency, benchmark, sync.Pool, object pool
Written by Yum! Tech Team

How we support the digital platform of China's largest restaurant group—technology behind hundreds of millions of consumers and over 12,000 stores.