Performance Testing of Java and Go High‑Performance Message Queues Using LinkedBlockingQueue
This article presents a detailed performance evaluation of high-throughput message queues in Java and Go, starting with LinkedBlockingQueue. It covers test scenarios based on message size and thread count, analyzes producer and consumer results, reports benchmark data, and shares the Groovy test cases for reproducibility.
Conclusion
Overall, java.util.concurrent.LinkedBlockingQueue can sustain around 500k QPS, which meets current load-testing needs, but its performance becomes unstable once the queue grows long. Three practical recommendations: keep message bodies small, expect limited gains from adding more threads, and avoid letting the queue back up.
Introduction
After publishing articles on Disruptor and a ten‑million‑level log replay engine, I prepared performance tests for several high‑performance message queues in Java and Go, selecting benchmark scenarios and application cases.
The test scenario design varies two factors: message body size (simulated with GET requests of different sizes) and the number of producer/consumer threads (goroutines in Go).
Note: In subsequent Go articles, "thread" always refers to a goroutine.
Object Overview
The first object under test is java.util.concurrent.LinkedBlockingQueue, an optionally-bounded blocking queue based on linked nodes that follows FIFO ordering. Its Javadoc notes that linked queues typically have higher throughput than array-based queues, but less predictable performance in most concurrent applications.
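The FIFO contract can be shown in a few lines. The sketch below is plain Java (the class name FifoDemo is ours, not from the article): items go in with put and come back in insertion order with take.

```java
import java.util.concurrent.LinkedBlockingQueue;

public class FifoDemo {

    // Put items in, take them back: LinkedBlockingQueue preserves insertion order.
    static String roundTrip(String... items) {
        // No capacity argument, so the queue is unbounded and put() never blocks.
        LinkedBlockingQueue<String> q = new LinkedBlockingQueue<>();
        StringBuilder out = new StringBuilder();
        try {
            for (String s : items) q.put(s);
            while (!q.isEmpty()) out.append(q.take()); // take() blocks only on an empty queue
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("a", "b", "c")); // prints "abc"
    }
}
```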
Among the queue implementations shipped with the JDK, LinkedBlockingQueue performs best, with ArrayBlockingQueue as the runner-up. Reported figures suggest LinkedBlockingQueue is roughly 2-3 times faster than ArrayBlockingQueue.
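That ratio is easy to sanity-check, though not to prove, with a rough loop like the one below. This is our own single-threaded sketch, not the article's benchmark; a rigorous comparison would use JMH with concurrent producers and consumers.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueCompare {

    // Crude offer/poll loop; returns operations per millisecond.
    static long opsPerMs(BlockingQueue<Integer> q, int total) {
        long start = System.nanoTime();
        for (int i = 0; i < total; i++) {
            q.offer(i); // non-blocking insert
            q.poll();   // non-blocking removal
        }
        long elapsedMs = Math.max(1, (System.nanoTime() - start) / 1_000_000);
        return total / elapsedMs;
    }

    public static void main(String[] args) {
        int total = 1_000_000;
        // Capacity 1024 for the array queue is an arbitrary choice for this sketch.
        System.out.println("LinkedBlockingQueue: " + opsPerMs(new LinkedBlockingQueue<>(), total) + "/ms");
        System.out.println("ArrayBlockingQueue : " + opsPerMs(new ArrayBlockingQueue<>(1024), total) + "/ms");
    }
}
```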
Test Results
Performance is measured solely by the number of messages processed per millisecond.
Data Description
Three types of org.apache.http.client.methods.HttpGet requests are used, differing in header and URL length to simulate small, medium, and large message bodies.
Small object example:

```groovy
def get = new HttpGet()
```

Medium object example:

```groovy
def get = new HttpGet(url)
get.addHeader("token", token)
get.addHeader(HttpClientConstant.USER_AGENT)
get.addHeader(HttpClientConstant.CONNECTION)
```

Large object example:

```groovy
def get = new HttpGet(url + token)
get.addHeader("token", token)
get.addHeader("token1", token)
get.addHeader("token5", token)
get.addHeader("token4", token)
get.addHeader("token3", token)
get.addHeader("token2", token)
get.addHeader(HttpClientConstant.USER_AGENT)
get.addHeader(HttpClientConstant.CONNECTION)
```

Producer
| Object Size | Queue Length (M) | Threads | Rate (/ms) |
|---|---|---|---|
| Small | 1 | 1 | 838 |
| Small | 1 | 5 | 837 |
| Small | 1 | 10 | 823 |
| Small | 5 | 1 | 483 |
| Small | 10 | 1 | 450 |
| Medium | 1 | 1 | 301 |
| Medium | 1 | 5 | 322 |
| Medium | 1 | 10 | 320 |
| Medium | 1 | 20 | 271 |
| Medium | 5 | 1 | Failure |
| Medium | 10 | 1 | Failure |
| Medium | 0.5 | 1 | 351 |
| Medium | 0.5 | 5 | 375 |
| Large | 1 | 1 | 214 |
| Large | 1 | 5 | 240 |
| Large | 1 | 10 | 241 |
| Large | 0.5 | 1 | 209 |
| Large | 0.5 | 5 | 250 |
| Large | 0.5 | 10 | 246 |
| Large | 0.2 | 1 | 217 |
| Large | 0.2 | 5 | 309 |
| Large | 0.2 | 10 | 321 |
| Large | 0.2 | 20 | 243 |
The two Medium failures occurred because wait times grew too long: the run stalled at around 3 million operations.
Conclusions for org.apache.http.client.methods.HttpRequestBase messages:

- Keep the queue length around 100,000.
- Use 5-10 producer threads.
- Keep the message body as small as possible.
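One way to enforce the backlog limit is to give the queue a capacity at construction time and have producers use offer with a timeout, backing off instead of growing the queue indefinitely. A minimal sketch in Java (the capacity and names here are ours, chosen to match the ~100,000 recommendation):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BoundedProducer {

    // Capacity bound keeps the backlog near the recommended level.
    static final LinkedBlockingQueue<String> QUEUE = new LinkedBlockingQueue<>(100_000);

    // offer() with a timeout returns false when the queue stays full,
    // so the caller can drop, retry, or apply back-pressure upstream.
    static boolean tryProduce(String msg, long timeoutMs) {
        try {
            return QUEUE.offer(msg, timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryProduce("hello", 10)); // true while the queue has room
    }
}
```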
Consumer
| Object Size | Queue Length (M) | Threads | Rate (/ms) |
|---|---|---|---|
| Small | 1 | 1 | 1893 |
| Small | 1 | 5 | 1706 |
| Small | 1 | 10 | 1594 |
| Small | 1 | 20 | 1672 |
| Small | 2 | 1 | 2544 |
| Small | 2 | 5 | 2024 |
| Small | 5 | 1 | 3419 |
| Medium | 1 | 1 | 1897 |
| Medium | 1 | 5 | 1485 |
| Medium | 1 | 10 | 1345 |
| Medium | 1 | 20 | 1430 |
| Medium | 2 | 1 | 2971 |
| Medium | 2 | 5 | 1576 |
| Large | 1 | 1 | 1980 |
| Large | 1 | 5 | 1623 |
| Large | 1 | 10 | 1689 |
| Large | 0.5 | 1 | 1136 |
| Large | 0.5 | 5 | 1096 |
| Large | 0.5 | 10 | 1072 |
Conclusions for org.apache.http.client.methods.HttpRequestBase messages:

- A longer queue backlog tends to raise consumer throughput.
- Fewer consumer threads perform better.
- Keep the message body as small as possible.

The main difference from the producer side: with less lock contention and a large backlog of messages available, consumers run faster.
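LinkedBlockingQueue internally uses separate put and take locks, which is part of why consumers see less contention than producers. A consumer can cut lock traffic further by draining elements in batches with drainTo rather than polling one at a time; the sketch below (class and method names are ours) shows the pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

public class BatchConsumer {

    // drainTo moves up to maxBatch elements under a single take-lock
    // acquisition, instead of locking once per element with poll().
    static List<Integer> drainBatch(LinkedBlockingQueue<Integer> q, int maxBatch) {
        List<Integer> batch = new ArrayList<>(maxBatch);
        q.drainTo(batch, maxBatch);
        return batch;
    }

    public static void main(String[] args) {
        LinkedBlockingQueue<Integer> q = new LinkedBlockingQueue<>();
        for (int i = 0; i < 10; i++) q.offer(i);
        System.out.println(drainBatch(q, 4)); // prints [0, 1, 2, 3]
    }
}
```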
Producer & Consumer Combined
Thread count refers to the number of producers or consumers; the total thread count is twice this number.
| Object Size | Runs (M) | Threads | Queue Length (M) | Rate (/ms) |
|---|---|---|---|---|
| Small | 1 | 1 | 0.1 | 1326 |
| Small | 1 | 1 | 0.2 | 1050 |
| Small | 1 | 1 | 0.5 | 1054 |
| Small | 1 | 5 | 0.1 | 1091 |
| Small | 1 | 10 | 0.1 | 1128 |
| Small | 2 | 1 | 0.1 | 1798 |
| Small | 2 | 1 | 0.2 | 1122 |
| Small | 2 | 5 | 0.2 | 946 |
| Small | 5 | 5 | 0.1 | 1079 |
| Small | 5 | 10 | 0.1 | 1179 |
| Medium | 1 | 1 | 0.1 | 632 |
| Medium | 1 | 1 | 0.2 | 664 |
| Medium | 1 | 5 | 0.2 | 718 |
| Medium | 1 | 10 | 0.2 | 683 |
| Medium | 2 | 1 | 0.2 | 675 |
| Medium | 2 | 5 | 0.2 | 735 |
| Medium | 2 | 10 | 0.2 | 788 |
| Medium | 2 | 15 | 0.2 | 828 |
| Large | 1 | 1 | 0.1 | 505 |
| Large | 1 | 1 | 0.2 | 558 |
| Large | 1 | 5 | 0.2 | 609 |
| Large | 1 | 10 | 0.2 | 496 |
| Large | 2 | 1 | 0.2 | 523 |
| Large | 2 | 5 | 0.2 | 759 |
| Large | 2 | 10 | 0.2 | 668 |
Test Cases (Groovy)
The test cases are written in Groovy, using a custom asynchronous helper fun and closures to simplify the multithreaded code. Below are three representative scenarios.
Producer Scenario
```groovy
package com.funtest.groovytest

import com.funtester.config.HttpClientConstant
import com.funtester.frame.SourceCode
import com.funtester.utils.CountUtil
import com.funtester.utils.Time
import org.apache.http.client.methods.HttpGet
import org.apache.http.client.methods.HttpRequestBase

import java.util.concurrent.CountDownLatch
import java.util.concurrent.CyclicBarrier
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.atomic.AtomicInteger

class QueueT extends SourceCode {

    static AtomicInteger index = new AtomicInteger(0)
    static int total = 100_0000
    static int size = 10
    static int threadNum = 1
    static int piece = total / size
    static def url = "http://localhost:12345/funtester"
    static def token = "FunTesterFunTesterFunTesterFunTesterFunTesterFunTesterFunTester"

    public static void main(String[] args) {
        LinkedBlockingQueue<HttpRequestBase> linkedQ = new LinkedBlockingQueue<>()
        def ts = [] // per-slice costs, collected for the summary statistics
        def start = Time.getTimeStamp()
        def latch = new CountDownLatch(threadNum)
        def barrier = new CyclicBarrier(threadNum + 1)
        def funtester = {
            fun {
                barrier.await()
                while (true) {
                    if (index.getAndIncrement() % piece == 0) {
                        def l = Time.getTimeStamp() - start
                        ts << l
                        output("${formatLong(index.get())} add cost ${formatLong(l)}")
                        start = Time.getTimeStamp()
                    }
                    if (index.get() > total) break
                    def get = new HttpGet(url)
                    get.addHeader("token", token)
                    get.addHeader(HttpClientConstant.USER_AGENT)
                    get.addHeader(HttpClientConstant.CONNECTION)
                    linkedQ.put(get)
                }
                latch.countDown()
            }
        }
        threadNum.times { funtester() }
        def st = Time.getTimeStamp()
        barrier.await()
        latch.await()
        def et = Time.getTimeStamp()
        outRGB("Rate per ms ${total / (et - st)}")
        outRGB(CountUtil.index(ts).toString())
    }
}
```

Consumer Scenario
```groovy
package com.funtest.groovytest

import com.funtester.config.HttpClientConstant
import com.funtester.frame.SourceCode
import com.funtester.utils.CountUtil
import com.funtester.utils.Time
import org.apache.http.client.methods.HttpGet
import org.apache.http.client.methods.HttpRequestBase

import java.util.concurrent.CountDownLatch
import java.util.concurrent.CyclicBarrier
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger

class QueueTconsume extends SourceCode {

    static AtomicInteger index = new AtomicInteger(1)
    static int total = 100_0000
    static int size = 10
    static int threadNum = 5
    static int piece = total / size
    static def url = "http://localhost:12345/funtester"
    static def token = "FunTesterFunTesterFunTesterFunTesterFunTesterFunTesterFunTester"

    public static void main(String[] args) {
        LinkedBlockingQueue<HttpRequestBase> linkedQ = new LinkedBlockingQueue<>()
        def ts = [] // per-slice costs, collected for the summary statistics
        def pwait = new CountDownLatch(10)
        def produces = {
            fun {
                while (true) {
                    if (linkedQ.size() > total) break
                    def get = new HttpGet(url)
                    get.addHeader("token", token)
                    get.addHeader(HttpClientConstant.USER_AGENT)
                    get.addHeader(HttpClientConstant.CONNECTION)
                    linkedQ.add(get)
                }
                pwait.countDown()
            }
        }
        10.times { produces() }
        pwait.await()
        outRGB("Data prepared! ${linkedQ.size()}")
        def start = Time.getTimeStamp()
        def barrier = new CyclicBarrier(threadNum + 1)
        def latch = new CountDownLatch(threadNum)
        def funtester = {
            fun {
                barrier.await()
                while (true) {
                    if (index.getAndIncrement() % piece == 0) {
                        def l = Time.getTimeStamp() - start
                        ts << l
                        output("${formatLong(index.get())} consume cost ${formatLong(l)}")
                        start = Time.getTimeStamp()
                    }
                    def poll = linkedQ.poll(100, TimeUnit.MILLISECONDS)
                    if (poll == null) break
                }
                latch.countDown()
            }
        }
        threadNum.times { funtester() }
        def st = Time.getTimeStamp()
        barrier.await()
        latch.await()
        def et = Time.getTimeStamp()
        outRGB("Rate per ms ${total / (et - st)}")
        outRGB(CountUtil.index(ts).toString())
    }
}
```

Producer & Consumer Combined Scenario
This test pre‑fills the queue to a specified initial length before running producer and consumer threads.
```groovy
package com.funtest.groovytest

import com.funtester.config.HttpClientConstant
import com.funtester.frame.SourceCode
import com.funtester.utils.Time
import org.apache.http.client.methods.HttpGet
import org.apache.http.client.methods.HttpRequestBase

import java.util.concurrent.CountDownLatch
import java.util.concurrent.CyclicBarrier
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger

class QueueBoth extends SourceCode {

    static AtomicInteger index = new AtomicInteger(1)
    static int total = 500_0000
    static int length = 50_0000
    static int threadNum = 5
    static def url = "http://localhost:12345/funtester"
    static def token = "FunTesterFunTesterFunTesterFunTesterFunTesterFunTesterFunTester"

    public static void main(String[] args) {
        LinkedBlockingQueue<HttpRequestBase> linkedQ = new LinkedBlockingQueue<>()
        def latch = new CountDownLatch(threadNum * 2)
        def barrier = new CyclicBarrier(threadNum * 2 + 1)
        def funtester = { f ->
            fun {
                barrier.await()
                while (true) {
                    if (index.getAndIncrement() > total) break
                    f()
                }
                latch.countDown()
            }
        }
        def produces = {
            def get = new HttpGet(url)
            get.addHeader("token", token)
            get.addHeader(HttpClientConstant.USER_AGENT)
            get.addHeader(HttpClientConstant.CONNECTION)
            linkedQ.put(get)
        }
        length.times { produces() }
        threadNum.times {
            funtester(produces)
            funtester { linkedQ.poll(100, TimeUnit.MILLISECONDS) }
        }
        def st = Time.getTimeStamp()
        barrier.await()
        latch.await()
        def et = Time.getTimeStamp()
        outRGB("Rate per ms ${total / (et - st) / 2}")
    }
}
```

Additional Observations
The performance of LinkedBlockingQueue is highly unstable: logs show that maximum latency can be ten to twenty times the minimum once the queue length reaches one million. Reducing the queue length to 500,000 mitigates this instability, so keep the queue as short as possible.
Benchmark Summary
Using the FunTester framework, the following rates (operations per millisecond) were observed for different object sizes and thread counts:
| Test Object | Threads | Count (M) | Rate (/ms) |
|---|---|---|---|
| Small | 1 | 1 | 5681 |
| Small | 5 | 1 | 8010 |
| Small | 5 | 5 | 15105 |
| Medium | 1 | 1 | 1287 |
| Medium | 5 | 1 | 2329 |
| Medium | 5 | 5 | 4176 |
| Large | 1 | 1 | 807 |
| Large | 5 | 1 | 2084 |
| Large | 5 | 5 | 3185 |
The test cases used Groovy code similar to the snippets above, with configurable thread numbers, total request counts, and message sizes.
Have Fun ~ Tester !