Performance Evaluation of Dubbo 3.0 Address Push: Interface vs Application-Level Service Discovery
This article presents a performance comparison of Dubbo 2, Dubbo 3 interface‑level, and Dubbo 3 application‑level address discovery models under a million‑instance scenario, showing significant reductions in memory usage and GC frequency for the newer models.
1 Abstract
This article reports a performance test of the next-generation microservice framework Dubbo 3.0, focusing on its address push pipeline. It quantitatively compares Dubbo 2 interface-level discovery, Dubbo 3 interface-level discovery, and Dubbo 3 application-level discovery, showing that Dubbo 3 reduces resident memory by more than 50% compared with Dubbo 2, and that the application-level model cuts memory usage by a further ~40% while almost eliminating incremental memory allocation during instance scaling.
2 Background Introduction
2.1 Overview of Dubbo 3.0
Dubbo 3.0 is a fusion product of Alibaba's internal HSF framework and the open‑source Dubbo, upgraded to a cloud‑native architecture and intended to become the primary middleware for both internal and community users.
It was created to support Alibaba's overall migration to cloud services, consolidating development effort, improving product quality, and enabling seamless interaction between HSF and Dubbo ecosystems.
2.2 Performance Test and Comparison of Different Dubbo Versions
The test focuses on the address push path of the service framework (highlighted in orange in the original architecture diagram), comparing consumer-side behavior across Dubbo 2, Dubbo 3 interface-level, and Dubbo 3 application-level models when pushing addresses at the million-instance scale.
Test scenarios:
Dubbo 2 – baseline reference.
Dubbo 3 interface‑level discovery (same model as Dubbo 2).
Dubbo 3 application‑level discovery (new cloud‑native model).
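The two Dubbo 3 models above are selected through configuration rather than code changes. A minimal sketch of the relevant settings follows; the property names reflect Dubbo 3's dual-registration and migration options, but treat the exact keys and values as an assumption to verify against your Dubbo version's documentation:

```properties
# Provider side: choose what gets registered.
#   interface – Dubbo 2-compatible interface-level registration only
#   instance  – application-level (cloud-native) registration only
#   all       – register both forms, useful while migrating a cluster
dubbo.application.register-mode=all

# Consumer side: choose which address model to consume.
#   FORCE_INTERFACE | APPLICATION_FIRST | FORCE_APPLICATION
dubbo.application.service-discovery.migration=APPLICATION_FIRST
```

Registering both forms during migration lets old Dubbo 2 consumers and new application-level consumers coexist against the same providers.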
Test Environment and Method
Test data: a simulated push of 2.2 million (interface-level) cluster instance addresses, i.e., a single consumer process subscribes to 2.2 M addresses.
Environment: a Linux machine with an 8-core CPU and 16 GB of RAM; the JVM heap is set to 10 GB.
Method: the consumer process subscribes to 700 interfaces; ConfigServer acts as the registry, continuously simulating address changes for over an hour while GC and memory metrics are collected.
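The GC and memory metrics mentioned above can be sampled from inside the consumer process. A minimal sketch using the JDK's standard management beans is shown below; this is generic JMX instrumentation, not the harness the authors used, and the sampling loop is illustrative:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class GcSampler {
    public static void main(String[] args) throws InterruptedException {
        // Poll cumulative GC counts/time and heap usage once per second,
        // similar to what `jstat -gcutil <pid> 1000` reports externally.
        for (int i = 0; i < 3; i++) {
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                // getCollectionCount/-Time are cumulative since JVM start;
                // diffing successive samples gives per-interval GC activity.
                System.out.printf("%s: count=%d timeMs=%d%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            System.out.printf("heap used: %d MB%n", heap.getUsed() / (1024 * 1024));
            Thread.sleep(1000);
        }
    }
}
```

Diffing successive samples during the one-hour push run yields the GC frequency and incremental memory curves discussed in the next section.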
3 Optimization Results and Comparison
3.1 GC Time and Distribution
Images show GC behavior for each model (Dubbo 2 interface, Dubbo 3 interface, Dubbo 3 application).
3.2 Incremental Memory Allocation
Images illustrate memory allocation trends for the three models, highlighting the reduction achieved by Dubbo 3.
3.3 Old Generation (OLD) Space and Resident Memory
Images compare OLD space usage and resident memory across the models, demonstrating that the application‑level model reduces resident memory by nearly 40% compared with the interface‑level model.
3.4 Consumer Load
Images depict consumer load metrics for Dubbo 3 interface‑level and application‑level models.
4 Detailed Comparison and Analysis
4.1 Dubbo 2 Interface Model vs Dubbo 3 Interface Model
Under 2 M address scale, Dubbo 2 quickly exhausts heap memory and triggers frequent GC, causing the process to become unresponsive. Optimized Dubbo 3 experiences only three Full GCs in one hour and reduces unreleased memory by about 1.7 GB.
4.2 Dubbo 3 Interface Model vs Dubbo 3 Application Model
Switching to the application‑level discovery model yields further resource reductions: incremental memory growth during instance scaling is minimal, and resident memory drops another ~40% to around 900 MB.
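The reduction follows from the data model: interface-level discovery pushes one address record per (interface, instance) pair, while application-level discovery pushes one record per instance. A back-of-the-envelope sketch with hypothetical cluster numbers (the 700-interface figure comes from the test setup; the application and instance counts are invented purely for illustration):

```java
public class AddressScale {
    public static void main(String[] args) {
        int interfaces = 700;      // interfaces subscribed by one consumer (from the test setup)
        int providerApps = 100;    // hypothetical number of provider applications
        int instancesPerApp = 50;  // hypothetical instances per application

        // Interface-level: each interface carries the full instance list of the
        // application exposing it (assuming one owning application per interface).
        long interfaceLevelRecords = (long) interfaces * instancesPerApp;       // 35,000

        // Application-level: one record per instance, independent of how many
        // interfaces that instance exposes.
        long applicationLevelRecords = (long) providerApps * instancesPerApp;   // 5,000

        System.out.println("interface-level records:   " + interfaceLevelRecords);
        System.out.println("application-level records: " + applicationLevelRecords);
    }
}
```

With these assumed numbers the application-level model holds 7x fewer address records, and the gap widens as more interfaces are packed into each application, which is consistent with the near-flat incremental memory curve observed during instance scaling.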
Future work includes optimizing metadata and URL object reuse in the application‑level implementation.
5 Conclusion
Dubbo 3.0 has successfully merged Dubbo and HSF, with cloud‑native features progressing rapidly. It proved stable during Alibaba’s recent Double‑11 shopping festival and is being piloted in other e‑commerce services. Ongoing efforts will focus on finalizing the application‑level service discovery, new governance rules, the next‑generation Triple protocol, and meeting the non‑functional goals of resource usage, performance, and cluster scalability.
The address‑push performance test validates the substantial resource savings of the application‑level model, reinforcing confidence in its scalability for future large‑scale clusters.