Beyond Sharding: Unitization as a Solution to Unlimited Service Scaling

The article examines why traditional sharding and database partitioning cannot alone achieve limitless scaling, explains the problem of excessive RPC‑to‑DB connections, and proposes a unitization approach that limits each service to a single database shard to enable true horizontal expansion.

IT Xianyu

Sep 11, 2020

Introduction

As a newcomer, I mostly wondered about JDK APIs, NIO, and the JVM. After a few years of work, I started questioning service availability and scalability instead, especially the classic problem of scaling services out.

Typical Service Evolution Path

Let’s start from the beginning.

Monolithic applications – most startups begin with frameworks like SSM or SSH; everyone has experienced this.

RPC‑based applications – when business grows, horizontal scaling becomes necessary; scaling is simple as long as services remain stateless (see diagram).

As the business expands further, the relationships between services become complex. Many services need only the cache, not the database, so they can be split out into their own tier, which saves scarce database connections (see diagram).

Most companies reach this stage, and Dubbo was created to address it.

If a product becomes popular, data volume grows, SQL operations slow down, and the database becomes the bottleneck. The usual response is to shard or partition the data, by ID hash or by ID range (see diagram).
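As a rough illustration of the two routing rules mentioned above, here is a minimal Python sketch. The function names, the shard count, and the range size are assumptions made for the example; they are not taken from any particular middleware.

```python
def shard_by_hash(user_id: int, shard_count: int) -> int:
    """Hash-style routing: spread IDs evenly across shards using modulo."""
    return user_id % shard_count


def shard_by_range(user_id: int, range_size: int) -> int:
    """Range-style routing: each shard owns a contiguous block of IDs."""
    return user_id // range_size


if __name__ == "__main__":
    # Hypothetical setup: 4 shards for hashing, 1,000,000 IDs per range shard.
    print(shard_by_hash(1234, 4))
    print(shard_by_range(2_500_000, 1_000_000))
```

Hash routing spreads load evenly but makes resharding harder; range routing makes it easy to add a shard for new IDs but can concentrate traffic on the newest range.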

At this point it seems the problem is solved: just keep adding more database instances and application instances.

But does sharding truly enable unlimited scaling?

In reality, the architecture shown above does not solve the core issue.

The real problem, much like the one RPC frameworks themselves run into, is too many connections: in this case, database connections.

Typically, RPC applications reach their databases through middleware, so the application code does not know which physical database a query will hit; the middleware (Sharding-JDBC, for example) decides according to its routing rules. Consequently, every RPC instance must hold connections to all databases. Suppose an RPC app connects to three MySQL instances, there are 30 instances of the app, and each keeps a connection pool of 8 per database. Each MySQL instance then sees 30 × 8 = 240 connections, which already exceeds MySQL's default limit (100 in older releases, 151 in current ones). The limit can be raised, but only up to a hard maximum, given here as 16384. With a pool of 8, that ceiling is reached at 16384 / 8 = 2048 app instances; beyond that, further scaling is impossible.
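The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the figures (30 instances, pool size 8, a ceiling of 16384) are the ones quoted in the text.

```python
def connections_per_database(app_instances: int, pool_size: int) -> int:
    """Every app instance keeps a full pool open to every database,
    so each database sees instances * pool_size connections."""
    return app_instances * pool_size


def max_app_instances(connection_limit: int, pool_size: int) -> int:
    """Largest number of app instances a single database can serve
    before its connection limit is exhausted."""
    return connection_limit // pool_size


if __name__ == "__main__":
    print(connections_per_database(30, 8))   # 240
    print(max_app_instances(16384, 8))       # 2048
```

Note that the number of databases does not appear in either formula: adding shards leaves the per-database connection count unchanged.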

Note that 2048 is not as large as it looks: one physical database usually hosts many logical databases, and with microservices booming, the number of application instances sharing it grows quickly.

Adding a proxy in front does not solve the problem either, because the proxy itself has a connection limit (also 16384). If concurrent connections exceed that, the proxy becomes a bottleneck.

What can we do? Look again at the architecture diagram (see image).

The issue is that “every RPC app connects to every database,” so scaling the application also scales the total number of database connections, and adding more databases does not alleviate the connection‑count problem.

Unitization

Unitization sounds fancy and is often mentioned at conferences alongside terms like “two sites, three data centers” or “multi-site active-active.”

Here we focus solely on the “too many database connections” problem.

The idea is simple: prevent applications from connecting to all databases.

Assume we partition data into 10 databases and have 10 applications, each connecting to one database. When the number of applications grows to 20, we split the 10 databases into 20, ensuring each app still connects to only one database. Thus, regardless of how many applications are added, the connection‑count issue is resolved.
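To make the contrast concrete, the sketch below compares per-database connection counts in the "every app connects to every database" layout with the unitized layout. It also illustrates one convenient property of splitting shards when routing is by `user_id % shard_count`: after doubling the shard count, a key either stays in its old shard or moves to exactly one new shard. The function names and the modulo routing rule are assumptions for illustration; the article does not prescribe a specific split strategy.

```python
import math


def full_mesh_connections_per_db(app_instances: int, pool_size: int) -> int:
    """Every app connects to every database."""
    return app_instances * pool_size


def unitized_connections_per_db(app_instances: int, db_count: int,
                                pool_size: int) -> int:
    """Each app connects to exactly one database; apps are spread evenly."""
    apps_per_db = math.ceil(app_instances / db_count)
    return apps_per_db * pool_size


def shard_after_doubling(user_id: int, old_shard_count: int) -> int:
    """New shard index once the shard count is doubled (modulo routing)."""
    return user_id % (old_shard_count * 2)


if __name__ == "__main__":
    pool = 8
    for apps in (10, 20, 40):
        # In the unitized layout the database count grows with the apps.
        print(apps,
              full_mesh_connections_per_db(apps, pool),
              unitized_connections_per_db(apps, db_count=apps, pool_size=pool))
```

Under these assumptions the full-mesh count grows linearly with the number of apps, while the unitized count stays at one pool per database.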

Prerequisite: every request must be handled by the application whose assigned database holds that request's data.

In practice, this can be achieved by determining the target database at the entry point, before DNS resolution. For example, a routing rule based on a hash of the user ID can be broadcast through a configuration center, so that all components follow the same rule (see diagram).
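A minimal sketch of such a shared routing rule follows, assuming the rule is simply "stable hash of the user ID, modulo the number of units." The `RoutingRule` class and the hostnames are hypothetical; in a real deployment the rule would be distributed by the configuration center rather than hard-coded.

```python
import zlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RoutingRule:
    """Routing rule shared by every component (hypothetical structure)."""
    unit_hosts: Tuple[str, ...]  # one entry per unit; each unit owns one database

    def unit_for(self, user_id: str) -> int:
        # crc32 is stable across processes and machines, unlike Python's
        # built-in hash() for strings, which is randomized per process.
        return zlib.crc32(user_id.encode("utf-8")) % len(self.unit_hosts)

    def host_for(self, user_id: str) -> str:
        return self.unit_hosts[self.unit_for(user_id)]


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    rule = RoutingRule(unit_hosts=(
        "unit-0.example.internal",
        "unit-1.example.internal",
        "unit-2.example.internal",
    ))
    print(rule.host_for("user-42"))
```

Because the gateway and the applications evaluate the same rule, a request for a given user always lands on the unit whose database holds that user's data.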

With this approach, the unlimited scaling problem is finally addressed.

Conclusion

This article traced the evolution from monolithic to RPC‑based services, demonstrating that sharding alone cannot solve unlimited scaling, and that unitization is required, albeit at the cost of added complexity.

Unitization brings many benefits, but we have not yet discussed single‑point‑of‑failure concerns regarding service availability, as the databases in this scenario remain single points.
