Advantages of NewSQL Databases Over Middleware‑Based Sharding: Architecture, Transactions, HA, and Scaling

This article objectively compares NewSQL distributed databases with traditional middleware‑based sharding. It examines architectural advantages, distributed transaction handling, high‑availability mechanisms, scaling and sharding strategies, storage engine differences, and ecosystem maturity, and offers guidance on selecting the appropriate solution for various workloads.

Architect's Guide

Recently, during technical exchanges with peers, the author was frequently asked how to choose between sharding‑based middleware solutions and NewSQL distributed databases. This article aims to objectively compare the two approaches by analyzing their key characteristics, implementation principles, advantages, disadvantages, and suitable scenarios.

What Makes NewSQL Databases Advanced?

Under the classification in the paper pavlo‑newsql‑sigmodrec, Spanner, TiDB, and OceanBase fall into the first category, new architectures, while middleware solutions such as ShardingSphere, Mycat, and DRDS fall into the second, transparent sharding middleware. Although middleware‑based sharding does distribute data, each statement is parsed and planned twice, once in the middleware and again in the underlying database, which is inefficient. For the purpose of this article, "NewSQL" refers only to the new‑architecture category, which eliminates this redundancy.

Compared with middleware‑based sharding, NewSQL databases offer several architectural benefits:

Traditional databases are disk‑oriented and less efficient at memory‑based storage management and concurrency control.

Middleware repeats SQL parsing and optimizer work, reducing efficiency.

NewSQL distributed transactions are optimized beyond classic XA, delivering higher performance.

NewSQL stores data using Paxos or Raft multi‑replica protocols, achieving true high availability (RTO < 30 s, RPO = 0).

Built‑in automatic sharding, migration, and scaling reduce DBA workload and are transparent to applications.

The following sections discuss each of these points in detail.

Distributed Transactions

Distributed transactions are a double‑edged sword.

CAP Limitation

Early NoSQL systems avoided distributed transactions because of the CAP theorem, which forces a trade‑off among consistency, availability, and partition tolerance. NewSQL systems such as Google Spanner claim to be "effectively CA": the system still chooses consistency when a partition occurs, but partitions are rare because it runs on a private global network and is backed by a highly capable operations team.

In distributed systems you can know where work is done or when it finishes, but you cannot know both simultaneously; two‑phase commit is fundamentally an anti‑availability protocol.

Completeness

Two‑phase commit (2PC) does not guarantee strict ACID under all failure scenarios; recovery mechanisms are needed to achieve eventual consistency. Industry feedback indicates that many NewSQL products have incomplete distributed‑transaction support, leading to varying levels of reliability.
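To make the failure window concrete, here is a minimal in-memory sketch of two-phase commit. The names `Participant`, `two_phase_commit`, `recover`, and the `crash_after_prepare` flag are hypothetical and chosen for illustration; they are not any product's API. Under these assumptions, the sketch shows that if the coordinator fails after every participant has voted yes, the participants stay in the prepared (in-doubt) state until a recovery step resolves them.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    PREPARED = "prepared"    # voted yes, resources held, outcome unknown
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = State.INIT

    def prepare(self):
        # Phase 1: record the intent and vote.
        self.state = State.PREPARED if self.can_commit else State.ABORTED
        return self.can_commit

    def commit(self):
        self.state = State.COMMITTED

    def abort(self):
        self.state = State.ABORTED


def two_phase_commit(participants, crash_after_prepare=False):
    """Return the decision, or None if the coordinator 'crashed' mid-protocol."""
    votes = [p.prepare() for p in participants]
    if crash_after_prepare:
        return None  # decision never delivered; participants stay in doubt
    decision = all(votes)
    for p in participants:
        # Phase 2: deliver the same decision to everyone.
        if decision:
            p.commit()
        else:
            p.abort()
    return decision


def recover(participants, logged_decision):
    """Resolve in-doubt participants from the coordinator's logged decision."""
    for p in participants:
        if p.state is State.PREPARED:
            if logged_decision:
                p.commit()
            else:
                p.abort()
```

While participants sit in the prepared state they cannot unilaterally commit or abort, which is why the protocol needs a durable decision log and a recovery path to be usable in production.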

Performance

Traditional databases use XA, which incurs high network overhead and long blocking times. NewSQL implementations often adopt Google Percolator‑style models with Timestamp Oracle, MVCC, and Snapshot Isolation, reducing lock contention and allowing parts of the commit to be asynchronous, thus improving performance over XA.

Snapshot Isolation (SI) in this model is optimistic: write conflicts are detected only at commit time, so hotspot workloads can suffer many aborts. Its isolation level also differs from true Repeatable Read.
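The following is a simplified, single-process sketch of snapshot isolation with a timestamp oracle and MVCC. It is not Percolator itself, which also uses per-key locks and a primary-key commit protocol across nodes; the class names (`TimestampOracle`, `MVCCStore`, `Txn`, `ConflictError`) are assumptions made for illustration. It shows how two transactions that update the same hot key from the same snapshot lead to one of them aborting.

```python
import itertools


class TimestampOracle:
    """Hands out strictly increasing timestamps (stand-in for a TSO service)."""

    def __init__(self):
        self._counter = itertools.count(1)

    def get_ts(self):
        return next(self._counter)


class ConflictError(Exception):
    pass


class MVCCStore:
    def __init__(self):
        self.tso = TimestampOracle()
        self.versions = {}  # key -> list of (commit_ts, value), ascending

    def begin(self):
        return Txn(self, self.tso.get_ts())

    def read(self, key, ts):
        # Return the newest version committed at or before ts.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None


class Txn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read(key, self.start_ts)

    def put(self, key, value):
        self.writes[key] = value  # buffered until commit

    def commit(self):
        # First committer wins: abort if a newer version of any key we
        # wrote was committed after our snapshot was taken.
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.start_ts:
                raise ConflictError(f"write-write conflict on {key!r}")
        commit_ts = self.store.tso.get_ts()
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts
```

Readers never block writers here because each read is served from a snapshot, but the cost moves to commit time: on a hot key, every concurrent writer except the first must retry.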

Nevertheless, the extra steps in 2PC—global ID acquisition, network round‑trips, and log persistence—still impose noticeable latency, especially when many nodes participate.

[Figure: Spanner distributed‑transaction benchmark results]

While NewSQL products promote full transaction support, best practice still recommends minimizing distributed transactions when possible.

For high‑throughput OLTP workloads, flexible (BASE) transaction models such as Saga, TCC, or reliable messaging are often more appropriate than strict ACID.
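As an illustration of the flexible-transaction idea, here is a minimal Saga sketch: each step pairs an action with a compensation, and a failure triggers the compensations of the completed steps in reverse order. The function `run_saga` and the transfer example are hypothetical; a real implementation would also persist progress so that compensation survives a crash.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    On failure, undo the already completed steps in reverse order
    and report that the saga did not complete.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


# Hypothetical transfer: the debit succeeds, the credit fails.
balances = {"alice": 100, "bob": 0}


def debit():
    balances["alice"] -= 30


def refund():
    balances["alice"] += 30


def credit_fails():
    raise RuntimeError("bob's shard is unavailable")


ok = run_saga([(debit, refund), (credit_fails, lambda: None)])
```

Between the debit and the refund, other readers can observe the intermediate balance. That lack of isolation is the trade-off accepted in exchange for avoiding a blocking commit protocol.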

Beyond 2PC, other approaches exist for distributed transaction handling (see "its‑time‑to‑move‑on‑from‑two‑phase").

HA and Multi‑Active Deployments

Master‑slave replication, even with semi‑synchronous mode, can lose data under extreme conditions. Modern NewSQL databases adopt Paxos or Raft multi‑replica protocols, providing automatic leader election, fast failover, and high reliability.
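The core idea behind majority-based replication can be sketched in a few lines. This is deliberately not Raft or Paxos: terms, leader election, and log repair are omitted, and the `Replica` and `replicate` names are assumptions for illustration. It only shows the quorum rule, where a write is acknowledged once more than half of the replicas have stored it.

```python
class Replica:
    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.log = []

    def append(self, entry):
        if not self.up:
            return False  # an unreachable replica never acknowledges
        self.log.append(entry)
        return True


def replicate(replicas, entry):
    """Acknowledge a write only after a majority of replicas have stored it."""
    acks = sum(1 for r in replicas if r.append(entry))
    quorum = len(replicas) // 2 + 1
    return acks >= quorum


# Three replicas tolerate one failure; losing a second one blocks writes.
cluster = [Replica("a"), Replica("b"), Replica("c", up=False)]
first = replicate(cluster, "x=1")   # two of three acknowledge
cluster[1].up = False
second = replicate(cluster, "x=2")  # only one of three acknowledges
```

Because any two majorities overlap in at least one replica, every acknowledged write survives on some member of the next majority, which is the basis for the RPO = 0 claim. A write that misses quorum is never acknowledged to the client; real protocols also clean up such uncommitted entries.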

Implementing production‑grade consensus algorithms requires careful engineering, including batching and asynchronous techniques to reduce network and I/O overhead.

Geographically distributed active‑active setups are limited by network latency; high latency can make true multi‑active OLTP impractical.

One practical approach used by Ant Group involves application‑level dual‑write via MQ, caching transaction data in a distributed cache, and managing a temporary blacklist during synchronization.

Scale, Horizontal Expansion, and Sharding Mechanism

Paxos solves HA but not scaling; therefore built‑in sharding is essential. NewSQL databases automatically split data regions as they grow (e.g., TiDB splits a Region when it reaches a size threshold, cited here as 64 MiB; the default varies by version) and migrate data without application changes.
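A rough sketch of size-triggered range splitting follows. It is not TiDB's implementation: region size is measured in number of keys rather than bytes, keys are assumed unique, and the `Region` and `RangeSharded` classes are hypothetical names. It shows how a range-partitioned store can split a region at its midpoint once the region exceeds a threshold, with no change on the caller's side.

```python
import bisect


class Region:
    def __init__(self, keys=None):
        self.keys = sorted(keys or [])  # stand-in for the region's data

    def size(self):
        return len(self.keys)


class RangeSharded:
    def __init__(self, max_region_size):
        self.max_region_size = max_region_size
        self.regions = [Region()]  # kept ordered by key range

    def _locate(self, key):
        # Index of the last region whose first key is <= key (else region 0).
        starts = [r.keys[0] for r in self.regions[1:]]
        return bisect.bisect_right(starts, key)

    def put(self, key):
        idx = self._locate(key)
        region = self.regions[idx]
        bisect.insort(region.keys, key)
        if region.size() > self.max_region_size:
            # Split at the midpoint and replace the region with two halves.
            mid = region.size() // 2
            left, right = Region(region.keys[:mid]), Region(region.keys[mid:])
            self.regions[idx:idx + 1] = [left, right]


store = RangeSharded(max_region_size=4)
for k in range(10):
    store.put(k)
```

In a real system the new halves would then be candidates for rebalancing onto other nodes; the sketch stops at the split itself.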

Sharding‑based systems can also achieve online scaling through asynchronous replication, read‑only phases, and coordinated routing switches.

However, a uniform sharding strategy may not align with domain models, leading to cross‑shard transactions in certain workloads (e.g., banking).
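The banking case can be illustrated with a small routing sketch. It assumes accounts are sharded by `account_id % shard_count`, which is only one possible strategy, and the function names are hypothetical. Under that assumption, a transfer between two arbitrary accounts usually touches two shards and therefore needs a distributed transaction.

```python
def shard_for(account_id, shard_count):
    """Route an account to a shard by modulo on its numeric ID."""
    return account_id % shard_count


def shards_touched(account_ids, shard_count):
    return {shard_for(a, shard_count) for a in account_ids}


def needs_distributed_txn(account_ids, shard_count):
    """A statement or transaction spanning more than one shard is cross-shard."""
    return len(shards_touched(account_ids, shard_count)) > 1


# Transfer between accounts 1001 and 1002 on 4 shards: shards 1 and 2.
cross = needs_distributed_txn([1001, 1002], shard_count=4)
# Accounts 1001 and 1005 both land on shard 1, so a local transaction suffices.
local = needs_distributed_txn([1001, 1005], shard_count=4)
```

Choosing a sharding key that follows the domain model (for example, keeping a customer's accounts together) reduces how often the cross-shard path is taken, but transfers between unrelated customers cannot be co-located by any static key.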

Distributed SQL Support

NewSQL aims to be a general‑purpose database, supporting full SQL including cross‑shard joins and aggregations. Middleware solutions often lack robust cross‑shard capabilities and may not support stored procedures, views, or foreign keys.
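Cross-shard aggregation is a good example of why the routing layer needs real query-processing logic. The sketch below, with hypothetical function names, computes a global average over several shards by merging per-shard `(sum, count)` pairs. Averaging the per-shard averages would give a wrong answer whenever shards hold different numbers of rows.

```python
def shard_partial(rows):
    """Run on each shard: return (sum, count) instead of a local average."""
    return sum(rows), len(rows)


def global_avg(shards):
    """Merge step: combine partial sums and counts from every shard."""
    partials = [shard_partial(rows) for rows in shards]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


def naive_avg_of_avgs(shards):
    """Incorrect shortcut, shown only for comparison."""
    avgs = [sum(rows) / len(rows) for rows in shards if rows]
    return sum(avgs) / len(avgs) if avgs else None


shards = [[10, 20, 30], [100]]
correct = global_avg(shards)        # (60 + 100) / 4
wrong = naive_avg_of_avgs(shards)   # (20 + 100) / 2
```

Joins, ORDER BY with LIMIT, and DISTINCT need similar rewrite-and-merge handling, which is where many middleware products draw the line on what they support.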

NewSQL engines can generate cost‑based execution plans (CBO) using distributed statistics, whereas middleware typically relies on rule‑based optimization (RBO), limiting complex query support.

Middleware reflects a compromise design focused on application compatibility, while NewSQL pursues a comprehensive, high‑performance backend.

Storage Engine

Traditional relational databases use B‑Tree storage optimized for disk access. NewSQL engines often adopt LSM‑tree designs, converting random writes into sequential writes for higher write throughput, at the cost of more complex reads.
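The write and read paths of an LSM-tree can be sketched as follows. This toy version (the `TinyLSM` class is a hypothetical name) keeps sorted runs in memory instead of on disk and omits the write-ahead log, compaction, and Bloom filters that real engines rely on. It shows how writes are buffered and flushed as sorted runs, and why a read may need to consult several runs.

```python
class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.memtable = {}   # in-memory buffer that absorbs writes
        self.sstables = []   # immutable sorted runs, oldest first

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # One sequential write of a sorted run instead of in-place updates.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Check runs from newest to oldest; the first hit is the live value.
        for run in reversed(self.sstables):
            for k, v in run:  # linear scan for brevity
                if k == key:
                    return v
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # memtable is full, flushed as the first run
db.put("a", 3)   # newer version of "a" shadows the flushed one
db.put("c", 4)   # second flush
```

The older value of "a" still exists in the first run until some form of compaction removes it, which is the read and space overhead traded for sequential writes.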

Maturity and Ecosystem

Evaluating distributed databases requires multi‑dimensional testing: development status, community, monitoring tools, DBA talent, SQL compatibility, performance, HA, online DDL, etc. NewSQL products are still maturing, with strong adoption in internet companies but cautious use in regulated industries.

Traditional RDBMS benefit from decades of stability, extensive tooling, and broader talent pools, making them a safer choice for mission‑critical systems.

Other features such as online DDL, data migration, and operational tools are omitted for brevity.

Conclusion

When deciding between NewSQL and middleware‑based sharding, consider the following questions:

Is strong consistency at the database layer mandatory?

Is data growth unpredictable?

Does scaling frequency exceed operational capacity?

Is throughput more important than latency?

Must the solution be completely transparent to applications?

Do you have a DBA team experienced with NewSQL?

If two or three of these are affirmative, NewSQL may be worth the learning curve. Otherwise, sharding with middleware remains a lower‑risk, lower‑cost alternative.

Both approaches have trade‑offs; NewSQL is not a silver bullet, and traditional sharding continues to be a reliable choice for many enterprises.

Many software selection decisions depend on domain characteristics and architect preferences.
