
MySQL Single Table Optimization, Sharding, Partitioning, and Scaling Techniques

When a MySQL table grows large, performance degrades sharply, so this guide explains single‑table tuning, proper field choices, index strategies, query best practices, engine differences, system parameters, hardware upgrades, read‑write splitting, caching layers, table partitioning, vertical and horizontal sharding, and how to choose suitable sharding solutions.

Java Architect Essentials

Single Table Optimization

Unless a table is expected to keep growing, avoid splitting it early, because splitting adds logical, deployment, and operational complexity. As a rule of thumb, a table made up mostly of integer columns runs without major issues below about ten million rows, and a string-heavy table below about five hundred thousand.

Fields

Prefer TINYINT, SMALLINT, or MEDIUMINT over INT when the value range allows; add UNSIGNED if values are never negative.

Allocate only the necessary length for VARCHAR.

Use enums or integers instead of strings.

Prefer TIMESTAMP over DATETIME when the values fit within TIMESTAMP's range (1970 to 2038).

Keep the number of columns under 20.

Avoid NULL columns as they hinder optimization and waste index space.

Store IP addresses as integers.
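
The field guidelines above can be sketched roughly as follows. The login_log table and its columns are hypothetical names chosen for illustration, not taken from the original; the sketch assumes IPv4 addresses and uses MySQL's built-in INET_ATON() and INET_NTOA() functions for the conversion.

```sql
-- Hypothetical table, used only to illustrate the field guidelines.
CREATE TABLE login_log (
    id         INT UNSIGNED     NOT NULL AUTO_INCREMENT,
    user_id    INT UNSIGNED     NOT NULL,
    status     TINYINT UNSIGNED NOT NULL DEFAULT 0,      -- small code instead of a string
    channel    ENUM('web', 'ios', 'android') NOT NULL,   -- enum instead of VARCHAR
    ip         INT UNSIGNED     NOT NULL,                -- IPv4 stored as a 4-byte integer
    created_at TIMESTAMP        NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
);

-- Convert the dotted-quad string to an integer on write...
INSERT INTO login_log (user_id, status, channel, ip)
VALUES (42, 1, 'web', INET_ATON('192.168.1.10'));

-- ...and back to a readable string on read.
SELECT user_id, INET_NTOA(ip) AS ip_text, created_at
FROM login_log
WHERE ip = INET_ATON('192.168.1.10');
```

An INT UNSIGNED column only covers IPv4; IPv6 addresses need a different representation.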

Indexes

Create indexes only on columns used in WHERE or ORDER BY clauses; verify with EXPLAIN.

Avoid indexing columns that are frequently compared to NULL in WHERE.

Do not index low‑cardinality fields such as gender.

Use prefix indexes for character columns and avoid making them primary keys.

Prefer application-enforced constraints over foreign keys, UNIQUE, or NULL checks.

When using composite indexes, keep column order consistent with query conditions and drop unnecessary single‑column indexes.
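
A minimal sketch of the index guidelines above, assuming a hypothetical users table (table, column, and index names are illustrative and not from the original). It shows a prefix index on a string column, a composite index whose column order matches the query, and an EXPLAIN check.

```sql
-- Hypothetical table, used only to illustrate the index guidelines.
CREATE TABLE users (
    id    INT UNSIGNED     NOT NULL AUTO_INCREMENT,
    email VARCHAR(100)     NOT NULL,
    city  VARCHAR(50)      NOT NULL,
    age   TINYINT UNSIGNED NOT NULL,
    PRIMARY KEY (id)                 -- integer primary key, not a string column
);

-- Prefix index: index only the first 20 characters of the string column.
CREATE INDEX idx_users_email ON users (email(20));

-- Composite index whose column order matches the conditions of the query below.
CREATE INDEX idx_users_city_age ON users (city, age);

-- Verify the choice: inspect the "key" and "rows" columns of the output.
EXPLAIN SELECT id FROM users WHERE city = 'Hangzhou' AND age = 30;
```

With the composite index in place, a separate single-column index on city would be redundant. A query that filters on age alone generally cannot use idx_users_city_age for lookups, which is why column order matters.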

SQL Queries

Enable slow‑query log to locate expensive statements.

Never perform calculations on a column in the WHERE clause: a condition such as WHERE age + 1 = 10 cannot use an index on age, whereas WHERE age = 9 can.

Keep SQL simple; avoid SELECT *, large statements, and functions/triggers in the database.

Replace OR with IN for better performance; limit IN list size to about 200.

Avoid leading-wildcard patterns such as LIKE '%xxx', which cannot use an index, and avoid excessive JOINs.

Compare values of the same type (e.g., string vs string, number vs number).

Avoid != or <> in WHERE where possible, as they usually prevent effective index use.

Use BETWEEN for continuous numeric ranges instead of IN .

Paginate results with LIMIT and keep page size reasonable.
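
Several of the query guidelines above can be sketched as before-and-after rewrites. The orders table below is a hypothetical example (its names are not from the original), and the slow-query threshold of one second is an assumed value.

```sql
-- Hypothetical table, used only to illustrate the query guidelines.
CREATE TABLE orders (
    id      INT UNSIGNED     NOT NULL AUTO_INCREMENT,
    user_id INT UNSIGNED     NOT NULL,
    status  TINYINT UNSIGNED NOT NULL,
    amount  INT UNSIGNED     NOT NULL,
    PRIMARY KEY (id),
    KEY idx_orders_status (status),
    KEY idx_orders_amount (amount)
);

-- Log statements slower than one second (requires a privileged account).
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- Calculation on the column: the index on amount cannot be used for a lookup.
SELECT id FROM orders WHERE amount + 1 = 10;
-- Same result with the calculation moved to the constant side.
SELECT id FROM orders WHERE amount = 9;

-- OR conditions on one column rewritten as IN.
SELECT id FROM orders WHERE status = 1 OR status = 2 OR status = 3;
SELECT id FROM orders WHERE status IN (1, 2, 3);

-- Continuous range: BETWEEN instead of a long IN list.
SELECT id FROM orders WHERE amount BETWEEN 100 AND 105;

-- Pagination with an explicit column list and a bounded page size.
SELECT id, user_id, amount FROM orders ORDER BY id LIMIT 20 OFFSET 40;
```

A large OFFSET still reads and discards the skipped rows, so deep pages are often fetched with a condition such as WHERE id > (last id seen) ORDER BY id LIMIT 20 instead.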

Engines

MySQL mainly uses MyISAM and InnoDB.

MyISAM

No row locking, no transactions, no foreign keys, no crash‑safe recovery.

Supports table‑level locking, full‑text indexing, and delayed index updates.

InnoDB

Row locking with MVCC, supports transactions, foreign keys, and crash‑safe recovery.

No full-text index support before MySQL 5.6; later versions support FULLTEXT indexes on InnoDB tables.

Generally, MyISAM suits SELECT-heavy tables, while InnoDB fits INSERT/UPDATE-heavy workloads.
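
To see which engine each table uses and to switch one over, something like the following should work. The page_views table is a hypothetical example, not from the original.

```sql
-- Hypothetical MyISAM table, used only for illustration.
CREATE TABLE page_views (
    id  INT UNSIGNED NOT NULL AUTO_INCREMENT,
    url VARCHAR(200) NOT NULL,
    PRIMARY KEY (id)
) ENGINE = MyISAM;

-- List the storage engine of every table in the current database.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE();

-- Convert the table to InnoDB. This rebuilds the table, so on large
-- tables it is best run during a maintenance window.
ALTER TABLE page_views ENGINE = InnoDB;
```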

System Tuning Parameters

Common benchmarking tools: sysbench, iibench-mysql, tpcc-mysql. Important parameters include:

back_log: increase from the default of 50 to 500 to allow more pending connections.

wait_timeout: reduce the idle connection timeout from 8 hours to 30 minutes.

max_user_connections: set a reasonable upper limit.

thread_concurrency: set to twice the CPU core count (only honored on old Solaris systems; removed in MySQL 5.7).

skip_name_resolve: disable DNS lookups for client IPs.

key_buffer_size: for the MyISAM index cache, typically 256-384 MiB on a 4 GiB system.

innodb_buffer_pool_size: has the largest impact on InnoDB performance; monitor Innodb_buffer_pool_read_requests against Innodb_buffer_pool_reads to judge the hit rate.

innodb_log_buffer_size: usually keep below 32 MiB.

query_cache_size: adjust based on hit rate; 256 MiB is often sufficient (the query cache was removed in MySQL 8.0).

read_buffer_size, sort_buffer_size, read_rnd_buffer_size, record_buffer, thread_cache_size, table_cache: tune according to workload.
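
The parameters above can be inspected and adjusted from a SQL session, as in this rough sketch. The hit-rate formula, (read_requests - reads) / read_requests, is the commonly used one rather than something stated in the original, and the last query assumes MySQL 5.7 or later, where status counters are exposed in performance_schema.global_status.

```sql
-- Current values of a few of the parameters discussed above.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('back_log', 'wait_timeout', 'innodb_buffer_pool_size');

-- Raw counters behind the buffer pool hit rate.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- Hit rate in percent: (read_requests - reads) / read_requests * 100.
-- Assumes MySQL 5.7+ (performance_schema.global_status).
SELECT 100 * (1 - r.VARIABLE_VALUE / q.VARIABLE_VALUE) AS buffer_pool_hit_pct
FROM performance_schema.global_status AS r
CROSS JOIN performance_schema.global_status AS q
WHERE r.VARIABLE_NAME = 'Innodb_buffer_pool_reads'
  AND q.VARIABLE_NAME = 'Innodb_buffer_pool_read_requests';

-- Shorten the idle timeout to 30 minutes; applies to new connections.
SET GLOBAL wait_timeout = 1800;
```

Values changed with SET GLOBAL are lost on restart unless they are also written to the server configuration file.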

Hardware Upgrade

Scale up by increasing CPU, memory, or switching to SSDs, depending on whether MySQL is CPU‑ or I/O‑bound.

Read‑Write Splitting

Use a master for writes and replicas for reads; avoid multi‑master setups to reduce complexity.
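
On the replica side, a setup like this is often guarded with two statements, sketched here under the assumption of classic asynchronous replication. Newer MySQL releases name the second statement SHOW REPLICA STATUS.

```sql
-- On each replica: reject writes from ordinary (non-privileged) accounts.
SET GLOBAL read_only = ON;

-- On each replica: check replication health before routing reads to it.
-- Look at Slave_IO_Running, Slave_SQL_Running, and Seconds_Behind_Master.
SHOW SLAVE STATUS;
```

Because replicas can lag behind the master, reads that must see a just-written row are usually sent to the master.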

Caching

Caching can be applied at multiple layers:

MySQL internal caches (tuned via system parameters).

Data‑access layer (e.g., MyBatis cache, Hibernate persistence‑object cache).

Application service layer (caching Data Transfer Objects).

Web layer and browser client.

Two common cache write strategies:

Write-Through: update the cache and the database together (simple, consistent).

Write-Back: update the cache first and flush to the database asynchronously (higher throughput, more complex).

Table Partitioning

MySQL has supported horizontal partitioning (RANGE, LIST, HASH, KEY) since version 5.1. Partitioning is transparent to applications, but indexes are maintained per partition; there is no global index.

Benefits include larger logical tables, easier maintenance, faster queries that hit few partitions, and the ability to place partitions on different storage devices.

Limitations: a maximum of 1024 partitions (8192 from MySQL 5.6.7 onward), primary and unique keys must include the partitioning columns, foreign keys are not supported, NULL values defeat partition pruning, and all partitions must use the same engine.

Example of RANGE partitioning by year:

CREATE TABLE members (
    firstname VARCHAR(25) NOT NULL,
    lastname VARCHAR(25) NOT NULL,
    username VARCHAR(16) NOT NULL,
    email VARCHAR(35),
    joined DATE NOT NULL
)
PARTITION BY RANGE ( YEAR(joined) ) (
    PARTITION p0 VALUES LESS THAN (1960),
    PARTITION p1 VALUES LESS THAN (1970),
    PARTITION p2 VALUES LESS THAN (1980),
    PARTITION p3 VALUES LESS THAN (1990),
    PARTITION p4 VALUES LESS THAN MAXVALUE
);
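
Building on the members table above, the following sketch shows one way to check partition pruning and to maintain the partitions. The queries are illustrative additions, not part of the original example.

```sql
-- Ask the optimizer which partitions a query touches. On MySQL 5.7 and
-- later the EXPLAIN output has a "partitions" column; older versions
-- need EXPLAIN PARTITIONS instead.
EXPLAIN SELECT username, joined
FROM members
WHERE joined >= '1985-01-01' AND joined < '1986-01-01';

-- Split the catch-all partition to add a new decade. ADD PARTITION is
-- not allowed after a MAXVALUE partition, so REORGANIZE PARTITION is used.
ALTER TABLE members REORGANIZE PARTITION p4 INTO (
    PARTITION p4 VALUES LESS THAN (2000),
    PARTITION p5 VALUES LESS THAN MAXVALUE
);

-- Approximate row counts per partition.
SELECT PARTITION_NAME, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'members';
```

If pruning works as intended, the EXPLAIN output for the 1985 range should list only partition p3.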

Vertical Sharding

Split a table by columns, separating frequently accessed columns from rarely accessed ones, which reduces row size and improves cache utilization. Drawbacks include a redundant primary key in each table and extra JOINs; vertical sharding also does not remove the need for horizontal sharding when a table has too many rows.
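
A minimal sketch of a vertical split, assuming a hypothetical article table whose large body column is read far less often than its other columns (all names are illustrative, not from the original):

```sql
-- Hot columns that most queries read.
CREATE TABLE article (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    author_id  INT UNSIGNED NOT NULL,
    title      VARCHAR(120) NOT NULL,
    created_at TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id),
    KEY idx_article_author (author_id)
);

-- Cold, wide column moved to an extension table that repeats the primary key.
CREATE TABLE article_body (
    article_id INT UNSIGNED NOT NULL,
    body       MEDIUMTEXT   NOT NULL,
    PRIMARY KEY (article_id)
);

-- List pages read only the narrow table.
SELECT id, title, created_at
FROM article
WHERE author_id = 42
ORDER BY id DESC
LIMIT 20;

-- Detail pages pay for one extra JOIN.
SELECT a.title, b.body
FROM article AS a
JOIN article_body AS b ON b.article_id = a.id
WHERE a.id = 1001;
```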

Horizontal Sharding

Distribute rows across multiple tables or databases based on a sharding key, achieving true distributed scaling. Table partitioning is a special case of horizontal sharding within a single server.

Sharding principles:

Only shard when necessary; keep shard count low and evenly distributed.

Choose sharding rules based on data growth, access patterns, and future expansion (range, list, consistent hash).

Avoid cross‑shard transactions; keep queries simple and indexed.

Consider data redundancy to reduce cross‑shard joins.
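
The principles above can be illustrated with simple modulo routing on a sharding key. This is a sketch of one possible rule, not the consistent-hash or range rules also mentioned, and the orders_N tables, the shard count of four, and the sample values are all assumptions made for the example.

```sql
-- Four shard tables with identical structure. They are shown on one
-- server here; in practice they would usually live in different
-- databases or on different hosts.
CREATE TABLE orders_0 (
    id      BIGINT UNSIGNED NOT NULL,   -- generated outside MySQL so it is unique across shards
    user_id INT UNSIGNED    NOT NULL,   -- sharding key
    amount  INT UNSIGNED    NOT NULL,
    PRIMARY KEY (id),
    KEY idx_user (user_id)
);
CREATE TABLE orders_1 LIKE orders_0;
CREATE TABLE orders_2 LIKE orders_0;
CREATE TABLE orders_3 LIKE orders_0;

-- Routing rule: shard number = user_id modulo shard count (10007 MOD 4 = 3).
SELECT 10007 MOD 4 AS shard_no;

-- The application or proxy substitutes the shard number into the table
-- name, so every order of user 10007 is written to and read from orders_3.
INSERT INTO orders_3 (id, user_id, amount) VALUES (900000001, 10007, 2500);
SELECT id, amount FROM orders_3 WHERE user_id = 10007;
```

Queries that include the sharding key touch a single shard; queries without it have to visit every shard, which is why the choice of key should follow the dominant access pattern.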

Solutions

Two main architectures:

Client-side sharding: modify JDBC/MyBatis configuration to manage multiple data sources; low deployment cost but adds load to application servers.

Proxy-side sharding: deploy an independent middleware (e.g., MySQL-Fabric, Cobar, Atlas) that routes queries; higher operational cost but better scalability and transparency.

Comparison of Solutions

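| Product      | Provider  | Model  | Supports MySQL | Sharding | Read-Write Split | Open-Source | Language |
|--------------|-----------|--------|----------------|----------|------------------|-------------|----------|
| ShardingJDBC | Dangdang  | Client | Yes            | Yes      | Yes              | Yes         | Java     |
| MyCat        | Community | Proxy  | Yes            | Yes      | Yes              | Yes         | Java     |
| Atlas        | Qihoo 360 | Proxy  | Yes            | Yes      | Yes              | Yes         | C        |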

Additional rows omitted for brevity

Recommended choices: ShardingJDBC for client‑side, MyCat or Atlas for proxy‑side.

MySQL‑Compatible Horizontally Scalable Databases

TiDB

Cubrid

Cloud offerings: Alibaba Cloud PetaData, Alibaba Cloud OceanBase, Tencent Cloud DCDB.

NoSQL Alternatives

Data that is a good fit for a NoSQL store includes:

Log, monitoring, and statistical data.

Unstructured or weakly structured data.

Data with low transactional requirements and few relationships.

For such workloads, a NoSQL store with built-in horizontal scaling can avoid most of the sharding challenges described above.

Written by

Java Architect Essentials
