Databases 18 min read

MySQL Optimization Guide: Storage Engines, Data Types, Index Strategies, and Query Tuning

This article provides a comprehensive collection of MySQL optimization techniques—including storage engine characteristics, data type selection, index design, and query rewriting—targeted at teams without a dedicated DBA, using MySQL 5.5 and InnoDB as the reference platform.

Top Architect
Top Architect
Top Architect
MySQL Optimization Guide: Storage Engines, Data Types, Index Strategies, and Query Tuning

MySQL Characteristics

Understanding MySQL’s unique storage‑engine architecture helps you choose the right engine for your workload; InnoDB is the default engine in MySQL 5.5.

InnoDB Features

Supports transactions

Row‑level locking

Data stored in tablespaces composed of multiple files

Multi‑Version Concurrency Control (MVCC) for high concurrency

Clustered index on the primary key

Supports hot backup

Other Common Engines

MyISAM – no transactions, table‑level lock, supports full‑text index

Archive – insert‑only, fast bulk inserts

CSV – treats each CSV file as a table

Memory – stores data in RAM

Data Type Optimization

Guidelines: use the smallest space‑efficient type, prefer simple types, avoid unnecessary nullable columns.

Integer Types

tinyint, smallint, mediumint, int, bigint use 8‑64 bits respectively; unsigned doubles the positive range.

Decimal Types

Precise but space‑heavy; decimal(18,9) stores 18 digits in 9 bytes. For large volumes, consider using bigint with scaling.

String Types

varchar stores length + data (1‑2 bytes for length), ideal for variable‑length strings; char is fixed‑length, better for short uniform strings; varbinary/binary for binary data; blob/text for large objects; enum for low‑cardinality repeated values.

Time Types

year, date, time, datetime, timestamp – timestamp stores seconds since 1970 and auto‑updates on INSERT/UPDATE.

Special Types

Store IPv4 addresses as unsigned int and use inet_aton()/inet_ntoa() for conversion.

Index Optimization

InnoDB uses B+‑tree indexes. Index column order matters: leftmost columns are used first.

CREATE TABLE `people` (
  `Id` int(11) NOT NULL AUTO_INCREMENT,
  `Name` varchar(5) NOT NULL,
  `Age` tinyint(4) NOT NULL,
  `Number` char(5) NOT NULL COMMENT '编号',
  PRIMARY KEY (`Id`),
  KEY `i_name_age_number` (`Name`,`Age`,`Number`)
) ENGINE=InnoDB AUTO_INCREMENT=14 DEFAULT CHARSET=utf8;

Good SQL examples (use leftmost index columns):

SELECT * FROM people WHERE Name='Abel' AND Age=2 AND Number=12312;

SELECT * FROM people WHERE Name='Abel';

SELECT * FROM people WHERE Name LIKE 'Abel%';

SELECT * FROM people WHERE Name='Andy' AND Age BETWEEN 11 AND 20;

SELECT * FROM people ORDER BY Name;

SELECT * FROM people ORDER BY Name, Age;

SELECT * FROM people GROUP BY Name;

Bad examples (skip leftmost columns or use leading wildcards):

SELECT * FROM people WHERE Age=2;

SELECT * FROM people WHERE Name LIKE '%B';

SELECT * FROM people WHERE NAME='ABC' AND number=3;

SELECT * FROM people WHERE NAME LIKE 'B%' AND age=22;

Hash Index Trick

Create a numeric hash column (e.g., URL_CRC) for long string columns and index it:

SELECT * FROM t WHERE URL_CRC = 387695885 AND URL = 'www.baidu.com';

Custom hash using MD5:

SELECT CONV(RIGHT(MD5('www.baidu.com'),16),16,10);

Prefix Index

Index only the first N characters of a long string column to reduce index size; not usable for ORDER BY/GROUP BY.

Multi‑Column Index

Combine columns that are frequently queried together; place the most selective column first unless query patterns dictate otherwise.

SELECT * FROM t WHERE f1='v1' AND f2 <> 'v2' UNION ALL SELECT * FROM t WHERE f2='v2' AND f1 <> 'v1';

Clustered Index

InnoDB stores the full row in the leaf nodes of the primary‑key index; only one clustered index per table.

Covering Index

If a query can be satisfied entirely from the index, the table rows are never read.

Redundant / Duplicate Indexes

Avoid creating separate indexes that are subsets of existing ones; only create a new index if it adds columns that improve selectivity.

Query Optimization

Common reasons for slow queries:

Fetching more rows than needed.

Fetching more columns than needed (SELECT *).

Repeated identical queries without caching.

Scanning extra rows due to missing or poorly ordered indexes.

Refactoring Queries

Break complex queries into simpler ones.

Split large queries into smaller batches.

Replace subqueries with joins when possible.

COUNT Optimization

Use approximate row counts from the execution plan, covering indexes, summary tables, or in‑memory caches instead of full COUNT(*) scans.

SELECT COUNT(name LIKE 'B%') FROM people;

Existence Check

Prefer a LIMIT‑1 subquery wrapped with IFNULL:

SELECT IFNULL((SELECT 1 FROM tableName WHERE condition LIMIT 1),0);

LIMIT Optimization

Avoid scanning rows that will be discarded; use a subquery to fetch primary keys first and then join:

SELECT * FROM sa_stockinfo ORDER BY StockAcc LIMIT 400,5;

Optimized version:

SELECT * FROM sa_stockinfo i JOIN (
  SELECT StockInfoID FROM sa_stockinfo ORDER BY StockAcc LIMIT 400,5
) t ON i.StockInfoID = t.StockInfoID;

UNION Optimization

Use UNION ALL unless you need duplicate elimination.

References

High Performance MySQL

SQLIndexingInnoDBMySQLDatabase OptimizationData TypesQuery Tuning
Top Architect
Written by

Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.