Designing a High‑Performance System to Import 1 Billion Records into MySQL
To import one billion 1 KB log records stored in HDFS or S3 into MySQL as quickly as possible, the article analyzes constraints, evaluates single‑table limits, proposes sharding, batch inserts, storage‑engine choices, file‑reading strategies, task coordination, reliability mechanisms, and concurrency control.
This article revisits a Java interview question that asks how to insert 1 billion rows (each about 1 KB) as fast as possible into a MySQL database. The data are unstructured access logs stored in HDFS or S3, split into roughly 100 files, and must be imported in order with minimal duplication.
1. Can a single MySQL table hold 1 billion rows?
The answer is no – a single table is recommended to stay below roughly 20 million rows. MySQL's InnoDB uses a B+‑tree clustered index: with a 16 KB leaf page and a 1 KB row, each leaf page holds 16 rows. A 16 KB non‑leaf page stores a BigInt primary key (8 bytes) plus a page pointer (6 bytes) per entry, giving roughly 1,170 child pointers (16 × 1024 / 14). A three‑level tree therefore addresses about 1,170 × 1,170 × 16 ≈ 21.9 million rows; beyond that a fourth index level is needed, adding an extra page read per lookup and degrading performance.
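The arithmetic behind this limit can be checked directly; this sketch hard‑codes the figures from the text (16 KB pages, 1 KB rows, an 8‑byte key plus a 6‑byte pointer per non‑leaf entry):

```java
public class BPlusTreeCapacity {
    static final int PAGE_SIZE = 16 * 1024; // InnoDB default page size
    static final int ROW_SIZE = 1024;       // ~1 KB per log record
    static final int ENTRY_SIZE = 8 + 6;    // BigInt key + child-page pointer

    /** Rows addressable by a B+ tree of the given depth (leaf level included). */
    static long rowsAtDepth(int depth) {
        long rowsPerLeaf = PAGE_SIZE / ROW_SIZE; // 16 rows per leaf page
        long fanout = PAGE_SIZE / ENTRY_SIZE;    // ~1,170 pointers per non-leaf page
        long capacity = rowsPerLeaf;
        for (int level = 1; level < depth; level++) {
            capacity *= fanout;
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(rowsAtDepth(3)); // 21902400
    }
}
```

A depth‑3 tree tops out just under 22 million 1 KB rows, which is where the ~20 million row guideline comes from.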
2. Efficient database writes
Single‑row inserts are slow; batch inserts (e.g., 100 rows per batch) dramatically improve throughput. InnoDB guarantees atomicity of batch transactions. To keep inserts fast, write rows in primary‑key order, avoid non‑primary indexes during the bulk load, or create them after the data is loaded.
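As a sketch of what a multi‑row batch looks like on the wire; the table and column names (`access_log_0`, `id`, `payload`) are illustrative, and production code should instead prepare `INSERT INTO ... VALUES (?, ?)` once and use `addBatch()`/`executeBatch()` with parameterized values:

```java
import java.util.List;
import java.util.StringJoiner;

public class BatchInsert {
    /**
     * Builds one multi-row INSERT per batch. With the MySQL JDBC driver,
     * setting rewriteBatchedStatements=true makes executeBatch() produce
     * the same multi-row statement automatically.
     */
    static String buildInsert(String table, long firstId, List<String> rows) {
        StringJoiner values = new StringJoiner(", ");
        long id = firstId;
        for (String row : rows) {
            // Values are inlined for readability; real code must parameterize them.
            values.add("(" + id++ + ", '" + row + "')");
        }
        return "INSERT INTO " + table + " (id, payload) VALUES " + values;
    }

    public static void main(String[] args) {
        System.out.println(buildInsert("access_log_0", 1, List.of("a", "b")));
        // INSERT INTO access_log_0 (id, payload) VALUES (1, 'a'), (2, 'b')
    }
}
```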
Should we write concurrently to the same table?
No. Concurrent writes to a single table cannot guarantee insertion order. Instead, increase the batch size to raise throughput without parallel writes to the same table; parallelism comes from writing different tables on different shards concurrently.
3. Choosing a MySQL storage engine
MyISAM offers higher raw insert speed but lacks transaction support, risking duplicate rows on failure. InnoDB, with innodb_flush_log_at_trx_commit set to 0 or 2, can approach MyISAM performance while providing ACID guarantees. The article recommends using InnoDB unless the cluster disallows changing the flush policy.
4. Need for sharding
A single MySQL instance caps at about 5 K TPS. SSDs push that ceiling higher, but HDDs cannot sustain many concurrent writes because a single read/write head serializes seeks. The design should therefore make the number of databases and tables configurable, adapting to the underlying storage.
5. Maintaining order during import
Assign each of the 100 files a unique database and table (e.g., index_90.txt → database_9.table_0 ). The combination of file suffix and line number forms a unique, ordered primary key.
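A minimal routing sketch under the naming scheme above, assuming the 100 files (index_00.txt .. index_99.txt) are spread over 10 databases × 10 tables so the two suffix digits map directly to database and table:

```java
public class ShardRouter {
    /**
     * Maps a file such as index_90.txt to database_9.table_0: the first
     * suffix digit selects the database, the second selects the table.
     */
    static String route(String fileName) {
        String digits = fileName.replaceAll("\\D", ""); // "index_90.txt" -> "90"
        int database = Character.getNumericValue(digits.charAt(0));
        int table = Character.getNumericValue(digits.charAt(1));
        return "database_" + database + ".table_" + table;
    }

    public static void main(String[] args) {
        System.out.println(route("index_90.txt")); // database_9.table_0
    }
}
```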
6. Faster file reading
A 10 GB file cannot be loaded into memory in one go. Several Java approaches were benchmarked (Files.readAllBytes, BufferedReader, Scanner, NIO FileChannel). Java NIO with ByteBuffer is the fastest but does not preserve line boundaries; BufferedReader offers acceptable performance (~30 s for 10 GB) plus natural line handling, making it the preferred method.
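A minimal sketch of the preferred BufferedReader approach, which streams the file line by line with bounded memory use (a temporary file stands in here for the 10 GB log file):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class LineReaderDemo {
    /** Streams a file line by line; memory use is bounded by the buffer, not the file. */
    static long countLines(Path file) throws IOException {
        long lines = 0;
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            while (reader.readLine() != null) {
                lines++; // in the importer, each line becomes one record in a batch
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("access-log", ".txt");
        Files.write(tmp, List.of("row1", "row2", "row3"));
        System.out.println(countLines(tmp)); // 3
        Files.delete(tmp);
    }
}
```

For comparison, the raw NIO FileChannel variant below is faster but only yields byte buffers, not lines.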
File file = new File("/xxx.zip");
long now = System.currentTimeMillis();
try (FileInputStream fileInputStream = new FileInputStream(file);
     FileChannel fileChannel = fileInputStream.getChannel()) {
    int capacity = 1 * 1024 * 1024; // 1M read buffer
    ByteBuffer byteBuffer = ByteBuffer.allocate(capacity);
    long size = 0;
    int read;
    while ((read = fileChannel.read(byteBuffer)) != -1) {
        size += read;           // count only the bytes actually read
        byteBuffer.clear();     // reset the buffer for the next read
    }
    System.out.println("file size:" + size);
} catch (IOException e) {
    e.printStackTrace();
}
System.out.println("Time:" + (System.currentTimeMillis() - now));

7. Coordinating read and write tasks
Attempting to separate readers and writers via Kafka broke ordering guarantees and introduced partition‑mixing problems. The final solution merges reading and writing in a single task: each task reads a batch, then writes it, preserving order without extra messaging.
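The merged read‑then‑write loop can be sketched as follows; the writer callback stands in for the batched INSERT described earlier:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class ReadWriteTask {
    /**
     * Reads up to batchSize records, writes them, then reads the next batch.
     * Because one task owns both steps, per-file insertion order is preserved
     * without any intermediate queue or messaging system.
     */
    static void run(Iterator<String> source, int batchSize, Consumer<List<String>> writer) {
        List<String> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            batch.add(source.next());
            if (batch.size() == batchSize) {
                writer.accept(List.copyOf(batch)); // e.g. one multi-row INSERT
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            writer.accept(List.copyOf(batch)); // flush the final partial batch
        }
    }
}
```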
8. Ensuring task reliability
Use a deterministic primary key composed of {taskId}{fileIndex}{fileRowNumber} to achieve idempotent inserts. Record progress in Redis with INCRBY task_offset_{taskId} 100 . On failure, retry the batch; after several failures, fall back to single‑row inserts and continue updating Redis. Optionally, consume MySQL binlog to keep Redis and the database in sync.
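A sketch of the deterministic key; the fixed field widths (two digits for the file index, nine for the row number) are assumptions chosen so 100 files × 10 million rows fit into one BigInt without collisions:

```java
public class ImportKeys {
    /**
     * Deterministic primary key {taskId}{fileIndex}{fileRowNumber}: re-running a
     * failed batch produces the same keys, so duplicates are rejected by the
     * primary-key constraint (idempotence). Field widths are illustrative.
     */
    static long primaryKey(int taskId, int fileIndex, long fileRowNumber) {
        return Long.parseLong(String.format("%d%02d%09d", taskId, fileIndex, fileRowNumber));
    }

    public static void main(String[] args) {
        System.out.println(primaryKey(1, 90, 100)); // 190000000100
        // After each successful batch of 100 rows, progress advances in Redis:
        //   INCRBY task_offset_{taskId} 100
    }
}
```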
INCRBY KEY_NAME INCR_AMOUNT

9. Controlling read‑task concurrency
Limit the number of active read‑write tasks per database using a distributed semaphore (Redisson). Each node acquires a permit before processing a task.
RedissonClient redissonClient = Redisson.create(config);
RSemaphore rSemaphore = redissonClient.getSemaphore("semaphore");
// set concurrency to 1
rSemaphore.trySetPermits(1);
rSemaphore.tryAcquire(); // non‑blocking acquire

Because Redisson semaphores lack renewal, the design switches to a leader‑election model using Zookeeper + Curator. The leader assigns tasks, publishes Kafka messages, and workers acquire a distributed lock (with renewal) before processing.
10. Summary
Clarify constraints before designing a solution.
Large data volume requires sharding and careful table sizing.
Single‑instance write bottlenecks dictate the need for multiple databases.
Limit concurrent writes per database and allow dynamic tuning.
Compare MySQL storage engines (InnoDB vs MyISAM) in production.
Determine optimal batch size through testing.
Combine read and write tasks to avoid ordering issues.
Use Redis to track progress and guarantee idempotency.
Employ Zookeeper‑based leader election and distributed locks to coordinate tasks safely.
Selected Java Interview Questions