Databases 8 min read

Introducing ClickHouse: Columnar Database Overview, MySQL Migration, Performance Comparison, and Practical Tips

This article introduces ClickHouse, explains its column‑oriented architecture versus row‑oriented databases, details a MySQL‑to‑ClickHouse migration, shows a performance benchmark that reduces query time from minutes to seconds, and shares practical deployment tips and common pitfalls.

Architect's Tech Stack
Architect's Tech Stack
Architect's Tech Stack
Introducing ClickHouse: Columnar Database Overview, MySQL Migration, Performance Comparison, and Practical Tips

1. What is ClickHouse?

ClickHouse is an open‑source, column‑store database from Yandex designed for real‑time analytics, offering processing speeds 100‑1000 times faster than traditional methods. Its performance surpasses comparable columnar DBMSs, handling billions of rows and tens of gigabytes per server per second.

We first clarify some basic concepts

OLTP: Traditional relational databases focused on insert, update, delete, and query operations with strong transaction consistency (e.g., banking, e‑commerce systems).

OLAP: Warehouse‑type databases aimed at reading data for complex analysis, supporting decision‑making with simple, intuitive results.

Next we use diagrams to understand the difference between columnar and row‑oriented databases.

In traditional row‑oriented DB systems (MySQL, PostgreSQL, MS SQL Server), data is stored row by row:

In columnar DB systems (ClickHouse), data is stored column by column:

Comparison of storage methods:

The above is a basic introduction to ClickHouse; see the official documentation for more details.

https://clickhouse.tech/docs/zh/

2. Business Problem

The business data resides in MySQL with a 50 million‑row main table and two auxiliary tables; a single join query takes over 3 minutes. After indexing, sharding, and logical optimization, performance remained poor, prompting a migration to ClickHouse.

Result: query time reduced to under 1 second, achieving a 200× speedup.

We hope this article helps readers quickly master this powerful tool and avoid common pitfalls.

3. ClickHouse Practice

1. Installing ClickHouse on macOS

Installed via Docker; alternatively, you can compile from source, which is more cumbersome.

2. Data Migration: MySQL to ClickHouse

ClickHouse supports most MySQL syntax, making migration low‑cost. Five migration approaches are available:

create table engine mysql – mapping; data stays in MySQL.

insert into select from – create table first, then import.

create table as select from – create and import simultaneously.

CSV offline import.

streamsets.

We chose the third method:

CREATE TABLE [IF NOT EXISTS] [db.]table_name ENGINE = Mergetree AS SELECT * FROM mysql('host:port','db','database','user','password')

3. Performance Test Comparison

Type

Data Volume

Table Size

Query Speed

MySQL

50 million

10 GB

205 s

ClickHouse

50 million

600 MB

under 1 s

4. Data Synchronization方案

Temporary Table

Create a temp intermediate table, fully sync MySQL data into the ClickHouse temp table, then replace the original ClickHouse table; suitable for moderate data volumes with frequent incremental updates.

synch

Open‑source synchronization tool recommended: synch.

https://github.com/long2ice/synch/blob/dev/README-zh.md

The principle is to read MySQL binlog, extract SQL statements, and consume tasks via a message queue.

5. Why is ClickHouse Fast?

Only reads the columns needed for computation, reducing I/O.

Same‑type columns enable up to ten‑fold compression, further lowering I/O.

ClickHouse applies specialized search algorithms tailored to storage scenarios.

4. Pitfalls Encountered

1. Data Type Differences Between ClickHouse and MySQL

Running MySQL queries caused errors:

Solution: use LEFT JOIN B b ON toUInt32(h.id) = toUInt32(ec.post_id) to unify unsigned types.

2. Delete or Update Operations Are Asynchronous, Providing Only Eventual Consistency

The ClickHouse manual shows that even the Mergetree engine guarantees only eventual consistency.

If strong consistency is required, a full data sync is recommended.

5. Summary

Through the ClickHouse practice, we completely resolved the MySQL query bottleneck; queries on datasets up to 2 billion rows can return results within 1 second, and ClickHouse also supports clustering for larger scales.

Data MigrationPerformanceSQLClickHouseMySQLData SynchronizationColumnar Database
Architect's Tech Stack
Written by

Architect's Tech Stack

Java backend, microservices, distributed systems, containerized programming, and more.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.