Databases · 7 min read

Debugging ClickHouse Source Code with GDB: A Step‑by‑Step Guide

This article documents the author's first experience debugging ClickHouse source code with GDB, explains the memory‑limit issue encountered when importing large MySQL tables, shows how to inspect stack frames, adjust the max_block_size setting, and verify the fix with a test insert.

Aikesheng Open Source Community

Background – The author records their first attempt at using GDB to debug ClickHouse source code, aiming to understand internal mechanisms and resolve a memory‑limit error that occurs when importing a large MySQL table via ClickHouse’s mysql table function.

Debugging Issue – Importing ~500,000 rows (≈56 GB) exceeds the server's 36 GB memory limit, triggering a Memory limit (total) exceeded exception. The problem stems from ClickHouse buffering rows in memory before batch-writing them, with a buffer threshold of roughly 1,000,000 rows.
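A rough back-of-the-envelope calculation shows why the default million-row batch cannot fit in 36 GB while a 10,000-row batch easily can. This sketch assumes the ~51 GB processed volume reported by the final test run and a uniform row size, both simplifications:

```shell
# ~500,000 rows totalling ~51 GB works out to roughly 100 KB per row.
rows=500000
total_mb=$((51 * 1024))             # ~51 GB expressed in MB
row_kb=$((total_mb * 1024 / rows))  # average KB per row

# Buffering ~1,000,000 rows would need more than the whole table
# (> 51 GB), far above the 36 GB limit; a 10,000-row batch needs
# only about 1 GB of peak buffer.
batch_rows=10000
batch_mb=$((batch_rows * row_kb / 1024))
echo "per-row ~${row_kb} KB, 10k-row batch ~${batch_mb} MB"
```

The exact per-row size varies in practice; the point is the two-orders-of-magnitude drop in peak buffer memory.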

Stack Frame Printing – The author installs the clickhouse-common-static-dbg RPM matching the ClickHouse version (e.g., 20.8.12.2) and uses pstack to capture the process stack, saving it to /opt/ck_pstack.log.
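The symbol install and stack capture from this step can be sketched as follows. The RPM file name and process lookup are illustrative; match the debug package exactly to your running ClickHouse version:

```shell
# install debug symbols matching the running server version (example version)
rpm -ivh clickhouse-common-static-dbg-20.8.12.2-2.x86_64.rpm

# capture the full thread stacks of the running server to a log file
pstack $(pidof clickhouse-server) > /opt/ck_pstack.log
```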

GDB Debugging – Using CGDB on CentOS 7.9, the steps are:

Open a terminal with clickhouse-client and wait for an INSERT.

Attach CGDB to the ClickHouse PID, set a breakpoint at DB::SourceFromInputStream::generate, and configure signal handling to ignore SIGUSR1 and SIGUSR2.

Execute the INSERT in the client window.

Continue execution in CGDB to hit the breakpoint, then step through the code and inspect parameters.

(gdb) att 1446
(gdb) handle SIGUSR2 noprint nostop
(gdb) handle SIGUSR1 noprint nostop
(gdb) b DB::SourceFromInputStream::generate
Breakpoint 1 at 0x16610f40: file ../src/Processors/Sources/SourceFromInputStream.cpp, line 135.

max_block_size Investigation – While stepping, the author observes that once the number of rows read reaches the block-size threshold (value observed: 1048545), ClickHouse exits the read loop, writes the batch, and frees the memory. For INSERTs this threshold is governed by the min_insert_block_size_rows setting.
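To confirm which thresholds are in effect before changing anything, both settings can be read from the system.settings table. A sketch, assuming a local server reachable by clickhouse-client:

```shell
clickhouse-client --query "
    SELECT name, value
    FROM system.settings
    WHERE name IN ('max_block_size', 'min_insert_block_size_rows')"
```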

Testing the Fix – Setting min_insert_block_size_rows = 10000 (10,000 rows) and re-running the INSERT completes the operation without memory errors, confirming that lowering the batch size keeps memory usage under the limit.

localhost :) set min_insert_block_size_rows = 10000;

localhost :) insert into `test`.`memory_test`  select * from mysql('192.168.213.222:3306','test','memory_test','root','xxxx');

INSERT INTO test.memory_test SELECT * FROM mysql('192.168.213.222:3306', 'test', 'memory_test', 'root', 'xxxx')
Ok.

0 rows in set. Elapsed: 2065.189 sec. Processed 500.00 thousand rows, 51.23 GB (242.11 rows/s., 24.81 MB/s.)

After the test, SHOW PROCESSLIST confirms that ClickHouse flushes a write after every 10,000 rows, eliminating the memory-exhaustion problem.

Tags: debugging, performance, database, configuration, ClickHouse, GDB, max_block_size
Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise-grade MySQL open-source tools and services, releases a premium open-source component each year on "1024" (Programmers' Day), and continuously operates and maintains them.
