Using ClickHouse for Time‑Series Data Management and Analysis in JD.com JUST Platform
This article explains how JD.com’s JUST platform leverages the open‑source columnar database ClickHouse to store, query and analyze massive time‑series data, covering data modeling, lifecycle management, system goals, technology selection, cluster architecture, deployment, scaling and future enhancements.
ClickHouse is an open‑source, column‑oriented analytical database originally developed by Yandex.
Time‑series data consists of metrics, timestamps, tags and values; typical use cases include monitoring device status, environmental measurements, financial tick data, and more.
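The data model described above can be sketched in a few lines of Python. This is only an illustration: the class name `Point`, the helper `series_key`, the field names and the choice of epoch seconds are assumptions made for the example, not part of JUST or ClickHouse.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Point:
    """One time-series record: a metric name, a timestamp, tags and values."""
    metric: str
    timestamp: int  # Unix epoch seconds (assumed unit for this sketch)
    tags: Dict[str, str] = field(default_factory=dict)      # descriptive dimensions
    values: Dict[str, float] = field(default_factory=dict)  # measured numbers


def series_key(p: Point) -> str:
    """Identify the series a point belongs to: the metric plus its sorted tags."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(p.tags.items()))
    return f"{p.metric}{{{tag_part}}}"


# Hypothetical device-monitoring sample.
p = Point(
    metric="device_status",
    timestamp=1700000000,
    tags={"region": "north", "device": "d1"},
    values={"temp": 21.5},
)
print(series_key(p))
```

Points that share a metric and a tag set form one series; the timestamp and values are what change from record to record.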
The article outlines the full lifecycle of time‑series data (collection, storage, query, analysis and deletion) and derives the system goals from it: high‑throughput writes, immutable records, petabyte‑scale storage, fast real‑time queries, high availability, scalability and ease of use.
After comparing several open‑source time‑series databases (OpenTSDB, InfluxDB, TDengine), the authors select ClickHouse for its superior compression, columnar storage and parallel query engine. The reported performance figures are write speeds of up to 200 MB/s and aggregation over millions of rows per second.
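One reason columnar storage tends to compress time-series data well is that each column holds similar values next to each other, so simple transforms such as delta encoding leave very repetitive data for a general-purpose compressor. The sketch below illustrates that idea with Python's `zlib`; it is not ClickHouse's codec implementation, and the function names and the 10-second sampling interval are assumptions for the example.

```python
import struct
import zlib
from typing import List


def delta_encode(values: List[int]) -> List[int]:
    """Keep the first value, then store each value as the difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Invert delta_encode by accumulating a running sum."""
    out: List[int] = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def compressed_size(values: List[int]) -> int:
    """Pack values as little-endian signed 64-bit integers and return the zlib-compressed size."""
    raw = struct.pack(f"<{len(values)}q", *values)
    return len(zlib.compress(raw))


# A timestamp column sampled every 10 seconds (hypothetical data).
timestamps = [1700000000 + 10 * i for i in range(1000)]
deltas = delta_encode(timestamps)

raw_size = compressed_size(timestamps)
delta_size = compressed_size(deltas)
print(raw_size, delta_size)
```

After the first entry every delta is the same small number, which is why the delta-encoded column compresses to far fewer bytes than the raw one.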
ClickHouse’s architecture is described: column files, ReplicatedMergeTree engine, distributed tables, sharding, replicas, multi‑master mode and the role of ZooKeeper for metadata synchronization.
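To make the sharding and replica terms concrete, here is a minimal routing sketch. It assumes a hypothetical two-shard cluster with two replicas per shard and uses CRC32 of a series key as the sharding function; the node names, the hash choice and the function names are illustrative assumptions, not the configuration used in JUST.

```python
import zlib
from typing import Dict, List

# Hypothetical topology: each inner list is one shard, its entries are that shard's replicas.
CLUSTER: List[List[str]] = [
    ["node-a1", "node-a2"],
    ["node-b1", "node-b2"],
]


def shard_for(series_key: str, shard_count: int) -> int:
    """Map a series key to a shard index with a stable hash (CRC32 here, by assumption)."""
    return zlib.crc32(series_key.encode("utf-8")) % shard_count


def route_write(series_key: str, cluster: List[List[str]]) -> List[str]:
    """Return the replicas that should end up holding rows for this series."""
    return cluster[shard_for(series_key, len(cluster))]


def group_by_shard(keys: List[str], cluster: List[List[str]]) -> Dict[int, List[str]]:
    """Group series keys by the shard index they map to."""
    groups: Dict[int, List[str]] = {}
    for key in keys:
        groups.setdefault(shard_for(key, len(cluster)), []).append(key)
    return groups


keys = [f"device_status{{device=d{i}}}" for i in range(8)]
print(group_by_shard(keys, CLUSTER))
```

Each row belongs to exactly one shard, and every replica of that shard holds a copy. In multi‑master mode a write can be sent to any replica of the shard, with replication carrying it to the others.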
Typical DDL for a time‑series table in JUST is shown:
create table my_ts_table as ts (
    tag1 string,
    tag2 string
) (
    value1 double,
    value2 double
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/...','{replica}')
PARTITION BY toYYYYMM(time)
ORDER BY (time);
Cluster deployment strategies (2‑node, cross‑replica, primary‑secondary) and scaling formulas are discussed, as well as dynamic expansion of shards and replicas without downtime.
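The `PARTITION BY toYYYYMM(time)` clause in the DDL groups rows by calendar month: `toYYYYMM` turns a date or timestamp into the integer year * 100 + month. The following sketch mimics that mapping in Python to show which partition a row would fall into. It assumes Unix timestamps interpreted in UTC, whereas ClickHouse applies the time zone of the column or server, and the helper names are made up for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, List


def to_yyyymm(ts: int) -> int:
    """Return year * 100 + month for a Unix timestamp, interpreted in UTC (an assumption)."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return dt.year * 100 + dt.month


def partition_rows(timestamps: Iterable[int]) -> Dict[int, List[int]]:
    """Group timestamps by monthly partition key, as a monthly PARTITION BY would."""
    parts: Dict[int, List[int]] = defaultdict(list)
    for ts in timestamps:
        parts[to_yyyymm(ts)].append(ts)
    return dict(parts)


# 1700000000 falls in November 2023; 1704067200 is 2024-01-01 00:00:00 UTC.
rows = [1700000000, 1700000600, 1704067200]
print(partition_rows(rows))
```

Monthly partitions suit the deletion stage of the lifecycle: expiring old data amounts to dropping whole partitions rather than deleting individual rows.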
The system currently provides basic time‑range queries, tag filtering, down‑sampling and simple analytics, and future work includes real‑time ingestion, advanced aggregation panels, richer analysis functions and full SQL compatibility.
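Down‑sampling, mentioned above, reduces a dense series to one aggregated value per fixed time bucket. A minimal sketch with mean aggregation follows; the function name, the `(timestamp, value)` tuple layout and the 60‑second interval are assumptions for illustration, and the source does not say how JUST implements it.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def downsample_mean(
    points: Iterable[Tuple[int, float]], interval: int
) -> List[Tuple[int, float]]:
    """Average values per bucket of `interval` seconds; buckets start at multiples of interval."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % interval].append(value)  # align timestamp to bucket start
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Five raw samples reduced to one point per minute.
raw = [(0, 1.0), (20, 3.0), (60, 10.0), (90, 20.0), (125, 7.0)]
result = downsample_mean(raw, 60)
print(result)  # [(0, 2.0), (60, 15.0), (120, 7.0)]
```

The same shape works for other aggregates (min, max, sum) by swapping the expression applied to each bucket.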
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.