GreptimeDB Distributed Architecture, Transparent Caching, and Flow‑Based Real‑Time Analytics
GreptimeDB solves front‑end observability challenges with a distributed architecture (frontend, datanode, flownode, metasrv), transparent two‑level caching, elastic scaling, and an SQL‑based flow engine for real‑time multi‑granularity aggregation and approximate counting, delivering millisecond query latency and cost‑effective storage.
This article addresses the pain points of front‑end observability—high‑frequency data ingestion, multi‑granularity aggregation, and resource inefficiency—and proposes GreptimeDB as an end‑to‑end solution.
Deployment architecture: GreptimeDB consists of four node types: Frontend (stateless request handling), Datanode (data storage and query), Flownode (stream processing), and Metasrv (metadata service). Regions (table shards) can be migrated between datanodes using the migrate_region function, enabling elastic scaling.
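As a sketch of what a region move can look like (the region and peer IDs below are made-up placeholders; the argument order follows GreptimeDB's admin-function documentation):

```sql
-- Inspect current region placement; information_schema exposes
-- region-to-datanode mapping in GreptimeDB.
SELECT region_id, peer_id, is_leader
FROM information_schema.region_peers;

-- Move region 4398046511104 from datanode 1 to datanode 2,
-- allowing up to 60 seconds for replay on the target node.
ADMIN migrate_region(4398046511104, 1, 2, 60);
```

Because the Frontend is stateless, clients are unaffected while the region is replayed on its new datanode.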
Transparent data caching: GreptimeDB abstracts the storage layer and provides a two-level cache. The disk cache acts as a page cache for object-storage files, while the memory cache stores both raw file bytes and deserialized structures (e.g., min/max indexes, Bloom filters). This reduces latency for hot objects.
Fearless scaling: The system is cloud-native; Metasrv collects per-node load, and the scheduler (e.g., Kubernetes) can adjust replica counts for read-write separation. Region migration and load-aware traffic routing ensure high availability.
GreptimeDB Flow in practice: Flow is a lightweight SQL-based stream engine for time series. It supports continuous multi-granularity aggregation:
10‑second hot table:
CREATE FLOW rpc_cost_10s
SINK TO rpc_cost_10s_agg
EXPIRE AFTER '12hours'::INTERVAL
AS SELECT app_name, url,
date_bin('10s'::INTERVAL, timestamp) AS time_window,
uddsketch(cost_time_ms, 0.01, 0.001) AS cost_sketch
FROM rpc_cost_time
GROUP BY app_name, url, date_bin('10s'::INTERVAL, timestamp);
1-minute roll-up:
CREATE FLOW rpc_cost_1m
SINK TO rpc_cost_1m_agg
EXPIRE AFTER '30days'::INTERVAL
AS SELECT app_name, url,
date_bin('1m'::INTERVAL, time_window) AS time_window_1m,
uddsketch_merge(cost_sketch) AS cost_sketch_1m
FROM rpc_cost_10s_agg
GROUP BY app_name, url, date_bin('1m'::INTERVAL, time_window);
10-minute cold table:
CREATE FLOW rpc_cost_10m
SINK TO rpc_cost_10m_agg
EXPIRE AFTER '180days'::INTERVAL
AS SELECT app_name, url,
date_bin('10m'::INTERVAL, time_window_1m) AS time_window_10m,
uddsketch_merge(cost_sketch_1m) AS cost_sketch_10m
FROM rpc_cost_1m_agg
GROUP BY app_name, url, date_bin('10m'::INTERVAL, time_window_1m);
UV approximate counting with HyperLogLog: Flow can compute per-window unique-visitor estimates using hll and query them with hll_count:
CREATE FLOW uv_hll_10s
SINK TO uv_state_10s
EXPIRE AFTER '12hours'::INTERVAL
AS SELECT app_name, url,
date_bin('10s'::INTERVAL, ts) AS time_window,
hll(user_id) AS uv_state
FROM access_log
GROUP BY app_name, url, date_bin('10s'::INTERVAL, ts);
SELECT app_name, url, hll_count(uv_state) AS uv_count
FROM uv_state_10s
WHERE time_window = 1743479260;
Benefits: Pre-aggregation and multi-level roll-up cut query latency from seconds to milliseconds; tiered table retention (10 s data kept 1 day, 1 m roll-ups 7 days, 10 m roll-ups 180 days) controls storage cost; independent scaling of Frontend, Flownode, and Datanode decouples resources; and using standard SQL lowers the learning curve.
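To show how the pre-aggregated sketch columns are consumed, here is a hedged query sketch. It assumes a uddsketch_calc(percentile, state)-style extraction function as described in GreptimeDB's function reference; the table and column names follow the flows above, and the one-hour window is an arbitrary example:

```sql
-- p95 RPC latency per endpoint over the last hour, read from the
-- 1-minute roll-up table. uddsketch_merge re-combines the per-window
-- sketch states before the percentile is extracted (function names
-- are assumptions based on GreptimeDB's UDDSketch docs).
SELECT app_name, url,
       uddsketch_calc(0.95, uddsketch_merge(cost_sketch_1m)) AS p95_ms
FROM rpc_cost_1m_agg
WHERE time_window_1m >= now() - '1hour'::INTERVAL
GROUP BY app_name, url;
```

Because sketches are mergeable, the same pattern works at any granularity: query the 10 s table for fresh data and the 10 m table for long-range trends.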
DeWu Technology