How FlinkSQL Optimizations Cut CPU Usage by Up to 60% in Streaming Jobs
This article details the FlinkSQL performance enhancements implemented by the streaming team: view reuse, redundant streaming-shuffle removal, a streaming multiple-input operator, long sliding-window optimization, and a native JSON format. Together these changes deliver up to 60% CPU savings on individual workloads and save over 100,000 core-hours.
Background and Motivation
Amid a push for cost reduction and efficiency, the streaming team optimized FlinkSQL to meet higher performance demands. Flink SQL is widely used internally, with over 30,000 streaming tasks consuming millions of cores.
1. Engine Optimizations
View Reuse
In streaming SQL, common logic is placed in logical views, which are not materialized. Multiple downstream branches cause the view’s computation to be duplicated, leading to unnecessary resource consumption. The root cause is that Calcite converts each view query into a separate RelNode tree, preventing reuse.
Solution: Modify Calcite's SqlToRel conversion to return a LogicalTableScan that references the catalog view, and store the CatalogView so all downstream operators share the same RelNode tree. This eliminates duplicate computation and yields a 20% CPU gain.
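The effect of sharing a single RelNode tree can be sketched as follows. This is a conceptual illustration only; the class and function names are made up for the example and are not Calcite or Flink APIs:

```python
# Conceptual sketch: why sharing one plan node for a view avoids
# duplicate computation. Names here are illustrative, not real APIs.

class ScanNode:
    """Stands in for a LogicalTableScan over a catalog view."""
    def __init__(self, compute):
        self.compute = compute
        self._cache = None

    def evaluate(self):
        # A shared node is computed once; its result is reused downstream.
        if self._cache is None:
            self._cache = self.compute()
        return self._cache

def build_plan(view_source, reuse_view):
    # Without reuse, each downstream branch gets its own copy of the
    # view subtree; with reuse, both branches point at the same node.
    if reuse_view:
        shared = ScanNode(view_source)
        return [shared, shared]
    return [ScanNode(view_source), ScanNode(view_source)]

expensive_calls = []
def view_source():
    expensive_calls.append(1)          # track duplicated work
    return [("k", 1), ("k", 2)]

for branch in build_plan(view_source, reuse_view=False):
    branch.evaluate()
print(len(expensive_calls))            # 2: view computed per branch

expensive_calls.clear()
for branch in build_plan(view_source, reuse_view=True):
    branch.evaluate()
print(len(expensive_calls))            # 1: shared node computed once
```

The same principle applies at plan-construction time: once both downstream branches reference one RelNode tree, the runtime executes the view logic once and fans out its output.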
Remove Redundant Streaming Shuffle
Streaming shuffle (exchange) incurs serialization and network overhead. When two operators share the same hash key, a second shuffle is unnecessary.
Approach: Extend batch‑shuffle optimization rules to streaming, checking node distribution requirements and inserting exchanges only when needed. This reduces extra hash shuffles, achieving a 24% CPU improvement.
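The core of such a rule is a distribution check: insert an exchange only when the input's existing partitioning does not already satisfy the downstream requirement. A minimal sketch, with illustrative names rather than Flink's actual planner classes:

```python
# Hedged sketch of the rule's core check (illustrative names): only
# insert a hash exchange when the input's existing distribution does
# not already satisfy the downstream node's requirement.

from dataclasses import dataclass

@dataclass(frozen=True)
class HashDistribution:
    keys: tuple  # fields the data is hash-partitioned on

def satisfies(actual, required):
    # A hash distribution satisfies the requirement when it partitions
    # on the required keys (simplified; real rules also handle
    # singleton, broadcast, and "any" distributions).
    return actual is not None and actual.keys == required.keys

def maybe_insert_exchange(plan, actual, required):
    if satisfies(actual, required):
        return plan                     # reuse upstream partitioning
    return ["HashExchange" + str(required.keys)] + plan

upstream = ["Join(a.id = b.id)"]
dist = HashDistribution(("id",))

# Downstream aggregation also keyed on id: no second shuffle needed.
print(maybe_insert_exchange(upstream, dist, HashDistribution(("id",))))

# Keyed on a different field: an exchange must be inserted.
print(maybe_insert_exchange(upstream, dist, HashDistribution(("user",))))
```

In the first case the join's hash partitioning on `id` is reused, so the serialization and network cost of a second shuffle disappears entirely.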
Streaming Multiple‑Input Operator
To avoid multiple shuffles in join‑agg‑union patterns, the team introduced a MultipleInput operator that merges several inputs into a single operator, handling state, timers, watermarks, and checkpoints collectively. This optimization provides a 10% CPU gain and resolves memory‑pressure issues.
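The fusion idea can be sketched as a single operator that dispatches records by input index and tracks a combined watermark, the way a chain of separate operators would. This is a simplified illustration, not Flink's actual MultipleInput interface:

```python
# Sketch of a multiple-input operator (illustrative, not Flink's API):
# operators that would each sit behind their own exchange are fused
# into one operator that dispatches records by input index, so
# watermarks (and, in the real system, state and checkpoints) are
# handled in one place.

class MultipleInputOperator:
    def __init__(self, process_fns):
        # One processing function per fused input.
        self.process_fns = process_fns
        self.input_watermarks = [float("-inf")] * len(process_fns)

    def process(self, input_index, record):
        return self.process_fns[input_index](record)

    def advance_watermark(self, input_index, wm):
        # The combined watermark is the minimum across all fused inputs.
        self.input_watermarks[input_index] = wm
        return min(self.input_watermarks)

op = MultipleInputOperator([lambda r: ("joined", r), lambda r: ("agg", r)])
print(op.process(0, 42))               # ('joined', 42)
print(op.advance_watermark(0, 10))     # -inf until every input advances
print(op.advance_watermark(1, 7))      # 7
```

Because the fused inputs live in one operator, records flow between the join, aggregation, and union stages without intermediate serialization or network hops.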
Long Sliding‑Window Optimization
Long sliding windows (e.g., 7-day or 30-day windows) suffer from high overhead due to frequent pane merges. By storing intermediate results in global state and adding a retractMerge method to aggregation functions, the system can retract expired panes and merge only newly completed ones, reducing pane processing by up to 66% and delivering a 60% CPU reduction.
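For an invertible aggregate such as SUM, the retract-and-merge idea looks roughly like the sketch below: keep the running aggregate in state, subtract the pane that fell out of the window, and add only the newly completed pane, instead of re-merging every pane on each slide. The names here (retract_merge) are illustrative, not Flink's actual aggregation interface:

```python
# Sketch of the retractMerge idea for an invertible aggregate (SUM).
# Instead of re-merging all panes whenever the window slides, keep the
# running aggregate in state, retract the expired pane, and merge only
# the new one. Illustrative only, not Flink's real interface.

from collections import deque

class SlidingSumWindow:
    def __init__(self, panes_per_window):
        self.panes = deque()        # panes currently inside the window
        self.acc = 0                # running aggregate kept in state
        self.size = panes_per_window

    def retract_merge(self, new_pane_sum):
        # Merge the newly completed pane...
        self.panes.append(new_pane_sum)
        self.acc += new_pane_sum
        # ...and retract the expired one, rather than summing all panes.
        if len(self.panes) > self.size:
            self.acc -= self.panes.popleft()
        return self.acc

# A 7-day window sliding by 1 day has 7 panes per window.
w = SlidingSumWindow(panes_per_window=7)
daily = [5, 3, 8, 2, 7, 1, 4, 6]        # daily pane sums
results = [w.retract_merge(d) for d in daily]
print(results[6])   # sum of first 7 days = 30
print(results[7])   # window slides: 30 - 5 + 6 = 31
```

Each slide now touches two panes (one retracted, one merged) instead of all seven, which is where the large reduction in pane processing comes from on long windows.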
2. Data‑Format Optimizations
Native JSON Format
Approximately 13,000 tasks (≈70% of core usage) rely on JSON deserialization. The team switched to a vectorized C++ JSON parser (sonic-cpp) and generated BinaryRowData directly, bypassing generic serialization. Tests show a 57% CPU gain for native JSON handling.
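The gain comes from writing parsed fields straight into a fixed binary row layout instead of building an intermediate generic row object. A minimal sketch of that idea, with a made-up schema and layout (and Python's json standing in for a vectorized parser like sonic-cpp):

```python
# Illustrative sketch: write parsed JSON fields directly into a fixed
# binary row layout (in the spirit of producing BinaryRowData), skipping
# an intermediate generic row object. The schema and layout are made up
# for the example; json.loads stands in for a vectorized C++ parser.

import json
import struct

ROW_FORMAT = "<q d 16s"   # int64 id, float64 score, fixed 16-byte name

def json_to_binary_row(raw: bytes) -> bytes:
    obj = json.loads(raw)
    # Truncate/pad the string field to its fixed-width slot.
    name = obj["name"].encode("utf-8")[:16].ljust(16, b"\x00")
    return struct.pack(ROW_FORMAT, obj["id"], obj["score"], name)

row = json_to_binary_row(b'{"id": 7, "score": 0.5, "name": "alice"}')
print(len(row))                        # fixed-width row: 8 + 8 + 16 = 32
uid, score, name = struct.unpack(ROW_FORMAT, row)
print(uid, score, name.rstrip(b"\x00").decode())   # 7 0.5 alice
```

Because the binary row is the engine's native in-memory format, downstream operators can read fields by offset with no further deserialization step.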
3. Tooling, Optimization, and Platform Layers
The tool layer provides real‑time SQL task metadata reporting, operator‑level monitoring, DAG compatibility checks, prioritized gray‑scale rollout, and data‑accuracy tracing, ensuring stable and accurate deployment of optimizations.
The optimization layer scales proven improvements across existing jobs and explores new techniques, while the engine & platform layer collaborates with product teams to enable default activation of vetted optimizations, delivering over 100,000 core‑hour performance gains.
Future work will continue to refine FlinkSQL, explore optimal state usage in joins, and advance native engine and batch‑stream fusion.
ByteDance Data Platform
The ByteDance Data Platform team empowers all ByteDance business lines by lowering data‑application barriers, aiming to build data‑driven intelligent enterprises, enable digital transformation across industries, and create greater social value. Internally it supports most ByteDance units; externally it delivers data‑intelligence products under the Volcano Engine brand to enterprise customers.