Bilibili Tech
Apr 25, 2022 · Big Data
Optimizing Full Partition Tables with Zipper Tables, Hudi+Flink CDC, and Data Warehouse Strategies
Facing server-hardware constraints, Bilibili's data platform replaced storage-heavy full-partition tables with zipper tables, which preserve change history while cutting storage from petabytes to terabytes. It complemented them with Hudi + Flink CDC for near-real-time updates, lowering I/O, compute usage, and latency.
Flink CDC · Hudi · Partition Table
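To make the zipper-table idea in the summary concrete, here is a minimal Python sketch. It is not Bilibili's implementation: the `ZipperRow` fields, the `9999-12-31` open-ended sentinel, and the helpers `apply_changes` and `snapshot` are all assumptions chosen for illustration. Instead of re-storing every record each day, the table keeps one row per version of a record with a validity range, so any day's full view can still be reconstructed.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta

# Assumed sentinel marking the currently valid version of a record.
OPEN_END = date(9999, 12, 31)


@dataclass(frozen=True)
class ZipperRow:
    key: str
    value: str
    start_date: date
    end_date: date = OPEN_END


def apply_changes(table, changes, day):
    """Merge one day's changed records ({key: new_value}) into the table."""
    open_values = {r.key: r.value for r in table if r.end_date == OPEN_END}
    merged = []
    for row in table:
        changed = (
            row.end_date == OPEN_END
            and row.key in changes
            and changes[row.key] != row.value
        )
        if changed:
            # Close the old version on the day before the change takes effect.
            merged.append(replace(row, end_date=day - timedelta(days=1)))
        else:
            merged.append(row)
    for key, value in changes.items():
        if open_values.get(key) != value:
            # Open a new version for new keys and for changed values.
            merged.append(ZipperRow(key, value, day))
    return merged


def snapshot(table, day):
    """Reconstruct the full view of the data as of a given day."""
    return {r.key: r.value for r in table if r.start_date <= day <= r.end_date}


# Hypothetical data: two users on day 1, one of them changes on day 3.
table = apply_changes([], {"u1": "bronze", "u2": "silver"}, date(2022, 4, 1))
table = apply_changes(table, {"u1": "gold"}, date(2022, 4, 3))
```

In this sketch, three days of history take three rows rather than the six a daily full snapshot would hold, and `snapshot(table, date(2022, 4, 2))` still recovers the earlier state. The saving grows with the share of records that do not change from day to day.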