Baidu Geek Talk
Sep 29, 2022 · Databases
Design and Challenges of TafDB: A Scalable Metadata Storage Engine for Cloud Data Lakes
TafDB is Baidu’s Spanner‑like distributed transactional database, built on RocksDB and Multi‑Raft. It serves as a virtually unlimited metadata layer for cloud data lakes: it unifies hierarchical and flat namespaces, minimizes cross‑shard transaction overhead, handles garbage collection, and relies on a distributed clock. Together these support trillion‑scale metadata capacity and tens of millions of QPS at low latency.
Distributed Database · Namespace · TafDB