GooseFS: Accelerating Cloud Storage for Big Data and Data Lake Platforms
GooseFS, Tencent Cloud's Hadoop-compatible storage accelerator, adds a local NVMe SSD cache layer to cloud-native data lakes and requires no application code changes. In a music-industry customer's deployment of more than 200 nodes caching nearly ten million files, it improved peak query performance by 46% and cut backend bandwidth by 200 Gbps.
This article introduces GooseFS, a storage acceleration tool developed by Tencent Cloud's object storage team for next-generation cloud-native data lake scenarios. GooseFS implements the Hadoop-compatible FileSystem interface and targets two problems in cloud-based big data and data lake platforms that separate storage from compute: performance bottlenecks and network bandwidth costs.
The article focuses on how a major music-industry customer used GooseFS to improve the efficiency of its big data platform and significantly reduce costs. The customer's BI data warehouse platform, built on COS/CHDFS, saw data access bandwidth grow rapidly (reaching 700 Gbps), yet it still needed more read bandwidth and lower computing resource costs.
GooseFS was deployed as a local acceleration cache layer on the customer's idle NVMe SSD resources (approximately 500 TB). The solution improved peak query performance by 46% and reduced backend bandwidth by 200 Gbps. The article details GooseFS's core architecture, including multi-level storage media (RAM, SSD, HDD), metadata management using RocksDB, high availability through ZooKeeper and Raft, and Hive table/partition management capabilities.
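The multi-level storage idea (RAM, SSD, HDD) can be illustrated with a toy model. The sketch below is not GooseFS code: the class name `TieredCache`, the tier names, the LRU eviction policy, and the promote-on-read behavior are all assumptions chosen for illustration. It only shows the general pattern in which a block evicted from a faster tier is demoted to the next slower one, and falls back to the backing store once it leaves the slowest tier.

```python
from collections import OrderedDict


class TieredCache:
    """Toy multi-tier block cache (hypothetical, not the GooseFS implementation).

    Tiers are ordered fastest first. Each tier evicts its least recently
    used block and demotes it to the next tier when it is over capacity.
    """

    def __init__(self, tiers):
        # tiers: list of (name, capacity_in_blocks), fastest first
        self.names = [name for name, _ in tiers]
        self.capacity = dict(tiers)
        self.blocks = {name: OrderedDict() for name in self.names}

    def _insert(self, level, block_id, data):
        if level >= len(self.names):
            return  # dropped from the slowest tier; only the backing store has it now
        name = self.names[level]
        tier = self.blocks[name]
        tier[block_id] = data
        if len(tier) > self.capacity[name]:
            victim_id, victim_data = tier.popitem(last=False)  # LRU victim
            self._insert(level + 1, victim_id, victim_data)    # demote one tier down

    def tier_of(self, block_id):
        """Return the name of the tier holding the block, or None if not cached."""
        for name in self.names:
            if block_id in self.blocks[name]:
                return name
        return None

    def put(self, block_id, data):
        """Cache a block in the fastest tier, replacing any older copy."""
        old = self.tier_of(block_id)
        if old is not None:
            del self.blocks[old][block_id]
        self._insert(0, block_id, data)

    def get(self, block_id):
        """Return cached data and promote the block to the fastest tier (assumed policy)."""
        name = self.tier_of(block_id)
        if name is None:
            return None  # cache miss: a real system would read from object storage here
        data = self.blocks[name].pop(block_id)
        self._insert(0, block_id, data)
        return data


cache = TieredCache([("MEM", 1), ("SSD", 1), ("HDD", 1)])
for block in ("a", "b", "c"):
    cache.put(block, block.upper())
print({b: cache.tier_of(b) for b in ("a", "b", "c")})
```

With one block per tier, the three writes leave the newest block in `MEM` and the oldest in `HDD`. A real deployment would size tiers in bytes and may use a different eviction or promotion policy.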
Key features discussed include transparent acceleration allowing users to access data without code changes, distributed load pre-warming to prevent bandwidth spikes, and asyncCache optimization for indexed data files. The customer successfully deployed over 200 GooseFS nodes, caching nearly 10 million files, demonstrating significant performance and cost benefits for their big data platform.
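This summary does not describe how distributed load pre-warming avoids bandwidth spikes, so the following is only one plausible approach, sketched under stated assumptions: spread the files to be pre-warmed evenly across cache workers, and divide a fixed backend bandwidth budget among them. The function names `plan_prewarm` and `per_worker_rate_cap`, the worker names, the file paths, and the sizes are all hypothetical.

```python
import heapq


def plan_prewarm(file_sizes, workers):
    """Assign files to workers, largest first, always picking the least-loaded worker.

    file_sizes: dict mapping file path -> size (any consistent unit)
    workers:    list of worker names
    Returns a dict mapping worker name -> list of paths to load.
    """
    heap = [(0, w) for w in workers]  # (bytes assigned so far, worker)
    heapq.heapify(heap)
    plan = {w: [] for w in workers}
    # Sort by descending size, then by path so the result is deterministic.
    for path, size in sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, worker = heapq.heappop(heap)
        plan[worker].append(path)
        heapq.heappush(heap, (load + size, worker))
    return plan


def per_worker_rate_cap(total_budget_gbps, workers):
    """Split a total backend bandwidth budget evenly across workers."""
    return total_budget_gbps / len(workers)


# Hypothetical Hive-style partition files with sizes in GB.
files = {
    "/warehouse/t/dt=01/a": 40,
    "/warehouse/t/dt=01/b": 30,
    "/warehouse/t/dt=02/c": 20,
    "/warehouse/t/dt=02/d": 10,
}
workers = ["worker-1", "worker-2"]
print(plan_prewarm(files, workers))
print(per_worker_rate_cap(20, workers))
```

Balancing by size keeps any single worker from becoming the long tail of the load job, while the per-worker cap bounds the aggregate read rate against object storage regardless of how many files are queued.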
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.