JD's Big Data Cross‑Domain and Hierarchical Storage Practices
This article details the cross-domain and hierarchical storage solutions of JD's big-data platform: the challenges of multi-datacenter data synchronization, the architecture of the storage layer, the asynchronous and synchronous data flows it implements, topology management, metadata tagging, and the performance techniques behind efficient, disaster-resilient data handling.
Managing massive data is a common challenge for many enterprises; to efficiently manage data and fully exploit its value, effective storage of big data is a prerequisite.
This article compiles JD's latest explorations and practical experiences in distributed and hierarchical storage of big data.
JD's data platform architecture consists of six parts, with data storage as the foundational component of the compute‑storage layer, supporting upstream compute engine scheduling and higher‑level tool, service, and application layers. In the overall architecture, the underlying storage acts as the infrastructure base of the big‑data platform.
1. Issues and Solutions for Cross‑Domain Storage
Before adopting cross-domain storage, inter-datacenter data synchronization relied on business-side DistCp jobs, which introduced several problems:
Metadata consistency depended on business guarantees, requiring costly and time‑consuming manual migration.
Uncontrolled inter‑datacenter traffic affected synchronization tasks, necessitating external scheduling and storage systems.
Redundant data copies across different datacenters increased sharing and synchronization costs.
Lack of multi‑datacenter cluster disaster‑recovery capabilities prevented full utilization of multi‑datacenter advantages.
To address these issues, JD designed a cross‑domain data synchronization function at the storage layer, ensuring data consistency, providing business‑transparent cross‑domain sync and sharing, reducing duplicate work, and enabling cross‑domain migration and disaster recovery.
The main idea of JD's cross-domain storage architecture is "full storage + full-network topology": fault domains span datacenters, ultimately providing off-site disaster recovery and cross-datacenter storage for critical big-data assets.
2. Cross‑Domain Data Flows
Two data flow methods are employed:
Asynchronous flow: Data is first written to the local datacenter, then automatically synchronized across domains by the NameNode (NN). Write performance matches non‑cross‑domain scenarios, and synchronization latency outperforms the Distcp solution.
Synchronous flow: A pipeline connects all DataNodes (DN) across datacenters, synchronizing data in a single operation. This method targets workloads with high consistency and reliability requirements.
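The contrast between the two flows can be sketched as follows. All class and function names here are illustrative, not JD's actual implementation: the asynchronous path acknowledges the write once local replicas land and leaves cross-domain copies to a background queue, while the synchronous path treats DataNodes in both datacenters as one pipeline.

```python
from dataclasses import dataclass, field

@dataclass
class DataNode:
    name: str
    datacenter: str
    blocks: set = field(default_factory=set)

def write_async(block_id, local_dns, remote_dns, sync_queue):
    """Asynchronous flow: write only to the local datacenter's pipeline,
    then enqueue the block for background cross-domain sync."""
    for dn in local_dns:
        dn.blocks.add(block_id)
    sync_queue.append((block_id, remote_dns))  # replicated later, off the write path

def write_sync(block_id, local_dns, remote_dns):
    """Synchronous flow: a single pipeline spans DataNodes in both
    datacenters, so the write returns only after all replicas exist."""
    for dn in local_dns + remote_dns:
        dn.blocks.add(block_id)

def drain_sync_queue(sync_queue):
    """Background sync: copy queued blocks to their remote targets."""
    while sync_queue:
        block_id, remote_dns = sync_queue.pop(0)
        for dn in remote_dns:
            dn.blocks.add(block_id)
```

The trade-off mirrors the text: the asynchronous path keeps write latency at single-datacenter levels, while the synchronous path pays inter-datacenter round trips in exchange for immediate cross-domain durability.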
3. Topology Management and Datacenter Awareness
Topology management adds a datacenter dimension to node topology, and block selection logic adapts to the global topology to support multi‑datacenter environments.
For client‑side datacenter awareness, RPC headers can carry datacenter information for cross‑domain‑aware clients, while non‑aware clients use IP‑to‑datacenter mapping provided by JD's network service team.
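A minimal sketch of that lookup order, assuming a hypothetical `x-datacenter` RPC header and an illustrative CIDR-to-datacenter map (JD's actual mapping service is not public):

```python
import ipaddress

# Hypothetical mapping; in practice this would come from the network service team.
IP_TO_DATACENTER = {
    "10.1.0.0/16": "dc-a",
    "10.2.0.0/16": "dc-b",
}

def resolve_datacenter(rpc_headers, client_ip):
    """Prefer the datacenter declared by a cross-domain-aware client in
    its RPC header; otherwise fall back to the IP-to-datacenter map."""
    dc = rpc_headers.get("x-datacenter")
    if dc:
        return dc
    addr = ipaddress.ip_address(client_ip)
    for cidr, mapped_dc in IP_TO_DATACENTER.items():
        if addr in ipaddress.ip_network(cidr):
            return mapped_dc
    return None  # unknown clients get no locality preference
```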
4. Cross‑Domain Identifier
The identifier module uses attribute tags supporting both replication and Erasure Coding (EC) to describe a block's cross-domain properties. EC stripes contain data and parity blocks; compared with replica mode, cross-domain sync of EC data is more complex: to reduce sync traffic, a missing block is reconstructed within the datacenter, and only when reconstruction conditions are not met is the block copied across domains.
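That repair decision reduces to a threshold check: an EC stripe of k data units can be rebuilt from any k surviving units, so local reconstruction is possible only when the datacenter still holds that many. A sketch with illustrative names:

```python
def plan_ec_repair(datacenter, live_blocks_by_dc, data_units):
    """Choose how to restore a missing EC block in `datacenter`.
    Reconstruction needs at least `data_units` surviving blocks of the
    stripe inside the datacenter; otherwise fall back to copying the
    block from a remote datacenter (more cross-domain traffic)."""
    local_survivors = live_blocks_by_dc.get(datacenter, 0)
    if local_survivors >= data_units:
        return "reconstruct-locally"   # only intra-DC traffic
    return "copy-cross-domain"         # one block over the inter-DC link
```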
5. Accelerating Cross‑Domain Processing
Persist metadata in XATTR.
Build Inode Proto in memory.
Create block attribute tags on each data block.
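A toy model of the first two points, with hypothetical names: the serialized cross-domain attribute lives in an XATTR that survives restarts, while a decoded in-memory copy serves the hot path.

```python
class Inode:
    """Minimal sketch: cross-domain attributes persisted as an extended
    attribute (XATTR) on the inode, with a decoded in-memory form so the
    hot path never re-parses the serialized bytes."""
    XATTR_KEY = "user.cross_domain"  # hypothetical key name

    def __init__(self):
        self.xattrs = {}           # persisted with the inode (survives restarts)
        self._cross_domain = None  # in-memory decoded form

    def set_cross_domain(self, target_dcs):
        self.xattrs[self.XATTR_KEY] = ",".join(target_dcs)  # persisted
        self._cross_domain = tuple(target_dcs)              # cached

    def cross_domain_targets(self):
        if self._cross_domain is None and self.XATTR_KEY in self.xattrs:
            # e.g. after a NameNode restart: rebuild the cache from the XATTR
            self._cross_domain = tuple(self.xattrs[self.XATTR_KEY].split(","))
        return self._cross_domain or ()
```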
6. Cross‑Domain Block Repair and Flow Control
Isolate cross‑domain processing from existing workflows, ensuring that new cross‑domain handling does not affect same‑datacenter block repair (re‑replication), preserving single‑datacenter metadata service availability during network interruptions.
Introduce an asynchronous cross‑domain updater combined with cross‑domain tags to enable HA switch‑over and continued block repair, solving legacy‑data issues.
Replace the original DistCopy task with a CR‑Checker program, smoothly upgrading cross‑cluster sync tasks to cross‑domain sync tasks while minimizing impact on existing workloads.
2. Issues and Solutions for Hierarchical Storage
JD's hierarchical storage addresses problems of undifferentiated hot/cold data, heterogeneous hardware handling, and difficulty in advancing data governance.
The hierarchical storage architecture is implemented within the NameNode (NN) and includes:
Layered strategy configuration with external API delivery and internal settings.
Layered configuration API for offline analysis and business‑side policy submission.
Built‑in layering strategies, defaulting to LRU based on access monitoring.
Tag manager for directory and node tags, guiding block selection and distribution validation.
Data distribution validator to verify new data placement according to tags.
Bulk‑data satisfier to scan and validate existing data, guiding migration and enabling data lifecycle management.
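As a rough illustration of the default LRU-based strategy, sketched from access-monitoring data (the window length and tier names below are assumptions, not JD's actual configuration):

```python
import time

def classify_by_lru(last_access, now=None, hot_window=7 * 86400):
    """LRU-style layering sketch: paths accessed within `hot_window`
    seconds stay on the hot tier, the rest go cold."""
    now = time.time() if now is None else now
    return {
        path: ("HOT" if now - ts <= hot_window else "COLD")
        for path, ts in last_access.items()
    }
```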
Core Design of Hierarchical Storage
Two modules form the core design: metadata tags on the directory tree for hot/cold allocation, and a virtual multi‑topology tree that logically separates nodes by tag type, providing independent topologies for efficient node selection. The virtual topology updates asynchronously based on node weight or synchronously during node up/down events.
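A minimal sketch of such a tag-partitioned topology, with illustrative names: node up/down events update the per-tag views synchronously, and block placement samples only from the view matching the data's tag.

```python
import random

class VirtualTopologies:
    """Sketch of a tag-partitioned topology: each storage tag (e.g. HOT
    on fast media, COLD on dense media) gets its own node view, so block
    placement searches only the matching sub-topology."""
    def __init__(self):
        self.by_tag = {}

    def node_up(self, node, tag):      # synchronous update on up/down events
        self.by_tag.setdefault(tag, set()).add(node)

    def node_down(self, node, tag):
        self.by_tag.get(tag, set()).discard(node)

    def choose_nodes(self, tag, count, rng=random):
        candidates = list(self.by_tag.get(tag, ()))
        return rng.sample(candidates, min(count, len(candidates)))
```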
Processing differs for incremental and bulk data:
Incremental data: write requests evaluate tags, match appropriate nodes, and write data.
Bulk data: background distribution validation scans tags, matches nodes via the virtual topology, and performs migration or conversion.
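The bulk-data validation step can be sketched as a scan that compares each file's tag against where its replicas actually sit; mismatches become migration tasks. Names and data shapes below are illustrative:

```python
def validate_distribution(files, nodes_by_tag):
    """Bulk-data satisfier sketch: for each file, check whether its
    replicas already sit on nodes belonging to the file's tag;
    misplaced replicas are emitted as migration tasks."""
    migrations = []
    for path, (tag, replica_nodes) in files.items():
        allowed = nodes_by_tag.get(tag, set())
        misplaced = [n for n in replica_nodes if n not in allowed]
        if misplaced:
            migrations.append((path, tag, misplaced))
    return migrations
```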
The above content is excerpted from Wu Weiwei's lecture "JD Big Data Cross‑Domain and Hierarchical Storage Practices" and is included in DataFun's "Big Data Technology Application Cases Manual" (202207 issue).