
Designing a Disaster‑Recovery Data Backup System for JD’s LBS C‑End SOA Service

This article explores the design and implementation of a disaster‑recovery data‑backup architecture for JD’s LBS C‑end SOA service, covering backup strategies, cost‑reduction techniques, grid‑based indexing with H3, client‑side caching, diff verification, and deployment considerations to balance reliability, performance, and expense.


In service‑oriented architecture (SOA) systems, disaster‑recovery (DR) capability is essential for stability; multi‑data‑center deployment, automated failover, and data backup can greatly improve resilience. JD's "秒送" ("Seconds Delivery") front end is a critical traffic entry point, and its DR design must handle both B2C and O2O scenarios, massive POI data, and tight cost constraints.

The article first outlines the problem background: O2O traffic grows rapidly, POI resources are divided into 3 km, 2 km, and 1 km fences, and data backup must distinguish strong‑real‑time and weak‑real‑time data while ensuring consistency and user experience.

Key challenges include the sheer scale of POI data (hundreds of millions of points), prohibitive storage costs (over 5 M CNY/month), and the difficulty of keeping backup data fully consistent with online data due to dynamic recommendation strategies.

To reduce storage, the authors propose a grid‑based approach using H3 hexagonal indexing. Grids at two H3 resolutions (7 and 8) are generated, covering the same area as the original GIS grids while reducing the number of cells from billions to hundreds of thousands, which cuts storage cost to a few thousand CNY per month.
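The idea of indexing a coordinate into a coarse and a fine cell can be sketched with the standard library alone. The snippet below uses square cells as an illustrative stand‑in for H3 hexagons (a production system would call the `h3` library instead); the function names and the geometric cell‑size rule are assumptions for this sketch, not the article's code.

```python
import math

def grid_key(lat: float, lng: float, res: int) -> str:
    """Quantize a coordinate into a square grid cell at the given
    resolution. Stand-in for H3 hexagonal indexing: higher `res`
    means finer cells, mirroring the article's resolutions 7 and 8."""
    # Cell edge shrinks geometrically with resolution, roughly like H3.
    edge_deg = 1.0 / (2 ** res)
    row = math.floor(lat / edge_deg)
    col = math.floor(lng / edge_deg)
    return f"{res}:{row}:{col}"

def backup_keys(lat: float, lng: float) -> list[str]:
    # One coarse and one fine key per point, matching the article's
    # use of two grid precisions (7 and 8).
    return [grid_key(lat, lng, 7), grid_key(lat, lng, 8)]
```

Because many POIs collapse into one cell key, the backup only needs one record per cell rather than one per point, which is where the storage reduction comes from.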

Hot‑spot POIs are selected per grid based on user request density, and only these representative points are cached. The data is serialized to strings and compressed with GZIP, yielding roughly 60 % space savings.
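A minimal sketch of both steps, using only the standard library: counting request density per grid to pick representative POIs, then serializing and GZIP‑compressing the payload. The `(grid_key, poi_id)` input shape and the `top_n` parameter are assumptions for illustration; the article does not specify how many hot spots are kept per grid.

```python
import gzip
import json
from collections import Counter

def select_hotspots(requests, top_n):
    """Pick the most-requested POIs per grid cell; only these
    representative points go into the backup cache.
    `requests` is an iterable of (grid_key, poi_id) pairs."""
    per_grid = {}
    for grid, poi in requests:
        per_grid.setdefault(grid, Counter())[poi] += 1
    return {g: [p for p, _ in c.most_common(top_n)]
            for g, c in per_grid.items()}

def pack(payload) -> bytes:
    """Serialize to a compact JSON string, then GZIP-compress
    before writing to the backup store."""
    return gzip.compress(json.dumps(payload, separators=(",", ":")).encode())

def unpack(blob: bytes):
    return json.loads(gzip.decompress(blob).decode())
```

The actual savings depend on how repetitive the POI payloads are; the ~60 % figure quoted in the article is for their production data.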

Backup data is stored in Redis; a switch determines whether the client’s localStorage cache (≈5 MB) or the server‑side backup should be used, based on cache freshness and availability. For the home and channel pages, full page data (~2.6 MB) can be cached client‑side, while the store‑detail page relies on server‑side backup due to larger product lists.
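The switch between the client‑side cache and the server‑side backup can be expressed as a small decision function. The freshness threshold and the `degraded` fallback label below are assumptions for this sketch; only the ~5 MB localStorage budget comes from the article.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Size budget from the article: client-side localStorage holds ~5 MB,
# enough for a full home/channel page snapshot (~2.6 MB).
LOCAL_STORAGE_LIMIT = 5 * 1024 * 1024

@dataclass
class CacheEntry:
    data: bytes
    fetched_at: float  # epoch seconds

def choose_source(local: Optional[CacheEntry],
                  server_backup_up: bool,
                  max_age_s: float = 3600.0) -> str:
    """Serve from the client cache only when it exists, fits the
    localStorage budget, and is fresh; otherwise fall back to the
    server-side (Redis) backup. Thresholds are illustrative."""
    now = time.time()
    if (local is not None
            and len(local.data) <= LOCAL_STORAGE_LIMIT
            and now - local.fetched_at <= max_age_s):
        return "local"
    return "server-backup" if server_backup_up else "degraded"
```

This mirrors the article's split: home and channel pages fit the client budget, while the store‑detail page, with its larger product lists, always takes the server‑backup path.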

Verification (DIFF) is performed by requesting a sample of POIs (seven per grid) and comparing backup responses with live online results, targeting over 90 % similarity. A gray (canary) rollout, load testing, and monitoring ensure a smooth transition.
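The DIFF check reduces to a set‑overlap comparison per grid. A minimal sketch, assuming POIs are compared by id and that similarity is measured as the fraction of live results also present in the backup (the article specifies only the >90 % target, not the exact metric):

```python
def diff_pass(backup_pois, online_pois, threshold=0.9):
    """Compare the backup response against the live response for the
    same grid; pass when the overlap meets the similarity threshold."""
    backup, online = set(backup_pois), set(online_pois)
    if not online:
        # No live results: the backup should be empty too.
        return not backup
    similarity = len(backup & online) / len(online)
    return similarity >= threshold
```

With the article's sample of seven POIs per grid, a single mismatch already drops the overlap to 6/7 ≈ 86 %, so the >90 % bar effectively requires an exact match per sampled grid.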

Finally, the article summarizes the overall DR solution, discusses additional measures such as service redundancy, automatic failover, load balancing, monitoring, BCP/DRP, and regular drills, and invites further discussion on extending the approach to fault‑tolerant mechanisms.

Tags: backend, cost optimization, disaster recovery, data backup, LBS, SOA, grid indexing
Written by

JD Tech

Official JD technology sharing platform. All the cutting‑edge JD tech, innovative insights, and open‑source solutions you’re looking for, all in one place.
