
Designing Cost‑Effective Disaster Recovery Data Backup for LBS‑Based SOA Services

This article details a comprehensive disaster‑recovery strategy for LBS‑driven SOA services, covering challenges of massive POI data backup, cost‑reduction via grid indexing (H3), selective caching, compression, diff validation, client‑side fallback, and deployment processes to achieve reliable, low‑cost data availability.

JD Tech Talk

Background

In service‑oriented architecture (SOA) systems, disaster‑recovery capability is crucial for stability. Multi‑data‑center deployment, automated failover, and data backup improve resilience, but LBS (location‑based service) scenarios introduce new difficulties due to massive POI data and fine‑grained resource zoning.

Problem Statement

With the growth of the “秒送” (instant-delivery) business, traffic for LBS‑based user transactions has surged. The system must distinguish strong‑real‑time from weak‑real‑time data, understand the data production chain, and ensure backup authenticity while maintaining user experience.

Pain Points / Challenges

How to cache POI latitude‑longitude data at national scale (hundreds of billions of points).

How to reduce the enormous storage cost of disaster‑recovery data (initially >5 million RMB/month).

How to guarantee the effectiveness of cached resources while handling consistency and load pressure.

Industry Research

Investigation of a peer’s architecture revealed no explicit data backup at the SOA layer; the peer appears to rely on lower‑level data redundancy, so the details of their backup solution remain unclear.

Solution Ideation

We identified critical entry points (home page, channel page, store detail page) that block transaction flow and require backup. For home and channel pages, POI‑driven recommendation results are complex; we therefore propose using a grid‑based approach to reduce data volume.

Grid Construction

Adopt H3 hexagonal grid indexing (precision 7 ≈ 1.4 km) to approximate GIS coverage. This reduces POI count from billions to hundreds of thousands, cutting storage cost by >99%.
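The idea can be sketched in a few lines. The sketch below uses a plain square grid of roughly the same size as an H3 resolution‑7 cell (the real system uses Uber’s hexagonal H3 index; in the h3-py library the equivalent call is `latlng_to_cell(lat, lng, 7)`), since the cost‑saving mechanism is the same: nearby POIs collapse into one cell key, so the cache stores one entry per occupied cell instead of one per coordinate.

```python
import math

# Stand-in for H3 indexing: snap a coordinate to a ~1.4 km square cell
# and use the cell key as the cache key. The production system uses
# Uber's H3 hexagonal index at resolution 7 instead.
CELL_DEG = 1.4 / 111.0  # ~1.4 km expressed in degrees of latitude

def grid_key(lat: float, lng: float) -> str:
    """Map a POI coordinate to a coarse grid-cell identifier."""
    row = math.floor(lat / CELL_DEG)
    col = math.floor(lng / CELL_DEG)
    return f"{row}:{col}"

# Two POIs a few hundred metres apart land in the same cell, so they
# share one backup entry.
a = grid_key(39.9042, 116.4074)
b = grid_key(39.9050, 116.4080)
assert a == b
```

This is why the POI count drops so sharply: the number of distinct cache keys is bounded by the number of occupied cells, not by the number of raw coordinates.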

Cost comparison: precision 7 costs ≈ 2,973 RMB/month, precision 8 ≈ 14,133 RMB/month, versus the original >5 million RMB/month.

Hotspot POI Selection

Select the POI with the highest user request density within each grid as the backup representative.
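A minimal sketch of the selection step, with illustrative field names (the real request-log schema is not described in the article): tally requests per POI within each grid cell and keep the single most-requested POI as that cell’s backup representative.

```python
from collections import Counter, defaultdict

# (grid_cell, poi_id) pairs as they might appear in a request log.
requests = [
    ("cell-A", "poi-1"), ("cell-A", "poi-1"), ("cell-A", "poi-2"),
    ("cell-B", "poi-7"), ("cell-B", "poi-9"), ("cell-B", "poi-9"),
]

# Count request density per POI within each cell.
per_cell = defaultdict(Counter)
for cell, poi in requests:
    per_cell[cell][poi] += 1

# Keep the hottest POI per cell as the backup representative.
hotspots = {cell: counts.most_common(1)[0][0]
            for cell, counts in per_cell.items()}
print(hotspots)  # {'cell-A': 'poi-1', 'cell-B': 'poi-9'}
```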

Backup Frequency

Maintain two cached data sets (daytime and nighttime), prioritizing the daytime set because nighttime traffic is lower.

Store Detail Page Strategy

Cache store classification data (a few hundred entries per store) and the first two pages of product listings, excluding closed stores, reducing cost to ≈ 900 RMB/month.

Compression

Apply GZIP compression to all stringified data, achieving ~60% space savings.
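In Python this is a one-liner with the standard-library gzip module; the sketch below compresses an illustrative JSON payload (the ~60% figure is the article’s measurement on real data, so the exact ratio here will differ):

```python
import gzip
import json

# Stringify a repetitive payload (stand-in for cached POI/store data)
# and gzip it before writing to the backup store.
payload = json.dumps([{"poi_id": i, "name": f"store-{i % 10}"} for i in range(1000)])
raw = payload.encode("utf-8")
packed = gzip.compress(raw)

print(len(raw), len(packed), f"{1 - len(packed) / len(raw):.0%} saved")

# Decompression is lossless, so the backup can be served as-is.
assert gzip.decompress(packed).decode("utf-8") == payload
```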

Cost Summary

                    Home            Channel         Store Detail
Pre‑compression     1087 GB         3624 GB         300 GB
Post‑compression    652 GB          2174 GB         180 GB
Final cost          1,956 RMB/mo    6,522 RMB/mo    540 RMB/mo

Total monthly cost drops from >5 million RMB to ≈ 9,018 RMB.
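The published figures are internally consistent with a flat per‑GB storage rate; the rate of 3 RMB/GB/month below is inferred from the numbers, not stated in the article:

```python
# Sanity check on the cost summary. The flat rate of 3 RMB/GB/month is
# inferred from the published figures, not stated in the article.
RATE = 3  # RMB per GB per month (inferred)
post_compression_gb = {"home": 652, "channel": 2174, "store_detail": 180}

costs = {page: gb * RATE for page, gb in post_compression_gb.items()}
assert costs == {"home": 1956, "channel": 6522, "store_detail": 540}
assert sum(costs.values()) == 9018  # matches the ~9,018 RMB/month total
```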

Diff Validation

Validate backup accuracy by comparing seven POIs (six vertices + center) per hexagon against online results across 39 cities; aim for ≥90% consistency.
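The validation loop can be sketched as follows. The query functions are hypothetical stubs standing in for the backup lookup and the live online service; the structure (seven probes per hexagon, aggregate agreement rate) follows the article:

```python
# Diff validation sketch: probe each hexagon at its centre and six
# vertices, compare the backup answer with the live service, and
# aggregate the agreement rate across all probes.

def sample_points(hexagon):
    """Centre plus six vertices: the seven probe coordinates."""
    return [hexagon["center"]] + hexagon["vertices"]

def consistency_rate(hexagons, query_backup, query_online):
    same = total = 0
    for h in hexagons:
        for pt in sample_points(h):
            total += 1
            if query_backup(pt) == query_online(pt):
                same += 1
    return same / total

# Toy data: one hexagon whose backup disagrees at a single vertex.
hexagon = {"center": (0, 0),
           "vertices": [(1, 0), (0.5, 0.9), (-0.5, 0.9),
                        (-1, 0), (-0.5, -0.9), (0.5, -0.9)]}
online = lambda pt: "poi-1"
backup = lambda pt: "poi-2" if pt == (1, 0) else "poi-1"

rate = consistency_rate([hexagon], backup, online)
print(f"{rate:.0%}")  # 86%: 6 of 7 probes agree, below the 90% bar
```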

Implementation Process

Divide the solution into five modules: client interaction, grid service, task orchestration, gray‑release, and disaster‑recovery switch.

Client caches data in localStorage (~5 MB) for quick fallback; server decides when to use client cache versus Redis backup based on cache freshness.

Grid module uses the JMF component to generate precision‑7 and precision‑8 grids and stores hotspot POI coordinates in MySQL for worker tasks.

Task module handles asynchronous data generation, retries, alerts, and monitoring.

Gray‑release includes machine‑level, PIN‑level, store‑level, and city‑level rollouts.

Switching logic: if client cache is fresh, use it; otherwise fetch from Redis backup; if both missing, trigger fallback.
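The switching logic above can be sketched as a small decision function. Names and the freshness window are illustrative assumptions, not the production values:

```python
import time

# Serving-time switch: prefer a fresh client-side cache, fall back to
# the Redis backup, and trigger the degradation path only when both
# are unavailable. FRESH_TTL is an assumed freshness window.
FRESH_TTL = 3600  # seconds

def choose_source(client_cache, redis_backup, now=None):
    now = time.time() if now is None else now
    if client_cache and now - client_cache["saved_at"] < FRESH_TTL:
        return ("client", client_cache["data"])
    if redis_backup is not None:
        return ("redis", redis_backup)
    return ("fallback", None)

assert choose_source({"saved_at": 100, "data": "page"}, "backup", now=200)[0] == "client"
assert choose_source({"saved_at": 100, "data": "page"}, "backup", now=999_999)[0] == "redis"
assert choose_source(None, None, now=0)[0] == "fallback"
```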

Results

Demonstrated successful home/channel page and store detail page rendering using the backup data, achieving the targeted cost reduction and reliability.

Reflection

Beyond data backup, future work will explore fault‑tolerance mechanisms such as automatic failover, multi‑active deployment, and comprehensive BCP/DRP practices.

Tags: cost optimization, disaster recovery, data backup, LBS, grid indexing
Written by JD Tech Talk, the official JD Tech public account delivering best practices and technology innovation.
