Optimizing Query Performance for JD's BIP Procurement System with JED, JimKV, and Elasticsearch
This article presents a comprehensive case study of how JD's procurement system (BIP) tackled massive data volume and complex query challenges by redesigning data models, introducing heterogeneous storage for inbound orders, leveraging JED and JimKV, and offloading complex searches to Elasticsearch, resulting in dramatically reduced database load and improved user experience.
Background: After years of iteration, JD's BIP procurement system faced severe database query performance issues due to complex queries, massive data growth, and a data model spread across more than 20 tables, leading to slow SQL, high load, and operational risks.
Challenges: Business users suffered poor order list and export experiences, especially with data skew; technical teams dealt with numerous slow SQLs, large tables causing DDL difficulties, and a fragmented data model that increased development and maintenance costs.
Goals: Improve core query and export performance, streamline unreasonable query conditions, reduce slow SQLs, lower database load, decrease table data size, and simplify the data model.
Solution Overview: The team introduced a heterogeneous storage strategy by separating "inbound" orders, marking them during order creation, and synchronizing them via binlog and JMQ to a dedicated JED table and JimKV cache. Non‑inbound orders were routed to Elasticsearch for complex queries, with monthly index partitioning to handle time‑range queries efficiently. Query routing logic was added to dynamically dispatch requests to the appropriate storage based on user, latency, and data scope.
Key Implementations: Data reduction: Identified that only 8% of orders required full fulfillment processing and isolated them as "inbound" orders, reducing overall real‑time order volume by 92%. Heterogeneous storage: Used JED for inbound orders and Elasticsearch for non‑inbound orders, with a unified data parsing module to ensure consistency. Query scheduling: Implemented a dynamic routing layer that considers user, latency, and data range to select the optimal cluster, and integrated a T‑1 data skew analysis to inform routing decisions. High availability: Deployed primary‑secondary Elasticsearch clusters with DUCC switches for seamless failover and added monitoring for order latency and MQ backlog.
Results: The system achieved stable switching with no online issues, reduced real‑time order volume by 92%, lowered database load, and improved query latency. The Elasticsearch cluster now supports multiple external query scenarios, and the overall architecture provides a clear path for further optimization.
Future Outlook: Plans include adding day‑level monitoring alerts, further simplifying the data model, decommissioning non‑core storage clusters, exploring additional MySQL load‑reduction techniques, and potentially replacing the inbound JED tables entirely with Elasticsearch as confidence grows.
JD Retail Technology
Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.