Remote Aware Load Balance (RALB) Algorithm for Search Recommendation System: Design, Implementation, and Performance Evaluation

This article presents the design and evaluation of the Remote Aware Load Balance (RALB) algorithm applied to JD’s search‑recommendation architecture, describing its CPU‑centric load‑balancing principles, implementation details, functional verification, throughput and boundary testing, and the observed improvements in CPU utilization and overall system performance.

JD Retail Technology

Jun 21, 2023

Background: JD's search recommendation services apply CPU-adaptive throttling on the server side, but clients distribute calls by round-robin (RR), which ignores differences in server performance. The result is uneven CPU usage across servers.

RALB Overview: RALB (Remote Aware Load Balance) targets CPU balance by adjusting traffic weights based on real‑time server CPU usage reported via RPC.

Algorithm Goals: Equalize CPU usage across servers, and exploit the linear relationship between QPS and CPU usage to control each server's load through the traffic it receives.
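The linear QPS-to-CPU relation can be illustrated with a small sketch. The per-query costs, server names, and total QPS below are hypothetical values chosen for illustration, not figures from the article. The sketch compares an equal (round-robin) split with a split proportional to the inverse of each server's per-query cost.

```python
def cpu_usage(qps: float, cost_per_query: float) -> float:
    """CPU percentage under the assumed linear model: cpu = qps * cost_per_query."""
    return qps * cost_per_query


# Hypothetical per-query CPU cost (percent of CPU per unit of QPS).
costs = {"fast": 0.05, "slow": 0.10}
total_qps = 900.0

# Round-robin: every server receives the same QPS regardless of its cost.
rr_qps = {name: total_qps / len(costs) for name in costs}
rr_cpu = {name: cpu_usage(rr_qps[name], costs[name]) for name in costs}

# Balanced: traffic share proportional to 1 / cost, which equalizes CPU
# when the linear model holds.
inverse_cost = {name: 1.0 / cost for name, cost in costs.items()}
inverse_total = sum(inverse_cost.values())
balanced_qps = {
    name: total_qps * inverse_cost[name] / inverse_total for name in costs
}
balanced_cpu = {
    name: cpu_usage(balanced_qps[name], costs[name]) for name in costs
}

print("round-robin CPU:", rr_cpu)
print("balanced CPU:   ", balanced_cpu)
```

Under this model the slow server runs hotter with an equal split, while the inverse-cost split gives both servers the same CPU usage. RALB does not need to know the per-query cost in advance; it observes CPU directly and adjusts weights from that.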

Algorithm Steps: 1) Distribute traffic by weighted random (wr). 2) Collect per‑server CPU every second via RPC. 3) Every 3 s recompute weights to balance CPU.
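Step 1 can be sketched with the standard library's weighted sampling. This is a minimal illustration rather than JD's implementation: the function name `pick_server`, the IP addresses, and the weights are assumptions made for the example.

```python
import random
from collections import Counter


def pick_server(weights: dict[str, int], rng: random.Random) -> str:
    """Pick one server with probability proportional to its weight."""
    servers = list(weights)
    return rng.choices(servers, weights=[weights[s] for s in servers], k=1)[0]


# Hypothetical weight table: the third server is weighted out entirely.
weights = {"10.0.0.1": 10000, "10.0.0.2": 5000, "10.0.0.3": 0}

rng = random.Random(42)  # seeded so the sketch is repeatable
counts = Counter(pick_server(weights, rng) for _ in range(30000))
print(counts)
```

A server with twice the weight receives roughly twice the requests, and a server with weight 0 receives none, so changing the weight table every few seconds is enough to shift traffic.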

Metric Dependencies: The algorithm takes as inputs the server IP list, real-time health, historical health, a dynamic CPU target, and the current weight of each server.

Weight Adjustment: Each server's weight is initialized to 10000 and updated periodically according to how its CPU usage compares with the cluster average. A scaling factor (default 0.5) damps each adjustment, and the change per update is capped to avoid extreme shifts.
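One possible shape for such an update is sketched below. The initial weight (10000) and the scaling factor (0.5) come from the text above; the update formula itself, the 20% cap `MAX_CHANGE_RATIO`, and the function name `recompute_weights` are assumptions for illustration, since the summary does not give the exact rule.

```python
INITIAL_WEIGHT = 10000   # initial weight stated in the article
SCALING_FACTOR = 0.5     # default scaling factor stated in the article
MAX_CHANGE_RATIO = 0.2   # hypothetical cap on the relative change per update


def recompute_weights(
    weights: dict[str, int],
    cpu: dict[str, float],
    factor: float = SCALING_FACTOR,
    max_change: float = MAX_CHANGE_RATIO,
) -> dict[str, int]:
    """Shift weight toward servers below the cluster-average CPU (assumed rule)."""
    avg_cpu = sum(cpu.values()) / len(cpu)
    if avg_cpu <= 0:
        return dict(weights)  # nothing to balance against
    new_weights = {}
    for server, weight in weights.items():
        # Positive when the server is cooler than average, negative when hotter.
        deviation = (avg_cpu - cpu[server]) / avg_cpu
        # Damp the correction, then cap it to avoid extreme shifts.
        change = max(-max_change, min(max_change, factor * deviation))
        new_weights[server] = max(1, round(weight * (1 + change)))
    return new_weights


weights = {"a": INITIAL_WEIGHT, "b": INITIAL_WEIGHT, "c": INITIAL_WEIGHT}
cpu = {"a": 30.0, "b": 60.0, "c": 90.0}  # hypothetical CPU percentages
updated = recompute_weights(weights, cpu)
print(updated)
```

In this sketch the cool server gains weight, the hot server loses weight, and a server at the average is left unchanged. Repeating the update on each cycle moves the cluster toward equal CPU usage in bounded steps.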

Boundary Handling: RALB handles two edge cases: a server that receives no traffic (and therefore reports a CPU usage of 0), and a network failure (the server's weight is set to 0 until it recovers).
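These two cases can be sketched as a guard applied to each server's weight. Only the "weight 0 while unreachable" rule comes from the text above. The recovery to the initial weight, the probing floor `PROBE_WEIGHT`, and the function name `apply_boundary_rules` are assumptions, since the summary does not say how RALB treats a recovered or idle server.

```python
INITIAL_WEIGHT = 10000  # initial weight stated in the article
PROBE_WEIGHT = 100      # hypothetical floor so an idle server still gets some traffic


def apply_boundary_rules(weight: int, cpu: float, reachable: bool) -> int:
    """Return the weight to use after applying the edge-case rules (assumed)."""
    if not reachable:
        # Network failure: stop sending traffic until the server recovers.
        return 0
    if weight == 0:
        # Assumption: a recovered server restarts from the initial weight.
        return INITIAL_WEIGHT
    if cpu == 0:
        # Assumption: CPU 0 means "no traffic yet", not "infinitely fast",
        # so keep a small floor instead of trusting the reading.
        return max(weight, PROBE_WEIGHT)
    return weight


print(apply_boundary_rules(8000, 40.0, reachable=False))
print(apply_boundary_rules(0, 0.0, reachable=True))
print(apply_boundary_rules(50, 0.0, reachable=True))
```

The point of the guard is that a CPU reading of 0 carries no information about server capacity, so it should not be fed into the weight update as if it were a real measurement.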

Functional Verification: RALB was deployed in the search-recommendation cluster. After rollout, per-server QPS separated into distinct tiers, with servers receiving different traffic shares, while CPU usage converged across servers.

Throughput Tests: RALB was compared with RR under three conditions: no throttling, partial throttling, and full throttling. RALB kept CPU usage balanced and achieved up to 7% higher throughput at the critical point where throttling begins.

Test Data: The original article tabulates QPS, CPU usage, and TP99 latency for both algorithms, and plots the throughput curves with the following Python script:

import matplotlib.pyplot as plt

# Data points for the first curve (drawn in red).
x = [0,1,2,3,4,5,6,7,8,9,9.73,10.958,11.52,17.15,22.7]
y = [0,1,2,3,4,5,6,7,8,9,9.73,10.61,10.49,10.10,9.82]

# Data points for the second curve (drawn in green).
w = [0,1,2,3,4,5,6,7,8,9.674,10.823,11.496,11.723,12.639,13.141,17.15,22.7]
z = [0,1,2,3,4,5,6,7,8,9.27,9.91,10.24,10.36,10.48,10.47,10.10,9.82]

# Draw both curves on the same axes as solid lines with circle markers.
plt.plot(x, y, 'r-o')
plt.plot(w, z, 'g-o')
plt.show()

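Conclusions: RALB effectively eliminates the CPU "short-board" (weakest-link) effect, in which the most heavily loaded server limits the whole cluster. It provides stable latency and improves overall cluster throughput, especially around the transition from unthrottled to fully throttled operation.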

Deployment: After full rollout, server‑side QPS and CPU distributions became more uniform, confirming the algorithm’s production readiness.
