
Recommendation Engine Upgrade Path, Architecture, and Performance Optimization for the "Guangguang" Content Community

The article details Guangguang’s shift from a rule‑based, Hive‑driven recommendation pipeline to an algorithmic service that leverages Elasticsearch and Redis for multi‑source recall, coarse and fine model ranking, exposure filtering, cold‑start handling, latency optimizations, reliability monitoring, and future vector‑based enhancements.

HelloTech

Introduction – "Guangguang" is a content community within the Hello (哈啰) app that provides lifestyle guides. This article uses Guangguang as a case study to trace the evolution of its recommendation system.

What is a Recommendation Engine? – A recommendation engine is an information‑filtering system that operates without explicit user intent. Unlike search, it discovers items a user may be interested in and ranks them for a specific scenario.

From Rule‑Based to Algorithmic Recommendation – The original system relied on Dataman‑driven Hive jobs that exported post, user‑behavior, and user data to Hive tables, then merged those tables into MySQL/PostgreSQL. This rule‑based approach produced a static, "one‑size‑fits‑all" list.

The new system introduces an algorithmic service that receives user requests, pulls data from Elasticsearch (ES) and Redis, and applies a ranking model deployed on a decision‑flow platform. ES handles complex queries; Redis offers fast look‑ups. Offline jobs compute item quality scores, tags, and train the ranking model, which is periodically refreshed.

Four‑Step Recommendation Process – Recall: multi‑source candidate retrieval from ES and Redis. Coarse Ranking (粗排): rule‑based, high‑throughput filtering. Fine Ranking (精排): model‑based ranking, higher quality but slower. Re‑ranking: business rules (sliding window, weight distribution) that diversify results and avoid user fatigue.
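The four steps above can be sketched as a minimal pipeline. The item fields, scoring rules, and window size here are illustrative stand-ins, not Guangguang's production logic:

```python
def recall(sources):
    """Merge candidates from multiple recall sources, de-duplicated by id."""
    seen, merged = set(), []
    for source in sources:
        for item in source:
            if item["id"] not in seen:
                seen.add(item["id"])
                merged.append(item)
    return merged

def coarse_rank(items, limit=100):
    """Cheap rule-based filter: keep only the highest quality-score items."""
    return sorted(items, key=lambda i: i["quality"], reverse=True)[:limit]

def fine_rank(items, model_score):
    """Model-based ranking; model_score stands in for the deployed model."""
    return sorted(items, key=model_score, reverse=True)

def rerank(items, window=3):
    """Sliding-window diversification: avoid repeating a category within
    any run of `window` consecutive results."""
    result, pool = [], list(items)
    while pool:
        recent = {i["category"] for i in result[-(window - 1):]}
        pick = next((i for i in pool if i["category"] not in recent), pool[0])
        pool.remove(pick)
        result.append(pick)
    return result
```

The sliding window trades a little ranking score for diversity: a top-ranked item is deferred only when its category already dominates the last few slots.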

Exposure Filtering – To prevent duplicate recommendations, user‑viewed post IDs are stored in Redis. Two keys are kept: a real‑exposure list and an interface‑exposure list (with rolling expiration). The union of both lists is used to filter recalled items.

Cold‑Start Issues – User cold‑start is mitigated because most users already have profiles from other Hello services. Item cold‑start is addressed by new‑item recall strategies, a "flow‑pool" (流量池) driven by bandit algorithms that grants exposure to fresh items, and default feature values for unseen items.
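As one concrete bandit variant for the flow pool, an epsilon-greedy policy balances exploring fresh items against exploiting items with proven click-through rates. The article does not specify which bandit algorithm Guangguang uses, so this is a hypothetical sketch:

```python
import random

class FlowPoolBandit:
    """Epsilon-greedy bandit over a pool of new items: with probability
    epsilon serve a random item (explore), otherwise serve the item
    with the best observed CTR (exploit)."""

    def __init__(self, item_ids, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.clicks = {i: 0 for i in item_ids}
        self.views = {i: 0 for i in item_ids}

    def ctr(self, item_id):
        views = self.views[item_id]
        return self.clicks[item_id] / views if views else 0.0

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.clicks))  # explore
        return max(self.clicks, key=self.ctr)          # exploit

    def record(self, item_id, clicked):
        """Feed back an impression and whether it was clicked."""
        self.views[item_id] += 1
        self.clicks[item_id] += int(clicked)
```

Items that earn clicks during their flow-pool window accumulate CTR evidence and graduate into normal recall; items that never perform are quietly starved of exposure.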

Performance Optimization – The initial multi‑threaded recall dispatched a separate thread per source, producing a "bucket effect" (overall latency bounded by the slowest source) and high latency under load. The optimized design uses Elasticsearch multi‑search (_msearch) to combine LBS, tag, and follower recall into a single request, and a Redis pipeline to batch calls. This reduced average request latency to under 400 ms.
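The multi-search optimization amounts to folding the three recall queries into one NDJSON request body (a header line plus a query line per search), which a client then sends to `_msearch` in a single round trip. The index name, field names, and page size below are hypothetical:

```python
import json

def build_msearch_body(queries, index="posts"):
    """Serialize several recall queries into one _msearch NDJSON body:
    for each query, a header line naming the index, then the query."""
    lines = []
    for query in queries:
        lines.append(json.dumps({"index": index}))
        lines.append(json.dumps({"query": query, "size": 50}))
    return "\n".join(lines) + "\n"

# Three hypothetical recall sources, combined into one request body.
lbs_recall = {"geo_distance": {"distance": "5km",
                               "location": {"lat": 31.2, "lon": 121.5}}}
tag_recall = {"terms": {"tags": ["cycling", "food"]}}
follower_recall = {"terms": {"author_id": [1001, 1002]}}

body = build_msearch_body([lbs_recall, tag_recall, follower_recall])
```

The Redis side of the optimization is analogous: instead of one round trip per key, commands are queued on a pipeline and flushed together, so network latency is paid once per batch rather than once per call.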

Reliability & Monitoring – Multiple fallback mechanisms are in place for recall, ranking, and external service failures. Alerts are generated via Argus when recall counts drop or SOA errors exceed thresholds. Grafana dashboards visualize per‑step latency and overall PV‑CTR/UV‑CTR improvements.

Future Plans – Introduce vector‑based recall (ANN search) to diversify retrieval, add new storage for vector indexes, and evolve the recommendation service into a platform serving multiple business lines, reducing code duplication.
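Vector recall ranks items by similarity between a user embedding and item embeddings. A brute-force cosine scan, shown below as a self-contained stand-in, gives the exact semantics; a production system would replace the exhaustive loop with an ANN index (e.g. HNSW) to keep latency flat as the corpus grows:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a))
            * math.sqrt(sum(y * y for y in b)))
    return dot / norm if norm else 0.0

def vector_recall(user_vec, item_vecs, k=10):
    """Return the ids of the k items most similar to the user vector.
    Brute force here; an ANN index would approximate the same top-k."""
    scored = sorted(item_vecs.items(),
                    key=lambda kv: cosine(user_vec, kv[1]),
                    reverse=True)
    return [item_id for item_id, _ in scored[:k]]
```

This also shows why the plan calls for new storage: embeddings and their ANN indexes have access patterns that neither ES inverted indexes nor Redis key-value lookups serve well.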

Tags: performance optimization, real-time, recommendation, recommendation system, Elasticsearch, Redis, cold start, bandit algorithm
Written by HelloTech
Official Hello technology account, sharing tech insights and developments.
