Design and Implementation of a Home‑Page Recommendation System Using Reinforcement Learning and DPP
This article presents a design for Zhuanzhuan's home‑page recommendation pipeline. It describes the system architecture and the two challenges the design targets, traffic efficiency and diversity, and then walks through a two‑stage solution: Proximal Policy Optimization (PPO) reinforcement learning in the re‑ranking module, and Determinantal Point Process (DPP) optimization in the coarse‑ranking and traffic‑pool stages. It closes with offline simulation, online deployment, and the evaluation metrics.
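To make the diversity side of the summary concrete, here is a minimal sketch of how a DPP can trade relevance against redundancy when picking items. It is not the pipeline's actual implementation. The function name `greedy_dpp`, the kernel construction (relevance scores scaled by cosine similarity of item features), and the toy data are all assumptions made for illustration. The sketch uses a naive greedy search that adds, at each step, the item that most increases the determinant of the selected sub-kernel.

```python
import numpy as np

def greedy_dpp(quality, features, k):
    """Greedily pick up to k items that approximately maximize det(L_Y).

    quality  -- 1-D array of per-item relevance scores (assumed positive)
    features -- 2-D array, one feature row per item (assumed non-zero rows)
    """
    # Normalize rows so that dot products are cosine similarities.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    similarity = feats @ feats.T
    # Assumed kernel: L = diag(q) @ S @ diag(q).
    kernel = quality[:, None] * similarity * quality[None, :]

    selected = []
    candidates = set(range(len(quality)))
    while len(selected) < k and candidates:
        best_item, best_logdet = None, -np.inf
        for i in sorted(candidates):
            idx = selected + [i]
            # log-determinant of the sub-kernel restricted to the trial set
            sign, logdet = np.linalg.slogdet(kernel[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_item, best_logdet = i, logdet
        if best_item is None:
            # Every remaining item is (numerically) redundant with the selection.
            break
        selected.append(best_item)
        candidates.remove(best_item)
    return selected

# Toy data: items 0 and 1 are near-duplicates, item 2 is different but less relevant.
quality = np.array([0.9, 0.85, 0.5])
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
print(greedy_dpp(quality, features, k=2))  # [0, 2]
```

In this toy case the second slot goes to the less relevant but dissimilar item 2 rather than the near-duplicate item 1, which is the behaviour a diversity-aware stage is after. The naive loop recomputes a determinant for every candidate at every step, so a production system would use an incremental update instead.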