Algorithm Optimization for Hotel Recommendation and Large‑Scale Discrete DNN Training at Ctrip
This article describes how Ctrip improved hotel recommendation by iterating from logistic regression to GBDT and deep neural networks, designing continuous and discrete features, adopting multi‑task learning with click and conversion signals, and building a large‑scale distributed DNN training and unified feature‑processing framework to boost model accuracy and engineering efficiency.
When users browse hotels on Ctrip, the platform must select appropriate hotel recommendations to reduce user effort; three typical scenarios are welcome ranking, intelligent ranking, and search‑compensation ranking, as illustrated in Figure 1.
The recommendation features are divided into user‑side, item‑side, and interaction features, further classified as continuous or discrete. Continuous features have good generalization but weak memorization, while discrete features provide strong memorization and high discrimination at the cost of weaker generalization.
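The split between the two feature families can be illustrated with a minimal Python sketch. The feature names, bucket size, and statistics below are illustrative assumptions, not Ctrip's actual schema: discrete features are hashed to a stable index and looked up in an embedding table (memorization), while continuous features are standardized and fed to the model directly (generalization).

```python
import hashlib

# Hypothetical feature-handling sketch; names and constants are assumptions.

NUM_BUCKETS = 1_000_003  # assumed hash space for discrete-feature signatures

def hash_discrete(name: str, value: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a discrete feature value to a stable embedding-row index (memorization)."""
    signature = f"{name}={value}".encode("utf-8")
    return int(hashlib.md5(signature).hexdigest(), 16) % num_buckets

def normalize_continuous(value: float, mean: float, std: float) -> float:
    """Standardize a continuous feature (generalization); fed to the model as a dense input."""
    return (value - mean) / std if std > 0 else 0.0

city_idx = hash_discrete("user_city", "SHA")        # looked up in an embedding table
price_x = normalize_continuous(320.0, 280.0, 40.0)  # -> 1.0, a dense input
```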
The model pipeline evolved from Logistic Regression (LR) to Gradient Boosting Decision Tree (GBDT) and finally to Deep Neural Network (DNN). LR offers linear decision boundaries, interpretability, and incremental updates but limited accuracy; GBDT provides non‑linear boundaries and higher precision but cannot handle massive discrete features; DNN delivers highly non‑linear boundaries, supports large‑scale discrete embeddings, and achieves the highest accuracy, though it is less interpretable and more engineering‑intensive. A comparative table summarizes these characteristics.
In the hotel recommendation task, both click (CTR) and conversion (CVR) signals are modeled. Multi‑task learning is implemented via an ESMM‑style architecture: (1) discrete features are hashed into unique signatures and embedded; (2) pooling (sum pooling for multi‑value features) and concatenation combine single‑value and multi‑value embeddings; (3) a multilayer perceptron predicts click and conversion using cross‑entropy loss, yielding significant gains over pairwise GBDT models.
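The three steps above can be sketched in a minimal NumPy forward pass. This is a hedged illustration of the ESMM idea, not Ctrip's model: dimensions, initialization, and the single multi‑value feature are assumptions. The key ESMM property is that the conversion head is trained through pCTCVR = pCTR × pCVR, so both losses are defined over the whole exposure space.

```python
import numpy as np

# Minimal ESMM-style sketch (shapes and parameters are illustrative assumptions).

rng = np.random.default_rng(0)
VOCAB, EMB_DIM, HIDDEN = 1000, 8, 16

emb_table = rng.normal(scale=0.01, size=(VOCAB, EMB_DIM))  # shared embeddings

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tower(x, w1, w2):
    """Small MLP tower producing a probability."""
    h = np.maximum(x @ w1, 0.0)  # ReLU hidden layer
    return sigmoid(h @ w2)

def esmm_forward(feature_ids, params):
    """Sum-pool a multi-value feature's embeddings, then run CTR and CVR towers."""
    x = emb_table[feature_ids].sum(axis=0)  # sum pooling over multi-value feature
    p_ctr = tower(x, *params["ctr"])
    p_cvr = tower(x, *params["cvr"])
    return p_ctr, p_ctr * p_cvr             # (pCTR, pCTCVR)

def loss(p_ctr, p_ctcvr, click, convert):
    """Cross-entropy on the click head plus the click-and-convert head."""
    bce = lambda p, y: -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return bce(p_ctr, click) + bce(p_ctcvr, click * convert)

params = {
    "ctr": (rng.normal(size=(EMB_DIM, HIDDEN)), rng.normal(size=(HIDDEN,))),
    "cvr": (rng.normal(size=(EMB_DIM, HIDDEN)), rng.normal(size=(HIDDEN,))),
}
p_ctr, p_ctcvr = esmm_forward(np.array([3, 42, 7]), params)
```

Because pCVR lies in (0, 1), pCTCVR is always bounded by pCTR, which keeps the implicit CVR estimate well calibrated even though conversions are only observed after clicks.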
To support these advances, Ctrip built a large‑scale discrete DNN training framework and a unified feature‑processing framework. The training platform, based on TensorFlow, supports any model architecture and any scale through an asynchronous parameter‑server backend, default hyper‑parameters, and horizontal scalability. The feature framework uses a protobuf schema shared by online and offline pipelines, ensuring identical input data and operator logic, thus eliminating online‑offline inconsistencies.
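The online/offline consistency guarantee can be sketched as follows. The schema fields, names, and hash function here are assumptions (the article only says the schema is protobuf-based and shared); the point is that serving and training invoke the same operator with the same spec, so their inputs cannot diverge.

```python
import zlib
from dataclasses import dataclass

# Hypothetical shared feature schema and operator (names/fields are assumptions).

@dataclass(frozen=True)
class FeatureSpec:
    name: str
    kind: str            # "continuous" or "discrete"
    num_buckets: int = 0

def transform(spec: FeatureSpec, raw):
    """Single operator implementation used by both online and offline pipelines."""
    if spec.kind == "discrete":
        sig = f"{spec.name}={raw}".encode("utf-8")
        return zlib.crc32(sig) % spec.num_buckets  # deterministic signature -> bucket
    return float(raw)

spec = FeatureSpec(name="user_city", kind="discrete", num_buckets=1_000_003)
offline_value = transform(spec, "SHA")  # computed when generating training data
online_value = transform(spec, "SHA")   # computed at serving time: identical by construction
```

A deterministic hash such as CRC32 matters here: Python's built-in `hash()` is salted per process, so it would silently break the cross-pipeline guarantee this framework exists to provide.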
The combined algorithmic and engineering efforts have delivered noticeable improvements in hotel ranking performance, while future work will focus on model interpretability and deeper algorithmic exploration.
Ctrip Technology
The official Ctrip Technology account: sharing, exchanging ideas, and growing together.