
Hotel Recommendation System Architecture, Models, and Evaluation at Ctrip

This article presents a comprehensive overview of Ctrip's hotel recommendation system, covering its technical architecture, data processing pipelines, various ranking and embedding models—including FM, Wide&Deep, DeepFM, and FTRL—deployment methods such as PMML and TensorFlow Serving, offline and online evaluation results, and challenges like cold‑start and diversity.

Ctrip Technology

Growth is the core demand of internet companies, and in the mobile‑internet era online travel platforms increasingly rely on algorithms and models to drive it. Recommendation systems have become essential for helping users discover relevant hotels while increasing user retention and conversion rates.

Hotel recommendation spans multiple scenarios (city‑wide hotel ranking, nearby similar hotels, cross‑recommendation from flight pages, etc.). The recommendation problem can be formalized as a function f(U, I, C) that predicts a user U's preference for a candidate hotel I under context C; candidates are then sorted by the predicted score.
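As a minimal sketch of the f(U, I, C) formulation (the feature names and hand-set weights below are invented for illustration; a real system learns them), scoring reduces to evaluating f for each candidate and sorting:

```python
from typing import Dict, List

def f(user: Dict, hotel: Dict, context: Dict) -> float:
    """Toy preference score f(U, I, C); a production model learns these rules."""
    score = 0.0
    # User-item match: bonus when the hotel's star rating matches the user's preference
    if hotel["stars"] == user["preferred_stars"]:
        score += 1.0
    # Context: cheaper hotels score higher for short-notice bookings
    if context["days_until_checkin"] <= 1:
        score += 1.0 / hotel["price"]
    return score

def rank(user: Dict, candidates: List[Dict], context: Dict) -> List[Dict]:
    """Sort candidates by predicted preference, best first."""
    return sorted(candidates, key=lambda h: f(user, h, context), reverse=True)

user = {"preferred_stars": 4}
context = {"days_until_checkin": 1}
candidates = [{"id": "A", "stars": 3, "price": 200.0},
              {"id": "B", "stars": 4, "price": 400.0}]
ranked = rank(user, candidates, context)
```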

The system architecture consists of two main parts: the data layer and the model layer. The data layer handles offline batch processing (Spark, Hive) and real‑time stream processing (Storm, Flink, Spark Streaming) to collect and transform user, hotel, and contextual features, storing them in feature stores such as Redis. Offline features are generated from user profiles and hotel attributes, while real‑time features are extracted from user click streams.
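The feature-store pattern can be sketched as follows; a plain dict stands in for Redis here (in production these would be `HSET`/`HGETALL` calls against per-user hashes), and the key layout and field names are assumptions for illustration:

```python
# Stand-in for a Redis feature store: one hash per entity key.
feature_store = {}

def write_offline_features(user_id: str, features: dict) -> None:
    """Batch job (Spark/Hive) output: profile-derived features."""
    feature_store.setdefault(f"user:{user_id}", {}).update(features)

def write_realtime_features(user_id: str, clicked_hotel: str) -> None:
    """Stream job (Flink/Storm) output: append to the user's recent clicks."""
    key = f"user:{user_id}"
    recent = feature_store.setdefault(key, {}).setdefault("recent_clicks", [])
    recent.append(clicked_hotel)
    del recent[:-20]  # keep only the 20 most recent clicks

def fetch_features(user_id: str) -> dict:
    """Online serving path: read the merged offline + real-time features."""
    return feature_store.get(f"user:{user_id}", {})

write_offline_features("u1", {"avg_price": 350.0, "star_pref": 4})
write_realtime_features("u1", "hotel_42")
feats = fetch_features("u1")
```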

The model layer includes recall, ranking, and re‑ranking stages. Recall quickly filters a large candidate set using simple rules or lightweight models; ranking applies sophisticated models (e.g., deep neural networks) to produce a fine‑grained order; re‑ranking adjusts the list for freshness, diversity, etc. Training, evaluation, and online A/B testing are integral to the model lifecycle.
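The three stages compose into a simple pipeline. The sketch below uses invented rules (same-city recall, a CTR-like score for ranking, a per-brand cap for diversity) purely to show the shape of the flow:

```python
def recall(candidates, user_city, limit=100):
    """Cheap rule-based filter: same city, capped candidate count."""
    return [h for h in candidates if h["city"] == user_city][:limit]

def rank(candidates, model_score):
    """Fine-grained ordering by a learned model's score."""
    return sorted(candidates, key=model_score, reverse=True)

def rerank(ranked, max_per_brand=1):
    """Diversity adjustment: keep at most N hotels per brand."""
    seen, out = {}, []
    for h in ranked:
        if seen.get(h["brand"], 0) < max_per_brand:
            out.append(h)
            seen[h["brand"]] = seen.get(h["brand"], 0) + 1
    return out

hotels = [
    {"id": "A", "city": "SHA", "brand": "X", "ctr": 0.9},
    {"id": "B", "city": "SHA", "brand": "X", "ctr": 0.8},
    {"id": "C", "city": "SHA", "brand": "Y", "ctr": 0.5},
    {"id": "D", "city": "PEK", "brand": "Z", "ctr": 0.99},
]
result = rerank(rank(recall(hotels, "SHA"), lambda h: h["ctr"]))
```

Recall drops the out-of-city hotel D, ranking orders A > B > C by score, and re-ranking suppresses B because brand X already appears.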

For online deployment, two mainstream approaches are used: converting models to PMML and serving them via a Java‑based SOA framework, or using TensorFlow Serving. PMML conversion is performed with sklearn2pmml for scikit‑learn models or JPMML‑XGBoost for XGBoost models. TensorFlow models are exported as a SavedModel directory containing a variables subdirectory and a saved_model.pb (or saved_model.pbtxt) file, then loaded by Java or C++ clients.
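On the TensorFlow Serving path, clients can also call the model over its standard REST predict endpoint (`POST http://<host>:8501/v1/models/<name>:predict` with an `{"instances": [...]}` body). The feature names below are invented; only the request envelope follows TF Serving's documented format:

```python
import json

def build_predict_request(feature_rows):
    """Build the JSON body for TensorFlow Serving's REST predict API:
    {"instances": [...]}, one entry per example to score."""
    return json.dumps({"instances": feature_rows})

body = build_predict_request([
    {"user_click_count": 12, "hotel_price": 350.0},
    {"user_click_count": 3, "hotel_price": 180.0},
])
# In production this body is sent with an HTTP client, e.g.
#   urllib.request.urlopen(url, data=body.encode("utf-8"))
```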

Embedding techniques are heavily employed in the recall layer. Word2Vec's Skip‑Gram model is trained on sequences of hotels clicked within the same user session, producing dense vectors for hotels. Similarity is computed via cosine distance, and top‑N similar hotels are recalled. Graph‑based embeddings such as DeepWalk are also explored by performing random walks on a hotel‑item graph and feeding the generated sequences into Word2Vec.
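Once hotel vectors exist, the recall step itself is just cosine similarity plus a top‑N cut. A self-contained sketch (the two-dimensional embeddings here are toy values; real vectors come from Skip‑Gram trained on session click sequences):

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_n_similar(query_id, embeddings, n=10):
    """Recall the n hotels whose embeddings are closest to the query hotel."""
    q = embeddings[query_id]
    scored = [(cosine(q, v), hid)
              for hid, v in embeddings.items() if hid != query_id]
    scored.sort(reverse=True)
    return [hid for _, hid in scored[:n]]

embeddings = {
    "h1": [1.0, 0.0],
    "h2": [0.9, 0.1],   # nearly parallel to h1 -> high similarity
    "h3": [0.0, 1.0],   # orthogonal to h1 -> similarity 0
}
similar = top_n_similar("h1", embeddings, n=1)
```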

Several ranking models are described:

Factorization Machines (FM) model the interaction between any two features using a low‑rank factorization.

Wide & Deep combines a linear (wide) part for memorization with a deep neural network for generalization.

FTRL (Follow‑the‑Regularized‑Leader) is used for online training of the wide part, while AdaGrad optimizes the deep part.

DeepFM merges FM and a deep network, sharing the same embedding layer, thus learning low‑ and high‑order feature interactions end‑to‑end.
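The FM second-order term can be computed in O(nk) rather than O(n²k) via the standard identity sum over pairs ⟨v_i, v_j⟩x_i x_j = ½ Σ_f [(Σ_i v_if x_i)² − Σ_i v_if² x_i²]. A pure-Python sketch with toy weights (all values invented for illustration):

```python
def fm_predict(x, w0, w, V):
    """FM score: w0 + <w, x> + pairwise interactions via the squared-sum trick.

    V[i] is the k-dimensional latent vector of feature i; DeepFM shares this
    same V with the embedding layer feeding its deep component.
    """
    n, k = len(V), len(V[0])
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    pair = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))        # (sum_i v_if x_i)
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pair += s * s - s_sq
    return linear + 0.5 * pair

# Toy example: 2 features, factor dimension k = 2.
x = [1.0, 2.0]
w0, w = 0.1, [0.2, 0.3]
V = [[1.0, 0.0], [0.0, 1.0]]   # orthogonal latent vectors -> zero interaction
score = fm_predict(x, w0, w, V)
```

With orthogonal latent vectors the interaction term vanishes, so the score is just the linear part 0.1 + 0.2·1 + 0.3·2 = 0.9.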

Offline evaluation on a dataset of roughly one million samples shows AUC improving from 0.597 (logistic regression) to 0.80 (DeepFM). Online A/B testing comparing a rule‑based baseline with an XGBoost‑driven model plus collaborative‑filtering recall yields a 25.35% increase in click‑through rate and a 27.31% rise in order volume.
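For reference, the AUC metric used in that offline comparison has a direct probabilistic reading: the fraction of (positive, negative) pairs the model orders correctly, with ties counted as half. A minimal sketch on made-up labels and scores:

```python
def auc(labels, scores):
    """Pairwise AUC: P(score of a random positive > score of a random negative),
    counting ties as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0
                  for p in pos for n in neg)
    return correct / (len(pos) * len(neg))

labels = [1, 0, 1, 0]
scores = [0.9, 0.3, 0.6, 0.7]
value = auc(labels, scores)  # 3 of 4 pos/neg pairs ordered correctly -> 0.75
```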

Beyond modeling, the system must address cold‑start (new users/items) and recommendation diversity. Common solutions include rule‑based feature enrichment, active learning, transfer learning, and bandit algorithms such as Thompson Sampling and UCB for exploration‑exploitation trade‑offs.
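Of the bandit approaches mentioned, Thompson Sampling is the simplest to sketch: each candidate (arm) keeps click/skip counts, and on each request the arm with the highest Beta posterior sample is shown, which naturally explores uncertain new items. The counts below are invented toy data:

```python
import random

def thompson_pick(arms):
    """Pick the arm with the highest Beta(successes+1, failures+1) sample.

    A new arm with (0, 0) counts has a flat posterior, so it is sometimes
    sampled high and gets exposure -- the exploration side of the trade-off.
    """
    best, best_sample = None, -1.0
    for arm, (succ, fail) in arms.items():
        sample = random.betavariate(succ + 1, fail + 1)
        if sample > best_sample:
            best, best_sample = arm, sample
    return best

# clicks / skips observed so far per candidate hotel
arms = {"h_new": (0, 0), "h_a": (30, 70), "h_b": (5, 95)}
random.seed(0)
picked = thompson_pick(arms)
```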

The article concludes with references to key literature on deep learning‑based recommendation systems, factorization machines, and wide & deep learning.

Tags: machine learning, deep learning, recommendation system, embedding, online inference, Ctrip, hotel