
Algorithmic Strategies and Insights from Ctrip Hotel Ranking Team’s Participation in the 2018 ACM WSDM and RecSys Challenges

This article details the Ctrip Hotel ranking team's feature‑engineering and model‑innovation approaches—including session features, cold‑start mitigation, discriminative re‑weighting, and ensemble methods—that secured Top‑5 placements in the 2018 ACM WSDM and RecSys recommendation system competitions.

Ctrip Technology

Author: Zhu Lin, a senior algorithm engineer in the ranking algorithm group of Ctrip Hotel R&D, holds a Ph.D. from the University of Science and Technology of China and focuses on recommendation system algorithms.

Abstract: With the rapid rise of artificial intelligence and big-data technologies, recommendation systems have become ubiquitous in domains such as movies, music, news, and books. This article presents the algorithmic strategies used and lessons learned by the Ctrip Hotel ranking team in the 2018 ACM WSDM and ACM RecSys challenges, where the team achieved Top-5 results.

1. Competition Overview

2018 ACM WSDM Challenge, organized by ACM and KKBOX, required building a system to predict which songs a user would replay within a certain time frame. The dataset contained user and song metadata, listening activities, and app information.

2018 ACM RecSys Challenge, organized by ACM and Spotify, asked participants to automatically continue user playlists. It provided a dataset of one million user-created playlists with associated metadata, plus ten thousand incomplete playlists for evaluation.

2. Methodological Innovations

2.1 Feature‑Engineering Innovations

Beyond conventional categorical and statistical features, the team extracted temporal information from the sequentially ordered data, constructing item‑age features and session‑based features that capture the recency and co‑occurrence patterns of users, songs, artists, and composers.
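
As a minimal sketch of the item-age idea, assume the log is a chronologically ordered list of (user, song) records without explicit timestamps, so that row position stands in for time. The function name `item_age_features` and the feature names `item_age` and `prior_plays` are illustrative assumptions, not the team's actual feature definitions.

```python
from typing import Dict, List, Tuple


def item_age_features(log: List[Tuple[str, str]]) -> List[Dict[str, int]]:
    """For every (user, song) record in a chronologically ordered log, return
    how long ago the song first appeared and how often it was seen before."""
    first_seen: Dict[str, int] = {}  # song -> row index of its first appearance
    seen_count: Dict[str, int] = {}  # song -> number of earlier records
    features = []
    for row_idx, (_user, song) in enumerate(log):
        first_seen.setdefault(song, row_idx)
        features.append({
            # Row distance from the song's first appearance (a proxy for age).
            "item_age": row_idx - first_seen[song],
            # Number of records of this song strictly before the current row.
            "prior_plays": seen_count.get(song, 0),
        })
        seen_count[song] = seen_count.get(song, 0) + 1
    return features


log = [("u1", "s1"), ("u2", "s1"), ("u1", "s2"), ("u3", "s1")]
feats = item_age_features(log)
```

Both features use only rows that precede the current record, which avoids leaking future information into the training data. The same pattern could be keyed on artist or composer instead of song.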

Session features were derived by greedily grouping consecutive listening records of the same user into sessions, then computing counts of sessions per user, average songs per session, and session length.
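
The greedy grouping can be sketched as follows, under the assumption that a session is a maximal run of consecutive rows belonging to the same user in the ordered log. The actual session boundary rule used by the team is not stated, and the names `session_features` and `row_session_length` are hypothetical.

```python
from itertools import groupby
from typing import Dict, List, Tuple


def session_features(
    user_sequence: List[str],
) -> Tuple[Dict[str, Dict[str, float]], List[int]]:
    """user_sequence holds the user id of every record, in log order."""
    lengths_by_user: Dict[str, List[int]] = {}
    row_session_length: List[int] = []
    # Greedy pass: each maximal run of consecutive rows by one user is a session.
    for user, run in groupby(user_sequence):
        length = sum(1 for _ in run)
        lengths_by_user.setdefault(user, []).append(length)
        # Every row in the run gets the length of its own session as a feature.
        row_session_length.extend([length] * length)
    user_stats = {
        user: {
            "num_sessions": len(lengths),
            "avg_songs_per_session": sum(lengths) / len(lengths),
        }
        for user, lengths in lengths_by_user.items()
    }
    return user_stats, row_session_length


user_stats, row_session_length = session_features(
    ["u1", "u1", "u1", "u2", "u1", "u1", "u2"]
)
```

In this toy log, user u1 has two sessions of three and two songs, so the average is 2.5 songs per session.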

For the RecSys challenge, playlist‑based co‑occurrence was modeled using a word2vec‑style embedding where playlists are sentences and songs are words, providing similarity features between songs.
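
To make the "playlists are sentences, songs are words" analogy concrete, the sketch below shows only the data-preparation step of a skip-gram model: turning playlists into (target, context) training pairs, plus the cosine similarity that would be applied to the learned vectors. Training the embedding itself is left to a word2vec library. The window size and the function names are assumptions for illustration.

```python
import math
from typing import Iterator, List, Sequence, Tuple


def skipgram_pairs(
    playlists: List[List[str]], window: int = 2
) -> Iterator[Tuple[str, str]]:
    """Yield (target_song, context_song) pairs, one playlist per 'sentence'."""
    for playlist in playlists:
        for i, target in enumerate(playlist):
            lo = max(0, i - window)
            hi = min(len(playlist), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    yield target, playlist[j]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two song vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


playlists = [["a", "b", "c"], ["b", "d"]]
pairs = list(skipgram_pairs(playlists, window=1))
```

Songs that share many context songs across playlists end up with similar vectors after training, and the cosine between two vectors can then serve as a song-to-song similarity feature.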

2.2 Model Innovations

High-cardinality categorical features such as user and song IDs cause cold-start problems when an ID is rare or unseen in training. To mitigate this, the team applied denoising auto-encoders and dropout, trained models without user-id or song-id features, and later fused these with the original models.
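
The following sketch illustrates two of the ideas in simplified form: randomly masking ID fields during training (a dropout-like corruption) and blending the predictions of an ID-aware model with those of an ID-free model. It is a stand-in under stated assumptions, not the team's implementation: the `UNKNOWN` token, the function names, the masking probability, and the blend weight are all hypothetical, and the denoising auto-encoder itself is not shown.

```python
import random
from typing import Dict, List

UNKNOWN = "<unk>"  # hypothetical placeholder for a masked or unseen id


def mask_ids(
    rows: List[Dict[str, str]],
    id_fields: List[str],
    drop_prob: float,
    rng: random.Random,
) -> List[Dict[str, str]]:
    """Return copies of rows in which each id field is replaced by UNKNOWN
    with probability drop_prob, so a model also learns from non-id features."""
    masked = []
    for row in rows:
        new_row = dict(row)
        for field in id_fields:
            if rng.random() < drop_prob:
                new_row[field] = UNKNOWN
        masked.append(new_row)
    return masked


def fuse(
    p_with_ids: List[float], p_without_ids: List[float], weight: float = 0.5
) -> List[float]:
    """Blend predictions of an id-aware model and an id-free model."""
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(p_with_ids, p_without_ids)]


rows = [{"user_id": "u1", "song_id": "s1", "genre": "pop"}]
all_masked = mask_ids(rows, ["user_id", "song_id"], 1.0, random.Random(0))
none_masked = mask_ids(rows, ["user_id", "song_id"], 0.0, random.Random(0))
blended = fuse([0.8, 0.2], [0.4, 0.6], weight=0.5)
```

At prediction time, an ID never seen in training can be mapped to the same placeholder, so the model falls back on the remaining features instead of an untrained ID weight.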

The team also improved item-based collaborative filtering. One improvement is discriminative re-weighting, inspired by the SLIM algorithm: it learns sparse linear weights via an L2-regularized SVM formulation to better capture feature importance.
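
A minimal sketch of how per-song weights change item-based scoring is given below. It assumes cosine similarity over playlist co-occurrence and takes the weights as given; in the approach described above they would be learned by the SVM, which is not shown here. The function names and the example weights are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Optional, Set


def item_cosine(playlists: List[Set[str]]) -> Dict[str, Dict[str, float]]:
    """Cosine similarity between songs, based on playlist co-occurrence."""
    count: Dict[str, int] = defaultdict(int)
    co: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for playlist in playlists:
        for a in playlist:
            count[a] += 1
            for b in playlist:
                if a != b:
                    co[a][b] += 1
    return {
        a: {b: c / math.sqrt(count[a] * count[b]) for b, c in nbrs.items()}
        for a, nbrs in co.items()
    }


def score_candidates(
    seed_songs: List[str],
    sim: Dict[str, Dict[str, float]],
    weights: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """score(c) = sum over seed songs j of w_j * sim(j, c).
    With weights=None every w_j is 1, which is plain item-based CF."""
    scores: Dict[str, float] = defaultdict(float)
    for j in seed_songs:
        w = 1.0 if weights is None else weights.get(j, 1.0)
        for candidate, s in sim.get(j, {}).items():
            if candidate not in seed_songs:
                scores[candidate] += w * s
    return dict(scores)


sim = item_cosine([{"a", "b", "c"}, {"a", "b"}, {"c", "d"}])
plain = score_candidates(["a", "c"], sim)
reweighted = score_candidates(["a", "c"], sim, weights={"a": 0.2, "c": 2.0})
```

With uniform weights, song b outranks song d for the seed playlist [a, c]. Once seed song c is weighted more heavily than a, the order flips, which is the effect re-weighting is meant to have when some songs in a playlist are more informative than others.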

Ensemble techniques combined collaborative‑filtering coarse‑ranking with Gradient Boosted Decision Trees (GBDT) for fine‑ranking, incorporating metadata‑derived features and the re‑weighted similarity scores.
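
The coarse-then-fine structure can be sketched as a two-stage function. Here `fine_scorer` is a hand-written stand-in for a trained GBDT model, and the `popularity` feature, the candidate count, and the blend inside `toy_scorer` are assumptions made purely for illustration.

```python
from typing import Callable, Dict, List


def two_stage_rank(
    cf_scores: Dict[str, float],
    fine_scorer: Callable[[str, float], float],
    num_candidates: int,
    top_n: int,
) -> List[str]:
    # Stage 1 (coarse): keep only the best candidates by collaborative-filtering score.
    candidates = sorted(cf_scores, key=cf_scores.get, reverse=True)[:num_candidates]
    # Stage 2 (fine): re-score the survivors with a richer model.
    rescored = {song: fine_scorer(song, cf_scores[song]) for song in candidates}
    return sorted(rescored, key=rescored.get, reverse=True)[:top_n]


# Hypothetical metadata feature; a real fine ranker would use many features.
popularity = {"s1": 0.1, "s2": 0.9, "s3": 0.5, "s4": 1.0}


def toy_scorer(song: str, cf_score: float) -> float:
    """Stand-in for a GBDT prediction that uses the CF score as one input."""
    return 0.5 * cf_score + 0.5 * popularity[song]


cf_scores = {"s1": 0.9, "s2": 0.8, "s3": 0.7, "s4": 0.1}
ranking = two_stage_rank(cf_scores, toy_scorer, num_candidates=3, top_n=2)
```

The coarse stage keeps the expensive model off the full catalogue: song s4 never reaches the fine ranker because its collaborative-filtering score is too low, whatever its metadata says.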

Additional models such as Factorization Machines and deep neural networks were employed in the WSDM Cup to embed high‑cardinality IDs into low‑dimensional spaces, enhancing generalization to unseen categories.
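
For reference, a second-order Factorization Machine scores an input x as w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j, where each feature i (for example, a one-hot user or song ID) has a low-dimensional vector v_i. The sketch below evaluates this with the standard linear-time rewrite of the pairwise term and checks it against the direct double loop. The parameter values are made up for the example and do not come from the competition models.

```python
from typing import List


def fm_predict(x: List[float], w0: float, w: List[float], V: List[List[float]]) -> float:
    """Second-order FM score, using the O(n * k) form of the pairwise term:
    sum_{i<j} <V[i], V[j]> x_i x_j
      = 0.5 * sum_f [ (sum_i V[i][f] x_i)^2 - sum_i (V[i][f] x_i)^2 ]"""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(V[0])):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict_naive(x: List[float], w0: float, w: List[float], V: List[List[float]]) -> float:
    """Direct O(n^2 * k) evaluation, used here only as a correctness check."""
    total = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total


# Features 0 and 2 are active, e.g. a one-hot user id and a one-hot song id.
x = [1.0, 0.0, 1.0]
w0, w = 0.1, [0.2, 0.3, 0.4]
V = [[1.0, 2.0], [0.5, 0.5], [3.0, 1.0]]
fast = fm_predict(x, w0, w, V)
slow = fm_predict_naive(x, w0, w, V)
```

Because the interaction between two IDs is a dot product of their vectors rather than a separate weight per ID pair, the model can score a user and song combination that never co-occurred in training.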

3. Summary

Continuous feature exploration, problem‑specific algorithmic innovations, and effective model ensembles substantially improve recommendation quality, demonstrating the power of artificial‑intelligence techniques in delivering personalized travel product recommendations for Ctrip users.

Contributors: Chen Yihong and He Bowen of Ctrip Hotel R&D.

Tags: feature engineering, collaborative filtering, recommendation systems, SLIM, cold start, model ensemble