Ctrip's Marco Polo Platform: AI‑Driven Content Generation, Semantic Matching, and Productization

The article details Ctrip’s Marco Polo content platform, describing its data, algorithm, and functional layers, and explains how AI techniques such as NLP, semantic matching, named‑entity recognition, and image classification are applied to automate product‑centric content mining, article generation, quality rating, and first‑image selection.

Ctrip Technology

The author, Sun Zhe, a senior algorithm engineer at Ctrip, introduces the concept of "contentization": combining theme, product, and content so that what users see matches their interests and drives clicks.

The Marco Polo platform serves as Ctrip’s content‑centered middle‑office, comprising three layers:

Data layer: aggregates product dimension tables, reviews, travel notes, and photo data.

Algorithm layer: includes NLP (sentiment analysis, text matching, generation, entity recognition/linking) and image processing (beauty scoring, classification, deduplication).

Platform function layer: offers theme discovery, product‑based contentization, article‑based contentization, and content diversity.

Product‑based contentization automatically extracts theme‑related images and high‑quality texts for each product (e.g., a hotel) based on predefined keywords such as "parent‑child".

Article‑based contentization involves three steps: theme article mining, article rating (filtering low‑quality items using length, image count, entity recognition, sentiment), and automatic product tagging by recognizing entities like scenic spots, hotels, and restaurants.
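
The article-rating step can be pictured as a simple rule filter over the four signals listed above. This is a minimal sketch, not Ctrip's implementation: the `Article` fields, the threshold values, and the assumption that sentiment is a score in [-1, 1] from an upstream model are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Article:
    text: str
    image_count: int
    entity_count: int   # entities found by an upstream NER step (assumed)
    sentiment: float    # assumed range [-1, 1], from an upstream model


# Hypothetical thresholds, chosen only for illustration.
MIN_CHARS = 200
MIN_IMAGES = 3
MIN_ENTITIES = 1
MIN_SENTIMENT = 0.0


def is_low_quality(article: Article) -> bool:
    """Return True if any single rule flags the article as low quality."""
    return (
        len(article.text) < MIN_CHARS
        or article.image_count < MIN_IMAGES
        or article.entity_count < MIN_ENTITIES
        or article.sentiment < MIN_SENTIMENT
    )


def filter_articles(articles):
    """Keep only the articles that pass every rule."""
    return [a for a in articles if not is_low_quality(a)]
```

In practice such rules would more likely serve as features or a pre-filter for a learned classifier than as the final decision.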

Content diversity is achieved by generating micro‑travel notes that expand the variety and volume of textual content through a material library, article‑frame templates, and semantic de‑duplication.

Theme content mining relies on semantic matching models. Parallel models use Siamese networks to embed each text independently and then compare the two embeddings, while interactive models first compute a cross‑attention matrix between the two texts and then score it with a CNN. A hybrid architecture that combines LSTM+attention with a CNN achieves over 90% accuracy on supervised matching.
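
To make the parallel (Siamese) idea concrete, the sketch below passes both texts through the same encoder independently and compares the results with cosine similarity. The encoder here is a toy bag of character bigrams standing in for the neural encoder; it is an assumption for illustration only and has none of the semantic power of the models described above.

```python
import math
from collections import Counter


def encode(text: str) -> Counter:
    """Toy stand-in for the shared encoder: a bag of character bigrams."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_score(theme: str, sentence: str) -> float:
    # Both inputs go through the SAME encoder, each without seeing the other.
    return cosine(encode(theme), encode(sentence))
```

Because each text is encoded on its own, embeddings for the content side can be computed once and cached. An interactive model gives up that property in exchange for token-level interaction between the two texts.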

Article auto‑tagging uses LSTM+CRF for fine‑grained named‑entity recognition (e.g., specific hotels or attractions) followed by a coarse‑to‑fine classification and entity linking pipeline.
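
The source does not say which tagging scheme the LSTM+CRF model emits. Assuming the common BIO scheme (an assumption), the sketch below shows how per-token tags could be turned into typed entity spans before classification and linking. The tag names `HOTEL` and `SPOT` and the example tokens are hypothetical.

```python
def decode_bio(tokens, tags):
    """Collect (entity_text, entity_type) spans from BIO-tagged tokens."""
    spans = []
    current, current_type = [], None

    def flush():
        # Emit the span being built, if any.
        if current:
            spans.append((" ".join(current), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            current.append(token)
        else:
            # "O", or an "I-" tag that does not continue the open span.
            flush()
            current, current_type = [], None
    flush()
    return spans
```

For example, `decode_bio(["stayed", "at", "Grand", "Hyatt"], ["O", "O", "B-HOTEL", "I-HOTEL"])` yields a single `("Grand Hyatt", "HOTEL")` span, which would then be passed to entity linking.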

Entity linking extracts core terms, recalls candidates via 2‑gram matching, and re‑ranks them using lexical similarity, semantic relevance, and popularity features, improving recall by ~30% and accuracy by ~6%.
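
The recall-then-rerank flow can be sketched as follows, assuming character 2-grams for recall and a weighted sum of lexical similarity and popularity for ranking. The catalog format, the weights, and the use of Jaccard overlap are illustrative assumptions. The semantic-relevance feature mentioned above is omitted because it would need a trained model.

```python
def bigrams(s: str) -> set:
    """Character 2-grams of a string, ignoring case and spaces."""
    s = s.lower().replace(" ", "")
    return {s[i:i + 2] for i in range(len(s) - 1)}


def recall_candidates(mention: str, catalog: list) -> list:
    """Recall every catalog entry that shares at least one 2-gram with the mention."""
    m = bigrams(mention)
    return [entry for entry in catalog if m & bigrams(entry["name"])]


def rerank(mention: str, candidates: list, w_lex: float = 0.7, w_pop: float = 0.3) -> list:
    """Order candidates by a weighted mix of lexical overlap and popularity."""
    m = bigrams(mention)

    def score(entry):
        n = bigrams(entry["name"])
        union = m | n
        jaccard = len(m & n) / len(union) if union else 0.0
        # "popularity" is assumed to be pre-normalised to [0, 1].
        return w_lex * jaccard + w_pop * entry["popularity"]

    return sorted(candidates, key=score, reverse=True)
```

Recall is deliberately loose so that abbreviations and partial names still surface the right entity. The rerank step is where precision is recovered.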

Theme image mining employs a hierarchical tag system (~200 tags) and maps user‑defined themes to relevant image categories, enabling automatic image retrieval.
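
One way to picture the theme-to-tag mapping is a small tree of tags plus a lookup from themes to tree nodes. Everything in this sketch is hypothetical: the tag names, the tree shape, and the `images` records, each assumed to carry one predicted leaf tag from the image classifier.

```python
# Hypothetical slice of a hierarchical tag system: parent -> children.
TAG_CHILDREN = {
    "hotel": ["facilities", "exterior"],
    "facilities": ["pool", "kids_club"],
    "exterior": ["facade"],
}

# Hypothetical mapping from a user-defined theme to nodes in the tree.
THEME_TO_TAGS = {"parent-child": ["facilities"]}


def leaf_tags(tag: str) -> set:
    """Expand a tag into the set of leaf tags beneath it."""
    children = TAG_CHILDREN.get(tag)
    if not children:
        return {tag}
    leaves = set()
    for child in children:
        leaves |= leaf_tags(child)
    return leaves


def images_for_theme(theme: str, images: list) -> list:
    """Return the images whose predicted tag falls under the theme."""
    wanted = set()
    for tag in THEME_TO_TAGS.get(theme, []):
        wanted |= leaf_tags(tag)
    return [img for img in images if img["tag"] in wanted]
```

Mapping themes to internal nodes rather than to leaves means that a new leaf tag added under `facilities` is picked up by the theme without editing the mapping.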

Quality rating filters out low‑quality articles using text classifiers (FastText, TextCNN, LSTM+attention) and BERT; BERT provides a 3‑5% improvement on the sentiment and entity tasks.

First‑image selection combines three components: an aesthetic binary classifier (a fine‑tuned Inception‑v3), a classifier for category‑specific image types (e.g., façade, pool), and a matching model trained with triplet loss that removes duplicates. Together they improve operational efficiency by three‑quarters.
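
Downstream of the models, the selection logic might look like the sketch below. It assumes each image already carries an aesthetic score, a predicted type, and an embedding from the matching model. The field names, thresholds, and the use of Euclidean distance for duplicate detection are illustrative assumptions, not details from the article.

```python
import math


def select_first_images(images, preferred_types, min_score=0.5, dup_threshold=0.1):
    """Pick first-image candidates: attractive, of a preferred type, no near-duplicates."""
    # 1. Keep images the aesthetic classifier accepts and whose type is wanted.
    candidates = [
        img for img in images
        if img["aesthetic"] >= min_score and img["type"] in preferred_types
    ]
    # 2. Consider the best-looking images first.
    candidates.sort(key=lambda img: img["aesthetic"], reverse=True)

    # 3. Greedy de-duplication: drop an image if its embedding is too close
    #    to one already kept (math.dist requires Python 3.8+).
    kept = []
    for img in candidates:
        if all(math.dist(img["embedding"], k["embedding"]) > dup_threshold for k in kept):
            kept.append(img)
    return kept
```

Sorting before de-duplicating means that when two images are near-duplicates, the one with the higher aesthetic score is the one that survives.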

The article concludes that AI models significantly reduce manual effort and accelerate content operations, while noting remaining challenges such as nuanced semantic generation, lack of user feedback data, and the need to integrate extraction with generation.

Tags: image classification, AI, NLP, semantic matching, content recommendation, named entity recognition, Ctrip