POI Recognition and Alias Linking in Travel Search: Challenges, Algorithmic Practices, and Online Impact
The article presents a comprehensive study of POI (point‑of‑interest) recognition and alias linking within travel search, detailing background challenges, a multi‑stage algorithmic framework, extensive offline experiments, and the resulting improvements in online conversion and relevance.
Introduction This article covers query understanding in search, focusing on POI (point-of-interest) recognition and linking, with Feizhu (Fliggy) search as the case study.
01 Background and Challenges POI recognition is defined as identifying three categories of entity in user queries: scenic spots, hotels, and landmarks. Challenges include handling diverse user aliases (e.g., "小蛮腰", literally "slim waist", a popular nickname for "广州塔", Canton Tower) and the low-frequency nature of travel, which limits the supply of reliable behavior samples.
02 Algorithmic Practice The overall architecture consists of four layers: a domain-pretrained model, alias mining, mention-based recall, and a ranking module. Alias mining combines three approaches: extraction-based mining, behavior-driven mining, and CETAR (a method published at CIKM 2022). The CETAR model uses a Context-Enhanced encoder and an Abbr-Recover decoder to predict possible abbreviations of POI names, leveraging both the POI names and their descriptive texts.
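To make the behavior-driven branch of alias mining concrete, here is a minimal sketch. The source does not describe how Feizhu implements it, so everything here is an assumption: the log format (query, clicked POI ID), the function name `mine_aliases`, and the thresholds `min_clicks` and `min_share` are hypothetical. The idea illustrated is simply that a query whose clicks concentrate on one POI, and which differs from that POI's canonical name, is a candidate alias.

```python
from collections import Counter, defaultdict


def mine_aliases(click_log, poi_names, min_clicks=3, min_share=0.8):
    """Return {query: poi_id} for queries whose clicks concentrate on one POI.

    click_log: iterable of (query, clicked_poi_id) pairs (hypothetical format).
    poi_names: {poi_id: canonical_name}; a query equal to the canonical
               name is not an alias and is skipped.
    """
    clicks_by_query = defaultdict(Counter)
    for query, poi_id in click_log:
        clicks_by_query[query.strip().lower()][poi_id] += 1

    aliases = {}
    for query, counts in clicks_by_query.items():
        total = sum(counts.values())
        poi_id, top = counts.most_common(1)[0]
        # Require enough evidence and a dominant click target.
        if total < min_clicks or top / total < min_share:
            continue
        # Skip queries that are already the canonical name.
        if query == poi_names.get(poi_id, "").strip().lower():
            continue
        aliases[query] = poi_id
    return aliases


# Toy data; the POI IDs are made up for illustration.
toy_log = (
    [("小蛮腰", "poi_1")] * 9 + [("小蛮腰", "poi_2")]
    + [("广州塔", "poi_1")] * 5
    + [("塔", "poi_1"), ("塔", "poi_3"), ("塔", "poi_4")]
)
toy_names = {"poi_1": "广州塔"}
toy_aliases = mine_aliases(toy_log, toy_names)
```

In this toy run only "小蛮腰" survives: "广州塔" is the canonical name itself, and the clicks for "塔" are spread too evenly. Because travel queries are low-frequency, a threshold-based miner like this will miss many aliases, which is presumably why the talk pairs it with extraction-based mining and CETAR.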
The domain‑pretrained model is trained on three tasks—MLM, SimCSE, and city‑level multi‑classification—using Feizhu’s queries, product titles, and POI names. Two model sizes are produced: a large model for offline cache and a small model for online inference.
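The source names the three pretraining tasks but not how they are combined. One common choice, shown here purely as an assumption, is a weighted sum of the three losses. The weights \lambda_1 and \lambda_2, the temperature \tau, and the batch size N are hypothetical symbols, and the SimCSE term is written in its standard in-batch contrastive form; the source does not say how the positive pairs h_i^{+} are built.

```latex
% Assumed multi-task objective (the weighting scheme is not given in the source)
\mathcal{L} = \mathcal{L}_{\mathrm{MLM}}
            + \lambda_1 \, \mathcal{L}_{\mathrm{SimCSE}}
            + \lambda_2 \, \mathcal{L}_{\mathrm{city}}

% Standard in-batch contrastive (SimCSE-style) loss over N examples
\mathcal{L}_{\mathrm{SimCSE}} = -\frac{1}{N} \sum_{i=1}^{N}
  \log \frac{\exp\!\left(\mathrm{sim}(h_i, h_i^{+}) / \tau\right)}
            {\sum_{j=1}^{N} \exp\!\left(\mathrm{sim}(h_i, h_j^{+}) / \tau\right)}

% City-level multi-class cross-entropy, with c_i the city label of input x_i
\mathcal{L}_{\mathrm{city}} = -\frac{1}{N} \sum_{i=1}^{N} \log p_\theta(c_i \mid x_i)
```

Here \mathrm{sim} is cosine similarity between sentence embeddings, and p_\theta is the softmax output of a classification head over cities.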
Alias mining feeds a dictionary for mention recall, which is fused with a NER model’s predictions. Additional recall paths (segmentation and n‑gram) complement the alias‑based recall. The final disambiguation ranking uses a point‑wise binary classifier enriched with text features, NER, location, click/conversion, and similarity signals.
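The recall-then-rank flow described above can be sketched as follows. This is an illustration under stated assumptions, not the deployed system: the alias dictionary, POI IDs, feature names, and weights are all hypothetical, fusion with the NER model is omitted, and a hand-weighted logistic function stands in for the point-wise binary classifier, whose actual model type the source does not specify.

```python
import math

# Hypothetical alias dictionary: surface form -> candidate POI IDs.
ALIAS_DICT = {
    "小蛮腰": ["poi_canton_tower"],
    "广州塔": ["poi_canton_tower"],
    "长隆": ["poi_chimelong_gz", "poi_chimelong_zh"],
}
# Hypothetical POI -> city table used by the toy location feature.
POI_CITY = {
    "poi_canton_tower": "广州",
    "poi_chimelong_gz": "广州",
    "poi_chimelong_zh": "珠海",
}


def ngram_mentions(query, max_len=6):
    """Yield every character n-gram of the query up to max_len characters."""
    for start in range(len(query)):
        for end in range(start + 1, min(len(query), start + max_len) + 1):
            yield query[start:end]


def recall_candidates(query, alias_dict=ALIAS_DICT):
    """Return de-duplicated (mention, poi_id) pairs via n-gram dictionary lookup."""
    seen, out = set(), []
    for text in ngram_mentions(query):
        for poi_id in alias_dict.get(text, []):
            if (text, poi_id) not in seen:
                seen.add((text, poi_id))
                out.append((text, poi_id))
    return out


# Hand-set weights standing in for a trained point-wise classifier.
WEIGHTS = {"same_city": 2.0, "ctr": 3.0, "text_sim": 1.5}
BIAS = -2.0


def score(features):
    """Logistic score in (0, 1) for one (query, mention, POI) candidate."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def toy_features(query, mention, poi_id):
    """Toy features: city match from the query, fixed CTR and similarity."""
    same_city = 1.0 if POI_CITY[poi_id] in query else 0.0
    return {"same_city": same_city, "ctr": 0.2, "text_sim": 1.0}


def link(query, feature_fn, threshold=0.5):
    """Score each recalled candidate independently; return the best or None."""
    scored = [(score(feature_fn(query, m, p)), m, p)
              for m, p in recall_candidates(query)]
    scored = [s for s in scored if s[0] >= threshold]
    return max(scored, default=None)


best = link("珠海长隆门票", toy_features)
```

For the query "珠海长隆门票" the mention "长隆" recalls two Chimelong candidates, and the city match in the query tips the ranking toward the Zhuhai one. Scoring each candidate on its own, rather than comparing candidates pairwise, is what makes the ranker point-wise.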
Extensive experiments on three datasets (two public, one internal) show that the proposed methods significantly outperform baselines in hit rate, accuracy, and novel alias discovery while reducing erroneous alias predictions.
03 Online Effect The deployed outputs are used in multiple scenarios, such as POI cards on recommendation pages and ID-based product recall. The system improves conversion rates and markedly reduces the bad-case ratio.
04 Summary and Reflections The full query‑understanding pipeline includes normalization, segmentation, correction, tagging, and weighting, with core modules for destination, POI, generic term, brand, and intent recognition. Continuous human review mitigates model drift, and data‑driven analysis guides prioritized optimizations.
Overall, the study demonstrates how a combination of domain‑specific pretraining, multi‑path alias mining, and robust ranking can enhance POI recognition in travel search.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.