
Applying Learning to Rank for Search Suggestions at Gaode Maps

This article details how Gaode Maps rebuilt its search‑suggestion ranking pipeline with Learning to Rank, addressing challenges in sample construction, feature sparsity, and model optimization, and achieving significant gains in relevance metrics and user experience.

DataFunTalk

Gaode's vision is to connect the real world and make travel better, which requires intelligent linking of large‑scale LBS data and users through effective information retrieval, with search suggestions being a crucial component.

The article introduces the application of machine learning, specifically Learning to Rank (LTR), to Gaode's search‑suggestion service, focusing on model optimization and the practical benefits observed.

Search suggestion (the "suggest" service) auto‑completes the user's input as they type, presenting candidate queries and POIs and ranking them intelligently.
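The completion step itself is conceptually simple: retrieve all candidates sharing the typed prefix, then rank them. As a minimal illustration (the candidate vocabulary and limit are hypothetical; a production system would use an inverted/trie index over millions of POIs), prefix retrieval over a sorted list can be sketched with binary search:

```python
import bisect

def complete(prefix, vocab, limit=5):
    """Return up to `limit` entries of `vocab` starting with `prefix`.
    `vocab` must be sorted; bisect finds the first possible match."""
    i = bisect.bisect_left(vocab, prefix)
    out = []
    while i < len(vocab) and vocab[i].startswith(prefix) and len(out) < limit:
        out.append(vocab[i])
        i += 1
    return out

vocab = sorted(["wangfujing", "wudaokou", "wudaokou subway"])
```

The ranking of the retrieved candidates is where the LTR work described below comes in.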

Due to massive traffic and a growing feature set, rule‑based ranking became hard to maintain, prompting a shift to LTR; Gaode chose GBRank, a gradient‑boosted ranking model trained with a pairwise loss.
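The core idea of GBRank is to train on preference pairs rather than absolute labels: each boosting round fits a regression tree only to pairs the current model still mis‑orders, pushing the preferred item's target score up and the other's down by a margin. A minimal sketch (hyperparameters, the margin `tau`, and the tiny toy data are illustrative, not Gaode's configuration):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbrank_fit(X, pairs, n_trees=50, tau=0.1, lr=0.1):
    """Minimal GBRank sketch. `pairs` holds (i, j) preferences meaning
    item i should rank above item j. Each round fits a tree only to
    pairs the current scores violate (or satisfy by less than tau)."""
    X = np.asarray(X, dtype=float)
    scores = np.zeros(len(X))
    for k in range(1, n_trees + 1):
        Xt, yt = [], []
        for i, j in pairs:
            if scores[i] < scores[j] + tau:  # pair mis-ordered / within margin
                # push i toward scores[j]+tau and j toward scores[i]-tau
                Xt.append(X[i]); yt.append(scores[j] + tau)
                Xt.append(X[j]); yt.append(scores[i] - tau)
        if not Xt:
            break  # all preferences satisfied with margin
        tree = DecisionTreeRegressor(max_depth=3).fit(np.array(Xt), np.array(yt))
        # GBRank blends the new tree by weighted averaging rather than pure addition
        scores = (k * scores + lr * tree.predict(X)) / (k + 1)
    return scores
```

Because only violated pairs feed each tree, the ensemble concentrates capacity on orderings it has not yet learned.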

Key challenges include constructing reliable training samples and optimizing the model; click data is sparse, biased, and does not fully reflect user satisfaction, especially for long‑tail queries.

To address sample construction, Gaode aggregates multiple server logs (suggest, search, navigation), segments sessions, and propagates the final click in a session back to all preceding queries, thus capturing holistic user intent.
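The back‑propagation of the session's final click can be sketched as follows (the step/label representation here is a hypothetical simplification of Gaode's log join, which also merges search and navigation logs):

```python
def propagate_final_click(session_steps, final_click):
    """session_steps: list of (prefix, shown_candidates) in typing order.
    final_click: the POI the user ultimately chose in the session.
    Returns (prefix, candidate, label) rows, labeling every candidate
    shown at every earlier prefix positive iff it matches that final click."""
    rows = []
    for prefix, candidates in session_steps:
        for cand in candidates:
            rows.append((prefix, cand, 1 if cand == final_click else 0))
    return rows
```

This way even prefixes the user typed through without clicking contribute labeled samples that reflect the session's true intent.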

Feature engineering covers four modeling needs: comparability across recall pipelines, dynamic POI relevance, supplementing sparse click features for low‑frequency queries, and regional personalization using geohash‑based segmentation.
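Geohash is the standard technique for the regional segmentation mentioned above: it interleaves longitude and latitude bits and base32‑encodes them, so nearby points share a string prefix and each prefix length corresponds to a grid cell. A self‑contained encoder (the precision of 5 characters, roughly a 5 km cell, is illustrative; the source does not state Gaode's actual cell granularity):

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=5):
    """Standard geohash: alternately bisect the lon/lat ranges,
    emit one bit per bisection, pack 5 bits per base32 character."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    bits, even = [], True  # geohash starts with a longitude bit
    while len(bits) < precision * 5:
        rng, val = (lon_rng, lon) if even else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if val > mid:
            bits.append(1)
            rng[0] = mid
        else:
            bits.append(0)
            rng[1] = mid
        even = not even
    chars = []
    for i in range(0, len(bits), 5):
        n = 0
        for b in bits[i:i + 5]:
            n = (n << 1) | b
        chars.append(_BASE32[n])
    return "".join(chars)
```

Using the cell id (or its prefixes) as a categorical feature lets the model learn, for example, that the same query prefix resolves to different POIs in different cities.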

After feature design, standard preprocessing (scaling, smoothing, bias removal, normalization) is applied before training the GBRank model.
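For the count‑like features (clicks, searches), a common concrete form of this preprocessing is log smoothing followed by min‑max normalization; the sketch below is an illustrative stand‑in, not Gaode's exact pipeline:

```python
import numpy as np

def preprocess_clicks(counts):
    """Log-smooth heavy-tailed click counts, then min-max scale to [0, 1]
    so features from different recall pipelines are comparable."""
    x = np.log1p(np.asarray(counts, dtype=float))  # log1p handles zero counts
    lo, hi = x.min(), x.max()
    if hi <= lo:
        return np.zeros_like(x)  # constant feature carries no signal
    return (x - lo) / (hi - lo)
```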

Initial experiments showed a 5‑point MRR gain over rule‑based ranking, but feature importance was skewed toward a few features (e.g., city‑click), leaving many features underutilized.
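MRR (mean reciprocal rank) averages, over queries, the reciprocal of the rank at which the clicked item appears, so a "5‑point" gain means the clicked result moved markedly closer to the top. For reference:

```python
def mean_reciprocal_rank(ranked_lists, clicks):
    """ranked_lists: one ranked candidate list per query.
    clicks: the item the user clicked for each query.
    Queries whose click is missing from the list contribute 0."""
    total = 0.0
    for results, click in zip(ranked_lists, clicks):
        if click in results:
            total += 1.0 / (results.index(click) + 1)
    return total / len(ranked_lists)
```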

To rebalance feature learning, two strategies were considered: oversampling sparse features (which alters data distribution) and adjusting the loss function. The latter was chosen, adding a penalty term (loss_diff) to the gradient of under‑used features, encouraging their selection in subsequent trees.

Mathematically, the modified loss adjusts the negative gradient of the cross‑entropy loss, effectively shifting the sigmoid function to increase the splitting gain for targeted features.
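One plausible form consistent with this description (the margin symbol $\mu$ here is illustrative; the source gives no explicit formula): with pairwise cross‑entropy loss over score difference $s_i - s_j$,

```latex
L_{ij} = -\log \sigma(s_i - s_j), \qquad
-\frac{\partial L_{ij}}{\partial s_i} = 1 - \sigma(s_i - s_j),
```

the penalized variant shifts the sigmoid's argument for samples carrying the targeted features,

```latex
-\frac{\partial L_{ij}}{\partial s_i} = 1 - \sigma(s_i - s_j - \mu), \qquad \mu > 0,
```

which enlarges the residuals on those samples and therefore the splitting gain the tree learner sees for the under‑used features.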

After loss adjustment and retraining, MRR improved an additional 2 points, and the coverage of ranking cases rose from 40% to 70%.

In conclusion, adopting LTR eliminated the need for rule‑based patches, delivering clear performance gains, and the GBRank model now satisfies ranking requirements across query frequencies.

Future work includes extending personalization, deep learning, vector indexing, and user‑behavior sequence prediction for search suggestions.

Tags: Big Data, machine learning, feature engineering, Ranking, learning to rank, Gaode Maps, Search Suggestion
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
