
Advancements in Query Analysis for Map Search: City Analysis, Where‑What Segmentation, and Path Planning

This article details Amap’s upgraded map‑search query analysis: a two‑stage city‑identification system, enhanced where‑what segmentation with CRF and GBDT models, a three‑stage path‑planning pipeline, and an outlook on deep‑learning and knowledge‑graph enhancements for robustness and low‑frequency query handling.

Amap Tech

In the previous article we introduced the overall evolution of geographic text processing at Amap and covered several general query‑analysis techniques. This second part focuses on map‑specific text‑analysis techniques, including city analysis, where‑what segmentation, and path planning, and closes with a brief outlook on future work.

4.1 City Analysis

Map search at Amap operates on a city‑level granularity. A complete search request contains the user‑entered query, the city displayed on the map, and the user’s current city. Accurately identifying the target city is the first and crucial step. The original city‑analysis module relied on rule‑based priors and posteriors, which suffered from poor effectiveness and maintainability, and used only query‑level features, leading to weak performance on low‑frequency queries.

The redesign treats city analysis as a two‑stage “recall + selection” problem. In the recall stage, features are extracted from both query and phrase granularity and candidate cities are merged. In the selection stage, a GBDT binary classifier ranks candidate cities. Samples are drawn randomly from search logs and manually labeled, with care taken to balance local and remote city distributions. Feature groups include query‑level, phrase‑level, and engineered composite features.
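The two‑stage design can be sketched as follows. The gazetteer, feature names, and the linear scorer (standing in for the trained GBDT classifier) are illustrative assumptions, not Amap's actual implementation:

```python
# Hypothetical sketch of the "recall + selection" city-analysis pipeline.
# Toy gazetteer: phrase -> cities that phrase may refer to.
PHRASE_CITY_INDEX = {
    "west lake": ["Hangzhou"],
    "bund": ["Shanghai"],
}

def recall_candidates(query, map_city, located_city):
    """Stage 1: merge candidate cities from contextual and phrase-level evidence."""
    candidates = {map_city, located_city}           # contextual candidates
    for phrase, cities in PHRASE_CITY_INDEX.items():
        if phrase in query:                          # phrase-level recall
            candidates.update(cities)
    return candidates

def select_city(query, map_city, located_city):
    """Stage 2: score each candidate; a GBDT binary classifier replaces this
    hand-weighted linear rule in the real system."""
    best, best_score = None, float("-inf")
    for city in recall_candidates(query, map_city, located_city):
        features = {
            "is_map_city": city == map_city,
            "is_located_city": city == located_city,
            "phrase_match": any(
                phrase in query and city in cities
                for phrase, cities in PHRASE_CITY_INDEX.items()
            ),
        }
        # Illustrative weights; in production they come from the trained model.
        score = (3.0 * features["phrase_match"]
                 + 1.0 * features["is_map_city"]
                 + 0.5 * features["is_located_city"])
        if score > best_score:
            best, best_score = city, score
    return best

print(select_city("west lake boat tour", "Beijing", "Beijing"))  # Hangzhou
```

Here a remote city (Hangzhou) wins over the displayed city because phrase‑level evidence outweighs the contextual prior, which mirrors how the classifier must balance local and remote intents.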

4.2 Where‑What Analysis

Map queries often contain both a spatial constraint (where) and a target POI (what). Correctly separating these components enables precise retrieval. The task consists of a prior stage (sequence labeling to split the query into where and what) and a posterior stage (intent selection, modeled as classification or ranking).
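As a minimal illustration of the prior stage, the sketch below splits a tokenized query with a greedy dictionary lookup. The vocabulary and the greedy rule are toy assumptions; the production system uses a CRF sequence labeler for this step:

```python
# Toy where-what splitter: leading tokens found in a "where" vocabulary form
# the spatial constraint; the remainder is treated as the target (what).
WHERE_VOCAB = {"haidian", "district", "near"}

def where_what_split(tokens):
    """Return (where_tokens, what_tokens) via a greedy prefix scan."""
    i = 0
    while i < len(tokens) and tokens[i] in WHERE_VOCAB:
        i += 1
    return tokens[:i], tokens[i:]

print(where_what_split(["haidian", "district", "starbucks"]))
# (['haidian', 'district'], ['starbucks'])
```

A CRF generalizes this by scoring every possible split with learned features, rather than committing to the first dictionary miss.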

Problems identified include a simplistic CRF model that depends heavily on component‑analysis features, and rule‑based post‑processing that is hard to maintain. Improvements involve:

Developing a CRF analysis tool to visualize Viterbi paths and diagnose issues.

Enhancing features such as prefix confidence, suffix entropy, and what confidence, which quantify how strongly a phrase functions as a where or what segment.

Replacing rule‑based intent selection with a GBDT model that incorporates prior confidence and textual features.

Addressing robustness by building shallow ensemble models to handle query variations such as reversed order, misspelled districts, and unexpected token swaps.
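The suffix‑entropy feature mentioned above can be sketched as the entropy of the token distribution following a phrase in the query log: low entropy suggests the phrase is usually completed the same way (likely part of a longer name), while high entropy suggests it stands alone as a where or what segment. The toy log is made up for illustration:

```python
# Sketch of a suffix-entropy feature computed from a (toy) query log.
import math
from collections import Counter

def suffix_entropy(phrase, log_queries):
    """Entropy (bits) of the first token following `phrase` in logged queries."""
    followers = Counter()
    for q in log_queries:
        tokens = q.split()
        for i in range(len(tokens) - 1):
            if tokens[i] == phrase:
                followers[tokens[i + 1]] += 1
    total = sum(followers.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in followers.values())

log = ["haidian starbucks", "haidian cinema", "haidian hospital", "peoples square"]
print(round(suffix_entropy("haidian", log), 3))  # log2(3) ~= 1.585
```

Prefix confidence can be computed analogously over preceding tokens; both feed the CRF and the downstream GBDT as phrase‑level signals.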

These upgrades maintain overall accuracy relative to the previous system while significantly improving performance on adversarial and low‑frequency cases.

4.3 Path Planning

Parsing path‑planning queries (e.g., “from X to Y”) is a classic NLP task. Early implementations used template matching, which handled most simple cases but could not scale to complex expressions such as multi‑modal transit routes or indirect phrasing. The new pipeline adopts a three‑stage approach:

Keyword‑based pre‑filtering to quickly discard non‑path queries.

A CRF model for slot filling (origin, destination, transport mode) on the remaining candidates.

Post‑processing validation to ensure consistency.
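The three stages above can be sketched end to end. A regular expression stands in for the CRF slot filler, and the keyword list and validation checks are illustrative assumptions:

```python
# Toy three-stage path-planning pipeline: pre-filter, slot filling, validation.
import re

PATH_KEYWORDS = ("from", "to", "route")

def prefilter(query):
    """Stage 1: cheap keyword check to discard obvious non-path queries."""
    return any(k in query for k in PATH_KEYWORDS)

# Stage 2: a CRF performs this slot filling in the real system.
ROUTE_PATTERN = re.compile(r"from (?P<origin>.+?) to (?P<destination>.+)")

def fill_slots(query):
    m = ROUTE_PATTERN.search(query)
    return m.groupdict() if m else None

def plan(query):
    if not prefilter(query):
        return None
    slots = fill_slots(query)
    # Stage 3: post-processing validation, e.g. distinct non-empty endpoints.
    if slots and slots["origin"] != slots["destination"]:
        return slots
    return None

print(plan("from west lake to the bund"))
# {'origin': 'west lake', 'destination': 'the bund'}
```

The cheap pre‑filter keeps the expensive sequence model off the vast majority of queries, which is what makes the pipeline affordable at search‑traffic scale.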

Training data are generated automatically by enriching known patterns and substituting real origin/destination tokens, reducing the reliance on costly manual annotation. Features combine component‑analysis signals and POI‑keyword dictionaries. Evaluation on validation and random query sets shows clear gains in precision and recall, and the CRF‑based solution paves the way for future seq2seq multi‑task models.
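The automatic training‑data generation described above can be sketched as crossing known patterns with real origin/destination names. The templates and POI list here are illustrative assumptions:

```python
# Sketch of training-sample generation via template substitution.
import itertools

TEMPLATES = [
    "from {origin} to {destination}",
    "how do I get to {destination} from {origin}",
]
POIS = ["west lake", "the bund", "beijing south station"]

def generate_samples(templates, pois):
    """Cross every template with every ordered origin/destination pair."""
    samples = []
    for tpl in templates:
        for origin, destination in itertools.permutations(pois, 2):
            samples.append(tpl.format(origin=origin, destination=destination))
    return samples

samples = generate_samples(TEMPLATES, POIS)
print(len(samples))  # 2 templates x 6 ordered pairs = 12
```

Because the template tells us exactly which span is the origin and which is the destination, each generated query comes with free slot labels, which is what removes the need for manual annotation.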

5 Outlook

After two years of extensive machine‑learning deployment, geographic text processing has entered a “deep water” stage. Future work will focus on:

Attack side: Leveraging deep learning (e.g., seq2seq) to improve low‑frequency and long‑tail query handling, and integrating knowledge graphs to provide human‑like prior judgments.

Defense side: Enhancing system robustness against atypical expressions, query reordering, and input errors through targeted optimizations and ensembles of shallow models.

Map search, though a niche vertical, presents a full set of challenges. Continued adoption of state‑of‑the‑art techniques tailored to geographic text characteristics will drive smarter, more reliable search experiences.

Tags: machine learning, NLP, path planning, query analysis, city detection, map search, where‑what
Written by Amap Tech: Official Amap technology account showcasing all of Amap's technical innovations.
