
Query Understanding and Intent Recognition in Search: Methods, Taxonomy, and Applications

This article explains how query understanding (QP) transforms user search queries into structured semantic blocks and intent categories using rule‑based NLP, entity recognition, and post‑processing, and describes its taxonomy, implementation details, and practical impact on search engine results.

HomeTech

In the previous article we briefly introduced entity recognition technology in various scenarios; this article focuses on query understanding (also called query analysis or QP) and how it is applied in the search engine of the "Home" platform.

Query understanding analyzes a user's query to infer intent, such as answering a specific question (e.g., "Which country is BMW from?") or returning a set of results that satisfy a condition (e.g., "Cars priced between 300,000 and 500,000 yuan"). The system first parses the query into semantic blocks and then uses these blocks to drive downstream business logic.

The most intuitive output of QP is a semantic structure. For example, the query "How much does repairing a Volkswagen engine cost?" is decomposed into the semantic blocks "Volkswagen" + "engine" + "repair" + "price", where the final intent is identified as "repair price".
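To make the idea of a semantic structure concrete, here is a minimal sketch of how the parsed query above could be represented. The class names, field names, and type labels (`Brand`, `Part`, `Repair`, `Price`) are illustrative assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SemanticBlock:
    text: str   # surface form found in the query
    label: str  # hypothetical type label for the block


@dataclass
class ParsedQuery:
    raw: str                     # original query string
    blocks: List[SemanticBlock]  # semantic blocks in query order
    intent: str                  # final intent label


# The example from the text, with assumed labels for each block.
parsed = ParsedQuery(
    raw="How much does repairing a Volkswagen engine cost?",
    blocks=[
        SemanticBlock("Volkswagen", "Brand"),
        SemanticBlock("engine", "Part"),
        SemanticBlock("repair", "Repair"),
        SemanticBlock("price", "Price"),
    ],
    intent="repair price",
)

print(" + ".join(block.text for block in parsed.blocks), "->", parsed.intent)
```

Keeping the blocks in query order matters for the rule stage, since the rules described later match sequences of labels.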

Through extensive analysis of search logs, the team has defined a taxonomy of 70 major intents and over 200 minor intents, covering vehicle attributes, generic attributes, reputation, car selection, purchase, maintenance, encyclopedia, modification, sales, finance, insurance, usage, and more.

Intent recognition follows a two‑stage pipeline: business rule processing and model classification. The rule‑based component uses a set of handcrafted rules that combine entity, predicate, and basic‑word dictionaries. Each rule represents a reduction operation, similar to expert‑system inference, and may carry a priority value. In the examples below, each rule is written as a name, an optional priority (-100 for rule A), and a pattern that reduces to a single node, followed by a comment:

A:-100:CarSeries+Part=Part // car series + part → part

B:Repair+Part=Repair // repair + part → repair

C:Repair+Price=Price // repair + price → price

D:CarSeries+Repair=Repair // car series + repair → repair

The rule language supports operators such as "+" (concatenation) and "|" (alternation), and rules can be assigned different priority levels to control execution order.
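The reduction process can be sketched as a small parser plus a rewrite loop. This is a minimal illustration under several assumptions that the article does not confirm: a rule without an explicit priority defaults to 0, a lower priority value is tried first, "|" binds tighter than "+", "+" matches adjacent labels in order, and every pattern has at least two positions. The names `Rule`, `parse_rule`, and `reduce_labels` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int                        # assumption: lower value is tried first
    pattern: Tuple[FrozenSet[str], ...]  # one set of alternatives per position
    result: str


def parse_rule(line: str) -> Rule:
    """Parse 'Name[:priority]:A+B|C=Result // comment' into a Rule."""
    body = line.split("//", 1)[0].strip()  # drop the trailing comment
    head, result = body.rsplit("=", 1)
    fields = head.split(":")
    if len(fields) == 3:
        name, priority, lhs = fields[0], int(fields[1]), fields[2]
    elif len(fields) == 2:
        name, priority, lhs = fields[0], 0, fields[1]  # assumed default priority
    else:
        raise ValueError(f"malformed rule: {line!r}")
    pattern = tuple(
        frozenset(alt.strip() for alt in part.split("|"))
        for part in lhs.split("+")
    )
    if len(pattern) < 2:
        # Sketch-only restriction: guarantees every reduction shortens the sequence.
        raise ValueError(f"pattern needs at least two positions: {line!r}")
    return Rule(name.strip(), priority, pattern, result.strip())


def reduce_labels(labels: Sequence[str], rules: Sequence[Rule]) -> Tuple[List[str], List[str]]:
    """Apply rules until none matches; return remaining labels and applied rule names."""
    ordered = sorted(rules, key=lambda r: r.priority)  # stable sort keeps file order on ties
    seq, applied = list(labels), []
    changed = True
    while changed:
        changed = False
        for rule in ordered:
            n = len(rule.pattern)
            for i in range(len(seq) - n + 1):
                if all(seq[i + k] in rule.pattern[k] for k in range(n)):
                    seq[i:i + n] = [rule.result]  # replace the matched span with one node
                    applied.append(rule.name)
                    changed = True
                    break
            if changed:
                break  # restart from the highest-priority rule
    return seq, applied


RULES = [parse_rule(text) for text in (
    "A:-100:CarSeries+Part=Part // car series + part → part",
    "B:Repair+Part=Repair // repair + part → repair",
    "C:Repair+Price=Price // repair + price → price",
    "D:CarSeries+Repair=Repair // car series + repair → repair",
)]

remaining, trace = reduce_labels(["Repair", "CarSeries", "Part", "Price"], RULES)
print(remaining, trace)
```

For the label sequence `Repair, CarSeries, Part, Price`, rule A fires first, then B, then C, leaving a single node. Because every reduction replaces at least two labels with one, the loop always terminates.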

Ambiguous entities (e.g., "mini" can be a brand, manufacturer, or car series) generate multiple recognition paths, one for each possible reading. For instance, the query "X5 cost performance how" (a word‑for‑word rendering of "How is the X5's value for money?") yields the path "CarSeries+Cost_Performance+(-1)+How_Words". Multi‑path results are later reduced by the rule engine.
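One simple way to picture multi‑path generation is as a Cartesian product over the candidate labels of each token. The sketch below assumes a tiny hypothetical lexicon and a placeholder `Unknown` label for tokens that are not in it; the real dictionaries and the production path notation are not described in enough detail to reproduce here.

```python
from itertools import product
from typing import Dict, List, Sequence

# Hypothetical lexicon: each token maps to every label it may carry.
LEXICON: Dict[str, List[str]] = {
    "mini": ["Brand", "Manufacturer", "CarSeries"],
    "x5": ["CarSeries"],
    "price": ["Price"],
}
UNKNOWN = "Unknown"  # assumed placeholder for unrecognized tokens


def recognition_paths(tokens: Sequence[str]) -> List[str]:
    """Return one '+'-joined label path per combination of candidate labels."""
    candidates = [LEXICON.get(token.lower(), [UNKNOWN]) for token in tokens]
    return ["+".join(combo) for combo in product(*candidates)]


paths = recognition_paths(["mini", "price"])
print(paths)
```

An ambiguous token multiplies the number of paths, so in this sketch "mini price" produces three candidates, each of which would then be passed to rule reduction.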

After rule reduction, if a unique intent node is not obtained, a post‑processing stage selects the best path and node based on heuristics such as path length, number of unknown tokens, and priority, ultimately producing a single intent label.
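The post‑processing step can be sketched as a scoring function over the reduced paths. The article names the signals (path length, number of unknown tokens, priority) but not how they are weighted, so the ordering below (fewest unknown tokens, then shortest path, then lowest priority value) and the choice of the last known node as the intent are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

UNKNOWN = "Unknown"  # assumed placeholder for unrecognized tokens


@dataclass(frozen=True)
class ReducedPath:
    nodes: Tuple[str, ...]  # labels left after rule reduction
    priority: int = 0       # assumption: lower value is preferred


def select_intent(paths: Sequence[ReducedPath]) -> str:
    """Pick the best path by the assumed heuristics, then pick one node from it."""
    if not paths:
        raise ValueError("no candidate paths")

    def score(path: ReducedPath) -> Tuple[int, int, int]:
        unknown_count = sum(node == UNKNOWN for node in path.nodes)
        return (unknown_count, len(path.nodes), path.priority)

    best = min(paths, key=score)
    known = [node for node in best.nodes if node != UNKNOWN]
    # Assumption: the last recognized node carries the intent.
    return known[-1] if known else UNKNOWN


candidates = [
    ReducedPath(("CarSeries", UNKNOWN, "Price")),
    ReducedPath(("Repair", "Price")),
    ReducedPath(("Price", UNKNOWN)),
]
print(select_intent(candidates))
```

Here the second path wins because it contains no unknown tokens, and its last node becomes the single intent label.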

The resulting intent information feeds two major downstream components: (1) the "magic‑box" that provides pre‑processed, highly relevant answers, and (2) the ranking module that enriches feature vectors for search result ordering. The QP system is already deployed on the Home search homepage and the car‑search homepage, meeting real‑time performance requirements.

In conclusion, query understanding serves as a comprehensive layer that integrates entity recognition, rule‑based inference, and intent taxonomy to deliver smarter search experiences, and the authors invite further discussion and improvement suggestions.
