TaskBot Task-Oriented Dialogue System: Intent‑Entity Joint Recognition and Dialogue Management

The article presents TaskBot, a modular task‑oriented dialogue robot that uses a Bi‑LSTM‑CRF joint intent‑entity model and a state‑machine based dialogue manager to handle multi‑turn conversations such as flight booking or rental housing, detailing its architecture, implementation, and performance.

58 Tech

58.com’s Bangbang intelligent Q&A robot provides a complete solution including automatic Q&A bots, online human assistance, and smart chat, applicable to customer service, merchant, and sales IM scenarios. Among its offerings, TaskBot is a multi‑turn task‑oriented dialogue robot whose algorithmic practice is described in this article.

TaskBot Task Dialogue Overview

TaskBot completes a specific task (e.g., booking a flight) through multi-turn interaction, because users rarely supply everything the task needs in a single query. The system asks follow-up questions to fill the missing slots; in a flight-booking dialogue, for example, a user who gives only a destination is then asked for the remaining details, such as the departure date.

In 58.com’s business scenarios (rental, purchase, recruitment, etc.), TaskBot must first recognize user intent (e.g., “I want to rent a house”) and extract entities such as region, layout, and price, then maintain and update dialogue state to decide the next response. The two core functions are intent‑entity joint recognition and dialogue management.
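The recognize, update, and decide loop described above can be sketched as a small slot-filling frame. This is a minimal illustration and not Bangbang's implementation: the slot schema, the prompt texts, and the names `DialogueState`, `update_state`, and `next_prompt` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical slot schema for a rental intent; slot names are illustrative only.
REQUIRED_SLOTS: Dict[str, List[str]] = {
    "rent_house": ["region", "layout", "price"],
}

# Hypothetical follow-up prompts, one per slot.
PROMPTS: Dict[str, str] = {
    "region": "Which area are you looking in?",
    "layout": "How many bedrooms do you need?",
    "price": "What is your monthly budget?",
}


@dataclass
class DialogueState:
    intent: Optional[str] = None
    slots: Dict[str, str] = field(default_factory=dict)


def update_state(state: DialogueState, intent: str,
                 entities: Dict[str, str]) -> DialogueState:
    """Merge one turn's NLU output into the running dialogue state."""
    # Only task intents replace the active task; "inform" and similar
    # abstract intents just contribute entities.
    if intent in REQUIRED_SLOTS:
        state.intent = intent
    state.slots.update(entities)
    return state


def next_prompt(state: DialogueState) -> Optional[str]:
    """Return the question for the first missing slot, or None if all are filled."""
    for slot in REQUIRED_SLOTS.get(state.intent or "", []):
        if slot not in state.slots:
            return PROMPTS[slot]
    return None
```

With this sketch, a first turn that yields the intent `rent_house` and only a `region` entity leads to the layout question; once later turns supply `layout` and `price`, `next_prompt` returns `None` and the task can proceed.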

Overall Architecture

TaskBot can be implemented in a modular or end‑to‑end fashion. The modular approach, adopted by Bangbang, separates Natural Language Understanding (NLU), Dialogue Management (DM), and Natural Language Generation (NLG) into distinct components, which is more stable when data is scarce.

The NLU module identifies user intent and entities, DM selects a reply strategy, and NLG generates the textual response. Bangbang’s TaskBot adds a data layer (dialogue history and script configuration) and a logic layer (joint NLU service and dialogue management service).

NLU joint recognition runs on the WPAI AI platform and uses a bidirectional LSTM cascaded with a CRF tagger. The service first matches the query against intent rule templates; a successful match overrides the model's intent output, which makes it possible to fix online bad cases quickly without retraining.
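The rule-over-model precedence can be sketched as follows. The article does not show the template format, so the regular expressions, the `INTENT_RULES` list, and the `recognize_intent` function are hypothetical; `model_predict` stands in for the call to the deployed joint model.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical rule templates: (compiled pattern, intent label).
INTENT_RULES: List[Tuple[re.Pattern, str]] = [
    (re.compile(r"\b(bye|goodbye|see you)\b", re.IGNORECASE), "goodbye"),
    (re.compile(r"\bthank(s| you)\b", re.IGNORECASE), "thank"),
]


def match_rule(query: str) -> Optional[str]:
    """Return the intent of the first template that matches, else None."""
    for pattern, intent in INTENT_RULES:
        if pattern.search(query):
            return intent
    return None


def recognize_intent(query: str, model_predict: Callable[[str], str]) -> str:
    """A rule hit takes precedence; otherwise fall back to the model's intent."""
    rule_intent = match_rule(query)
    if rule_intent is not None:
        return rule_intent
    return model_predict(query)
```

Because a new template only requires a configuration change, a misclassified query seen in production can be corrected immediately, while the model itself is fixed in a later training round.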

The model embeds sentences, passes them through a Bi‑LSTM layer, then splits into an intent classifier and an entity tagger. Losses from both parts are summed for joint optimization. The entity tagger uses a linear‑chain CRF, modeling conditional probabilities of state sequences (IOBES tags) given observed character sequences. Viterbi decoding yields the most probable entity sequence.
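To make the decoding step concrete, here is a minimal pure-Python Viterbi sketch for a linear-chain tagger. It is not the production code: the reduced tag set, the emission scores (which would come from the Bi-LSTM), and the transition scores (which the CRF would learn) are invented for illustration, and start and end transitions are omitted for brevity.

```python
from typing import Dict, List


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[str, Dict[str, float]],
            tags: List[str]) -> List[str]:
    """Return the highest-scoring tag path under a linear-chain model.

    emissions[t][tag] is the score of `tag` at position t;
    transitions[prev][cur] is the score of moving from `prev` to `cur`.
    Scores are in log space, so they add along a path.
    """
    # best[t][tag] = best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t-1][tag] = best predecessor of `tag` at position t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy example with a reduced IOBES tag set (assumed values).
TAGS = ["O", "B-loc", "E-loc"]
NEG = -1e4  # effectively forbids a transition
TRANSITIONS = {
    "O":     {"O": 0.0, "B-loc": 0.0, "E-loc": NEG},
    "B-loc": {"O": NEG, "B-loc": NEG, "E-loc": 0.0},
    "E-loc": {"O": 0.0, "B-loc": 0.0, "E-loc": NEG},
}
EMISSIONS = [
    {"O": 2.0, "B-loc": 0.1, "E-loc": 0.0},
    {"O": 0.2, "B-loc": 2.0, "E-loc": 0.1},
    {"O": 1.0, "B-loc": 0.0, "E-loc": 0.8},
]
print(viterbi(EMISSIONS, TRANSITIONS, TAGS))  # -> ['O', 'B-loc', 'E-loc']
```

In the toy input, picking the best tag at each position independently would give `O, B-loc, O`, which leaves the entity unclosed. The transition scores rule that path out, which is the benefit of decoding the sequence as a whole rather than character by character.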

After model prediction, an entity rule layer refines the results. For example, fine-grained location entities (city, district, business circle) are merged into a generic “location” entity, which improves precision and recall by more than 5% in small-sample scenarios.
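A minimal sketch of such a post-processing rule, assuming entities arrive as (label, text) pairs; the label names and the `unify_locations` function are hypothetical, since the article does not show the rule layer's actual format.

```python
from typing import List, Tuple

# Hypothetical fine-grained labels that the rule layer folds into one generic label.
LOCATION_LABELS = {"city", "district", "business_circle"}

Entity = Tuple[str, str]  # (label, surface text)


def unify_locations(entities: List[Entity]) -> List[Entity]:
    """Relabel fine-grained location entities as a generic 'location'."""
    return [("location", text) if label in LOCATION_LABELS else (label, text)
            for label, text in entities]
```

Collapsing the labels means the tagger no longer has to decide whether a place name is a district or a business circle, a distinction that is hard to learn from few examples.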

Currently the joint model supports more than ten intent types (including abstract intents like inform, change, thank, goodbye) and dozens of domain-specific entities for real-estate and vehicle domains, reaching 90% intent precision and 85% entity precision on evaluation data.

Dialogue Management

The dialogue manager maps joint recognition results to system replies. It combines a finite-state automaton with a frame-based semantic approach, allowing stable state-machine execution while retaining flexible intent-entity-driven strategies. In the state transition graph, each node is a dialogue state and each edge is triggered by a user input (intent plus entities), moving the conversation from one node to another.

Within a single turn, the dialogue state tracker reads the user's current state and selects the start node, the strategy selector chooses a transition based on the recognized intent and entities, and the reply generator produces the textual response configured for that transition.
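One turn of this flow can be sketched as a lookup over a transition list. The script below is a hypothetical rental example, not the real configuration; the `Transition` fields, node names, reply texts, and the `step` function are assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Transition(NamedTuple):
    source: str         # node the edge leaves
    intent: str         # intent that triggers the edge
    required: Set[str]  # entity labels that must be present
    target: str         # node the edge enters
    reply: str          # configured reply text


# Hypothetical script; more specific edges are listed first.
SCRIPT: List[Transition] = [
    Transition("start", "rent_house", {"location"}, "ask_price",
               "What is your monthly budget?"),
    Transition("start", "rent_house", set(), "ask_location",
               "Which area are you looking in?"),
    Transition("ask_location", "inform", {"location"}, "ask_price",
               "What is your monthly budget?"),
    Transition("ask_price", "inform", {"price"}, "done",
               "Got it, searching listings now."),
]

FALLBACK_REPLY = "Sorry, could you rephrase that?"


def step(node: str, intent: str, entities: Dict[str, str]) -> Tuple[str, str]:
    """Run one turn: take the first matching edge and return (next node, reply)."""
    for tr in SCRIPT:
        if (tr.source == node and tr.intent == intent
                and tr.required <= set(entities)):
            return tr.target, tr.reply
    # No edge matched: stay on the current node and ask again.
    return node, FALLBACK_REPLY
```

Keeping the transitions as plain data rather than code is what lets a script be edited and versioned outside the service, in the spirit of the configuration storage the article describes.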

Configuration data (state‑transition lists, triggers, parameters, outputs) are stored in WConfig, enabling gray‑release, version rollback, and real‑time callbacks. Integration with the SunDial ABTest platform allows rapid comparison of different script versions.

Conclusion and Outlook

This article described TaskBot's intent-entity joint recognition architecture and its dialogue management implementation. TaskBot is already deployed in housing recommendation, weather query, and other scenarios. Future work includes knowledge-graph reasoning and reinforcement-learning-based dialogue strategies to further improve understanding and response quality.
