
Building and Applying a User Profile Tagging System: Practices and Insights

This article presents a comprehensive overview of constructing and deploying a user and item profiling tag system at Qunar, covering tag taxonomy, integration challenges, technical architectures, algorithmic methods such as classification, recommendation, knowledge‑graph and causal inference, as well as real‑time streaming, ID‑mapping, and practical applications in marketing, attribution and A/B testing.

DataFunSummit

The session introduces the concept of a profile tag system, explaining why Qunar needed to consolidate independent tag schemas from multiple business lines into a unified framework to support strategic decision‑making.

It outlines the five main parts of the talk: the tag taxonomy, the tagging platform, common algorithmic tags, the construction process, and application scenarios, followed by a Q&A.

Tag taxonomy classifies tags along two dimensions. By business use, there are three categories: marketing & risk control, business analysis, and user description. By construction method, there are also three: statistical tags, rule‑based tags, and model‑based tags. Each serves different business needs such as personalized recommendation, risk assessment, and multi‑dimensional KPI monitoring.

Tag platform (CDP) provides end‑to‑end services for tag generation, data analysis, business application, and effect evaluation. The platform supports both offline batch processing and real‑time streaming (e.g., Flink, Spark) and stores tags in low‑latency stores like Redis or HBase.
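A real-time tag of the kind the platform serves can be sketched in plain Python. This is only an illustration under stated assumptions: the class name, the tag names ("searches_in_window", "high_intent"), the window length, and the threshold are all hypothetical, a dict stands in for a low-latency store such as Redis or HBase, and events are assumed to arrive in timestamp order. A production job would run in a streaming engine such as Flink or Spark rather than in a single process.

```python
from collections import defaultdict, deque


class SlidingWindowTagger:
    """Counts events per user in a trailing time window and derives a tag."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.events = defaultdict(deque)  # user_id -> event timestamps
        self.tag_store = {}               # stand-in for a low-latency store

    def on_event(self, user_id, ts):
        """Process one event (assumed to arrive in timestamp order)."""
        window = self.events[user_id]
        window.append(ts)
        # Evict timestamps that have fallen out of the trailing window.
        while window and window[0] <= ts - self.window_seconds:
            window.popleft()
        self.tag_store[user_id] = {
            "searches_in_window": len(window),
            "high_intent": len(window) >= self.threshold,
        }
        return self.tag_store[user_id]


tagger = SlidingWindowTagger(window_seconds=600, threshold=3)
for ts in (0, 100, 200):
    latest = tagger.on_event("u1", ts)
```

After the three events above, the user's tag reflects three searches inside the ten-minute window; a later event outside the window would evict them and reset the count.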

Construction methods are divided into statistical (SQL‑driven), rule‑based (business‑logic defined by analysts), and model‑based (machine‑learning algorithms). Model tags may suffer from limited sample size and accuracy challenges, requiring careful validation.
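The three construction methods can be contrasted on a toy order table. This is a minimal sketch, not the platform's implementation: the order data, the "high_value" rule thresholds, and the logistic weights are all hypothetical, and the hand-set logistic function only stands in for a model that would normally be trained and validated on labeled samples.

```python
from math import exp

# Hypothetical order records; a real pipeline would read these from a warehouse.
orders = [
    {"user_id": "u1", "amount": 320.0},
    {"user_id": "u1", "amount": 450.0},
    {"user_id": "u1", "amount": 400.0},
    {"user_id": "u2", "amount": 80.0},
]


def statistical_tags(rows):
    """Statistical tag: per-user aggregates, the equivalent of a SQL GROUP BY."""
    stats = {}
    for row in rows:
        s = stats.setdefault(row["user_id"], {"order_count": 0, "total_spend": 0.0})
        s["order_count"] += 1
        s["total_spend"] += row["amount"]
    return stats


def rule_tag(stat):
    """Rule-based tag: an analyst-defined threshold (values are illustrative)."""
    return stat["order_count"] >= 3 and stat["total_spend"] >= 1000.0


def model_tag(stat, w_count=0.8, w_spend=1.5, bias=-3.0):
    """Model-based tag: a logistic score with hand-set, hypothetical weights."""
    z = w_count * stat["order_count"] + w_spend * stat["total_spend"] / 1000.0 + bias
    return 1.0 / (1.0 + exp(-z))


stats = statistical_tags(orders)
high_value = {uid: rule_tag(s) for uid, s in stats.items()}
scores = {uid: model_tag(s) for uid, s in stats.items()}
```

The statistical tag is a plain aggregate, the rule tag is a deterministic condition on top of it, and the model tag outputs a score that still needs a validated cutoff before it becomes a label.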

Update cycles can be hourly, daily, weekly, or monthly, or continuous via real‑time streaming, depending on tag type and business requirements.
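One way to picture the update cycles is a small scheduling check that decides whether a batch tag is due for recomputation. The cadence names, the 30-day approximation of a month, and the function name are assumptions made for this sketch; real-time tags are treated as maintained by the streaming job and therefore never scheduled in batch.

```python
from datetime import datetime, timedelta

# Hypothetical cadence table; "monthly" is approximated as 30 days here.
CADENCE = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
}


def is_due(cadence, last_run, now):
    """Return True when a batch tag with this cadence should be recomputed."""
    if cadence == "realtime":
        # Kept fresh by the streaming job, so never batch-scheduled.
        return False
    return now - last_run >= CADENCE[cadence]


last_run = datetime(2024, 1, 1)
now = datetime(2024, 1, 2)
due = {name: is_due(name, last_run, now) for name in list(CADENCE) + ["realtime"]}
```

One day after the last run, the hourly and daily tags are due while the weekly and monthly ones are not.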

Access patterns depend on whether tags are needed online (in real time) or offline, which in turn influences storage choices and system performance.

ID Mapping resolves multiple device or account identifiers to a single user ID, which is crucial for risk control and accurate profiling.
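ID mapping is often modeled as finding connected components over observed identifier pairs. The sketch below uses a union-find structure for that purpose; it is one common technique, not necessarily the one used in the talk, and the identifier formats ("device:A", "account:1", "phone:138") are made up for illustration.

```python
class IdMapper:
    """Union-find over identifiers: linked identifiers share one root."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        """Return the canonical identifier for x, registering x if unseen."""
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps lookup chains short.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def link(self, a, b):
        """Record that identifiers a and b were observed for the same user."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return
        # Keep the lexicographically smaller root so results are deterministic.
        if root_b < root_a:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a

    def same_user(self, a, b):
        return self.find(a) == self.find(b)


mapper = IdMapper()
mapper.link("device:A", "account:1")   # login observed on device A
mapper.link("account:1", "phone:138")  # phone bound to the account
mapper.link("device:B", "account:2")   # unrelated user
```

Because linking is transitive, a shared device can merge distinct people into one user, so a real system would likely add rules or confidence weights before accepting a link.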

Algorithmic tag families discussed include classification, recommendation, knowledge‑graph, causal inference, image‑tagging, NLP, and look‑alike algorithms, each illustrated with practical use cases such as hotel recommendation or marketing expansion.
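Of the families above, look-alike expansion is the easiest to sketch compactly. The example below ranks candidate users by cosine similarity to the centroid of a seed audience. It is a simplified stand-in: the feature vectors, user IDs, and function names are hypothetical, and production look-alike models typically use learned embeddings or classifiers rather than raw centroids.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def lookalike(seeds, candidates, top_k):
    """Return the top_k candidate IDs most similar to the seed centroid."""
    center = centroid(list(seeds.values()))
    ranked = sorted(
        candidates,
        key=lambda uid: cosine(candidates[uid], center),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical tag-derived feature vectors.
seeds = {"s1": [1.0, 0.0, 1.0], "s2": [1.0, 0.0, 0.8]}
candidates = {
    "c1": [0.9, 0.1, 1.0],
    "c2": [0.0, 1.0, 0.0],
    "c3": [1.0, 0.5, 0.2],
}
expanded = lookalike(seeds, candidates, top_k=2)
```

Here the candidate whose vector points in nearly the same direction as the seeds ranks first, and the candidate orthogonal to the seed centroid is excluded.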

Application scenarios cover marketing audience selection & expansion, business metric attribution analysis, and A/B experiment effectiveness analysis, demonstrating how tags drive data‑driven decision making.
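For the A/B experiment scenario, a standard way to compare conversion rates between two groups is a two-proportion z-test. The sketch below is a generic statistical example rather than the analysis described in the talk; the conversion counts are invented, and the same function could be applied separately within each tag-defined segment.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (B minus A)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical experiment: control converts 100/1000, treatment 150/1000.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
```

With these invented counts the lift is well outside what chance alone would plausibly produce; with identical conversion rates the statistic is zero and the p-value is one.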

The Q&A addresses common questions about the difference between user behavior and business logs, implementation of streaming tags (Flink, Spark, Python/SQL code), definition of real‑time tags, ID mapping strategies, and the lifecycle management of tags.

Overall, the talk provides a practical roadmap for building a scalable, algorithm‑rich profiling tag system that bridges data engineering, AI techniques, and business operations.

Tags: Data Engineering, A/B testing, machine learning, user profiling, Tagging System
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
