
Building and Applying a User Profile Tagging System: Practices and Insights

This article presents a comprehensive overview of constructing and deploying a user and item profiling tag system at Qunar, covering tag taxonomy, integration challenges, technical architectures, algorithmic methods such as classification, recommendation, knowledge‑graph and causal inference, as well as real‑time streaming, ID‑mapping, and practical applications in marketing, attribution and A/B testing.

DataFunSummit

Jul 5, 2024

The session introduces the concept of a profile tag system, explaining why Qunar needed to consolidate independent tag schemas from multiple business lines into a unified framework to support strategic decision‑making.

It outlines the five main parts of the talk: the tag taxonomy, the tagging platform, common algorithmic tags, the construction process, and application scenarios, followed by a Q&A.

The tag taxonomy comprises six categories along two dimensions. By business purpose: marketing & risk control, business analysis, and user description. By construction method: statistical tags, rule-based tags, and model-based tags. Together they serve business needs such as personalized recommendation, risk assessment, and multi-dimensional KPI monitoring.

Tag platform (CDP) provides end‑to‑end services for tag generation, data analysis, business application, and effect evaluation. The platform supports both offline batch processing and real‑time streaming (e.g., Flink, Spark) and stores tags in low‑latency stores like Redis or HBase.
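To make the real-time side concrete, here is a minimal pure-Python sketch of a streaming tag: a per-user event count over a sliding time window. It is only a stand-in for what a Flink or Spark job would compute, not the platform's actual code; the class name `SlidingWindowCounter`, the 60-second window, and the assumption that events arrive in timestamp order are all illustrative.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowCounter:
    """Counts each user's events inside the last `window_seconds` (hypothetical helper)."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def _evict(self, queue: Deque[float], now: float) -> None:
        # Drop timestamps that have fallen out of the window.
        while queue and queue[0] <= now - self.window_seconds:
            queue.popleft()

    def record(self, user_id: str, ts: float) -> int:
        """Register one event (assumed to arrive in time order) and return the fresh tag value."""
        queue = self._events[user_id]
        queue.append(ts)
        self._evict(queue, ts)
        return len(queue)

    def count(self, user_id: str, now: float) -> int:
        """Read the tag value at time `now` without adding an event."""
        queue = self._events[user_id]
        self._evict(queue, now)
        return len(queue)


counter = SlidingWindowCounter(window_seconds=60)
counter.record("u1", ts=0)
counter.record("u1", ts=30)
recent_searches = counter.record("u1", ts=70)  # the event at ts=0 has expired
```

In a production setup the returned value would typically be written to a low-latency store such as Redis or HBase so that online services can read it; that write is omitted here.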

Construction methods are divided into statistical (SQL‑driven), rule‑based (business‑logic defined by analysts), and model‑based (machine‑learning algorithms). Model tags may suffer from limited sample size and accuracy challenges, requiring careful validation.
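The three construction styles can be contrasted in a short sketch. Everything below is hypothetical: the `UserStats` fields, the thresholds, and the tag names are invented for illustration, and the model is passed in as an arbitrary scoring function rather than any specific algorithm from the talk.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class UserStats:
    """Hypothetical per-user aggregates; field names are illustrative only."""
    orders_90d: int
    avg_order_value: float
    days_since_last_order: int


def statistical_tag(user: UserStats) -> int:
    # Statistical tag: a direct aggregate, the kind of value a SQL GROUP BY yields.
    return user.orders_90d


def rule_tag(user: UserStats) -> str:
    # Rule-based tag: analyst-defined business logic (these cutoffs are made up).
    if user.orders_90d >= 3 and user.avg_order_value >= 800:
        return "high_value"
    if user.days_since_last_order > 60:
        return "at_risk"
    return "regular"


def model_tag(user: UserStats,
              score_fn: Callable[[UserStats], float],
              threshold: float = 0.5) -> bool:
    # Model-based tag: threshold the score of whatever trained model is supplied.
    return score_fn(user) >= threshold


user = UserStats(orders_90d=4, avg_order_value=900.0, days_since_last_order=10)
tags = {
    "orders_90d": statistical_tag(user),
    "value_segment": rule_tag(user),
    # Toy stand-in for a model: the score decays with inactivity.
    "likely_to_rebook": model_tag(user, lambda u: 1 / (1 + u.days_since_last_order / 30)),
}
```

The model-based tag is the only one whose output depends on training data, which is why limited samples and accuracy validation matter for it and not for the other two.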

Update cycles depend on tag type and business requirements: batch tags refresh hourly, daily, weekly, or monthly, while streaming tags update in true real time.
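One way to picture the update cycles is a small scheduling check that decides whether a batch tag is due for recomputation. This is a sketch under assumptions: the cycle names, the 30-day approximation of a month, and the `is_due` helper are not from the talk.

```python
from datetime import datetime, timedelta

# Hypothetical refresh intervals per batch update cycle.
CYCLE_INTERVALS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),  # approximation of a calendar month
}


def is_due(cycle: str, last_run: datetime, now: datetime) -> bool:
    """Return True when a batch tag should be recomputed."""
    if cycle == "realtime":
        # Real-time tags are maintained by the stream, never by the batch scheduler.
        return False
    return now - last_run >= CYCLE_INTERVALS[cycle]


now = datetime(2024, 7, 5, 12, 0)
two_hours_ago = now - timedelta(hours=2)
hourly_due = is_due("hourly", two_hours_ago, now)
daily_due = is_due("daily", two_hours_ago, now)
```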

Access patterns also matter: whether a tag is served online (real-time lookups) or consumed offline determines the storage choice and affects system performance.

ID Mapping resolves multiple device or account identifiers to a single user ID, which is crucial for risk control and accurate profiling.
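A common way to implement this kind of resolution is a union-find (disjoint-set) structure: whenever two identifiers are observed together, their groups are merged. The sketch below assumes that approach; the talk does not state which strategy Qunar uses, and the `IdMapper` class and the `device:` / `account:` identifier formats are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


class IdMapper:
    """Union-find over identifiers; identifiers linked together share one user ID."""

    def __init__(self) -> None:
        self._parent: Dict[str, str] = {}

    def _find(self, x: str) -> str:
        self._parent.setdefault(x, x)
        root = x
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path directly at the root.
        while self._parent[x] != root:
            next_x = self._parent[x]
            self._parent[x] = root
            x = next_x
        return root

    def link(self, a: str, b: str) -> None:
        """Record that identifiers `a` and `b` belong to the same user."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller root so results are deterministic.
            keep, drop = sorted((root_a, root_b))
            self._parent[drop] = keep

    def user_id(self, identifier: str) -> str:
        """Return the canonical ID for an identifier."""
        return self._find(identifier)

    def groups(self) -> Dict[str, Set[str]]:
        """Return every canonical ID with the identifiers mapped to it."""
        result: Dict[str, Set[str]] = defaultdict(set)
        for identifier in list(self._parent):
            result[self._find(identifier)].add(identifier)
        return dict(result)


mapper = IdMapper()
mapper.link("device:A", "account:42")  # login seen on device A
mapper.link("device:B", "account:42")  # same account on device B
mapper.link("device:C", "account:77")
```

After these links, `device:A` and `device:B` resolve to the same user, which is the property risk control relies on when one person operates several devices.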

Algorithmic tag families discussed include classification, recommendation, knowledge‑graph, causal inference, image‑tagging, NLP, and look‑alike algorithms, each illustrated with practical use cases such as hotel recommendation or marketing expansion.
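As an illustration of the look-alike family, the sketch below expands a seed audience by ranking candidates on cosine similarity to the seed centroid. This is one simple technique chosen for illustration, not necessarily the algorithm used in the talk, and the feature vectors and user IDs are invented.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lookalike(seeds: Dict[str, Sequence[float]],
              candidates: Dict[str, Sequence[float]],
              top_k: int) -> List[Tuple[str, float]]:
    """Rank non-seed candidates by similarity to the average seed vector."""
    dim = len(next(iter(seeds.values())))
    centroid = [sum(vec[i] for vec in seeds.values()) / len(seeds) for i in range(dim)]
    scored = [
        (uid, cosine(vec, centroid))
        for uid, vec in candidates.items()
        if uid not in seeds
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical 2-d features, e.g. (hotel affinity, flight affinity).
seed_users = {"s1": [1.0, 0.0], "s2": [0.9, 0.1]}
candidate_users = {"c1": [0.8, 0.2], "c2": [0.0, 1.0], "c3": [1.0, 0.05]}
expanded = lookalike(seed_users, candidate_users, top_k=2)
```

Here `c3` and `c1` rank ahead of `c2` because their vectors point in nearly the same direction as the seed users'.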

Application scenarios cover marketing audience selection & expansion, business metric attribution analysis, and A/B experiment effectiveness analysis, demonstrating how tags drive data‑driven decision making.
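Audience selection, the first of these scenarios, usually reduces to set operations over a tag-to-users index. The following is a minimal sketch under that assumption; the `select_audience` function, the tag names, and the user IDs are hypothetical.

```python
from typing import Dict, Iterable, Set


def select_audience(tag_index: Dict[str, Set[str]],
                    include_all: Iterable[str],
                    exclude_any: Iterable[str] = ()) -> Set[str]:
    """Users carrying every tag in `include_all` and none of the tags in `exclude_any`."""
    required = list(include_all)
    if not required:
        return set()
    # Copy the first set so the index itself is never mutated.
    audience = set(tag_index.get(required[0], set()))
    for tag in required[1:]:
        audience &= tag_index.get(tag, set())
    for tag in exclude_any:
        audience -= tag_index.get(tag, set())
    return audience


tag_index = {
    "high_value": {"u1", "u2", "u3"},
    "hotel_intent": {"u2", "u3", "u4"},
    "risk_flagged": {"u3"},
}
campaign_audience = select_audience(
    tag_index,
    include_all=["high_value", "hotel_intent"],
    exclude_any=["risk_flagged"],
)
```

The selected set can then serve as the seed for audience expansion or as a segment for attribution and A/B analysis.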

The Q&A addresses common questions about the difference between user behavior and business logs, implementation of streaming tags (Flink, Spark, Python/SQL code), definition of real‑time tags, ID mapping strategies, and the lifecycle management of tags.

Overall, the talk provides a practical roadmap for building a scalable, algorithm‑rich profiling tag system that bridges data engineering, AI techniques, and business operations.


