Building and Managing Huolala's User Event Tracking System: Architecture, Governance, and Monitoring
This article details Huolala's user event tracking (埋点) system, covering its background, challenges, the construction of a four‑module management platform, backend SDK design, monitoring and quality assurance mechanisms, and future plans for service integration, data lineage, and governance optimization.
Background and Challenges
Huolala's user event tracking data is a core asset for growth, product optimization, and decision‑making. Rapid business expansion exposed three main pain points: lack of demand control on the business side, missing backend reporting, and high QA regression costs.
Capability Building
To address these issues, a custom event‑tracking management platform was built with four modules: demand management, version management, event management, and attribute management. The platform standardizes event design, tracks versioned metadata, and links events to their originating requirements, enabling traceability and controlled incremental tracking.
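To make the traceability idea concrete, here is a minimal sketch of the kind of metadata the four modules might manage. All class and field names are illustrative assumptions, not Huolala's actual schema; the point is that each event record carries its originating requirement and version, so lineage is one lookup away.

```python
from dataclasses import dataclass, field

@dataclass
class Attribute:
    # Attribute management: name, declared type, and whether it is required
    name: str
    type: str            # e.g. "string", "int"
    required: bool = False

@dataclass
class Event:
    # Event management: the event plus links into demand and version management
    event_id: str
    requirement_id: str  # ties the event back to the originating demand
    version: str         # release version under which the event was introduced
    attributes: list = field(default_factory=list)

registry: dict[str, Event] = {}

def register_event(ev: Event) -> None:
    registry[ev.event_id] = ev

def trace(event_id: str) -> tuple[str, str]:
    """Return (requirement_id, version) for a registered event."""
    ev = registry[event_id]
    return ev.requirement_id, ev.version
```

With a registry like this, incremental tracking stays controlled: an event cannot be registered without naming the requirement and version that justify it.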
Backend SDK and Data Pipeline
A backend SDK reports events via internal HTTP to a collection service, which retrieves metadata from Redis for validation, tagging, and distribution. Validated events are dispatched to downstream Flink jobs and stored in Doris and Hive for analysis. An ACK and retry mechanism ensures data consistency.
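The ACK-and-retry mechanism can be sketched as follows. This is a minimal, hedged illustration, not Huolala's SDK: the `send` callable stands in for the internal HTTP POST to the collection service, and the ACK is modeled as a boolean return value. Retry counts and backoff are assumed parameters.

```python
import time

def report_with_retry(event: dict, send, max_retries: int = 3,
                      backoff: float = 0.1) -> bool:
    """Send an event and retry until the collection service ACKs it.

    `send` is a placeholder for the internal HTTP call; it returns a
    truthy ACK once the service has accepted the event.
    """
    for attempt in range(max_retries):
        try:
            if send(event):                       # ACK received
                return True
        except Exception:
            pass                                  # transient transport error
        time.sleep(backoff * (2 ** attempt))      # exponential backoff
    return False                                  # give up; caller can log/queue
```

The key design point is that the sender, not the collector, owns delivery: an event is only considered reported once an ACK arrives, which is what gives the pipeline its consistency guarantee.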
Monitoring and Assurance
Incremental and stock event monitoring provides visibility into PV/UV trends and data quality. A dedicated quality monitoring system (the "Dayu" system) visualizes core event health, while real‑time dashboards track error rates and version changes. Events are classified into four governance levels (A‑invalid, B‑general, C‑important, D‑core) to guide filtering and storage decisions.
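The four governance levels can drive a simple routing rule at ingestion time. The sketch below is an assumption about how such filtering might look; the level-to-destination mapping (dropping A-level events, sending D-level core events to hot storage such as Doris and the rest to Hive) is illustrative, only the A/B/C/D classification comes from the talk.

```python
GOVERNANCE_LEVELS = {"A": "invalid", "B": "general", "C": "important", "D": "core"}

def route(event: dict) -> str:
    """Decide an event's fate from its governance level (hypothetical policy)."""
    level = event.get("level", "B")      # default to general if unclassified
    if level == "A":
        return "drop"                    # invalid events are filtered out
    if level == "D":
        return "hot_store"               # e.g. Doris, for real-time dashboards
    return "warm_store"                  # e.g. Hive, for batch analysis
```

Classifying once and routing by level keeps storage costs proportional to event value rather than raw volume.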
Future Outlook
The roadmap focuses on three areas: integrating self‑built services with the purchased analytics platform, establishing data lineage for downstream traceability, and advancing from passive to proactive governance to standardize event definition across teams.
Q&A Highlights
The session covered evaluation of event accuracy, handling of invalid schema data via a separate Kafka topic, linking anonymous and logged‑in user actions through ID replacement, granularity of event and attribute versioning, and tools for quickly locating event implementations for analysts.
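The ID-replacement idea from the Q&A can be sketched in a few lines. This is a minimal assumption of how the linkage might work, with hypothetical names: once a device's anonymous ID is bound to a user ID at login, subsequent (and, in a backfill pass, earlier) anonymous events can be attributed to that user.

```python
id_map: dict[str, str] = {}  # anonymous_id -> user_id, populated at login

def on_login(anonymous_id: str, user_id: str) -> None:
    """Record the binding when an anonymous device logs in."""
    id_map[anonymous_id] = user_id

def resolve(event: dict) -> dict:
    """Replace the anonymous ID with the real user ID when a binding exists."""
    anon = event.get("anonymous_id")
    if anon in id_map:
        return {**event, "user_id": id_map[anon]}
    return event  # still anonymous; leave untouched
```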
Published by DataFunSummit, the official account of the DataFun community, which shares big data and AI industry summit news and speaker talks.