Building a User Behavior Data Collection and Analysis System (Hunyi) – Frontend Team Experience
This article describes how the frontend team designed and implemented a comprehensive user behavior data collection and analysis platform, covering its business value, overall architecture, SDK-based data gathering, event interception, processing pipelines, analytics dashboards, and practical insights for product and operations teams.
The article begins with an introduction by the speaker, a frontend engineer who has been responsible for building a user behavior collection and analysis system since last year, and outlines the motivation for creating such a system to provide precise metrics for operations, product, design, and development teams.
It explains the business value of data tracking: reducing support volume, identifying high‑value resource slots, defining target customers, and supporting data‑driven product improvements.
The system, named "Hunyi," has been running for 1.5 years, logging 1.6 billion events (about 10 million per workday) with a total development effort of roughly 90 person‑days.
The overall architecture consists of three steps—data collection, data processing, and data presentation—implemented through four functional modules: a collection SDK, processing and storage services, a Chrome plugin for slot‑level visualization, and a web portal for system‑wide dashboards.
Data is collected from PC, H5, and APP clients via two SDKs that automatically capture page entry/exit, scroll, and click events; custom events require minimal code injection. The SDK injects project and page IDs during the build process, enabling zero‑code automatic reporting of page lifecycle events.
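The automatic page entry/exit reporting described above can be sketched as a small tracker. This is a minimal illustration, not Hunyi's actual SDK: the names `PageContext`, `PageEvent`, and `createPageTracker` are hypothetical, and the project and page IDs are passed in as plain values standing in for whatever the build step injects.

```typescript
// Hypothetical shape of the IDs that the build process would inject.
interface PageContext {
  projectId: string;
  pageId: string;
}

interface PageEvent {
  type: "page_enter" | "page_leave";
  projectId: string;
  pageId: string;
  timestamp: number;
  stayMs?: number; // only present on page_leave
}

// Creates a tracker that emits enter/leave events without any per-page code.
// `emit` and `now` are injected so the logic can run outside a browser.
function createPageTracker(
  ctx: PageContext,
  emit: (event: PageEvent) => void,
  now: () => number = () => Date.now(),
) {
  let enteredAt: number | null = null;
  return {
    enter(): void {
      enteredAt = now();
      emit({ type: "page_enter", ...ctx, timestamp: enteredAt });
    },
    leave(): void {
      const start = enteredAt;
      if (start === null) return; // ignore a leave with no matching enter
      const timestamp = now();
      emit({ type: "page_leave", ...ctx, timestamp, stayMs: timestamp - start });
      enteredAt = null;
    },
  };
}
```

In a browser, `enter()` and `leave()` would typically be wired to page lifecycle events such as `pageshow` and `pagehide`; that wiring is left out here so the sketch stays environment-independent.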
For click events, the target DOM element (the "slot") and its container (the "block") are identified, and developers can use a provided tool to generate the necessary instrumentation code, which is then pasted into the desired location.
Event interception works by delegating four key events to the document level and keeping only those whose targets carry both a slot ID and a block ID. This filtering reduces noise and turns indiscriminate, capture‑everything tracking into a more precise scheme.
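The filtering step can be illustrated with a short sketch. It assumes the slot and block IDs live in `data-slot-id` and `data-block-id` attributes; those attribute names, and the helper `resolveTrackedTarget`, are hypothetical, since the article does not say how the IDs are attached to the DOM.

```typescript
// Minimal structural type so the logic runs outside a browser;
// a real DOM Element satisfies it.
interface ElementLike {
  getAttribute(name: string): string | null;
  parentElement: ElementLike | null;
}

// Hypothetical attribute names (an assumption, not taken from the article).
const SLOT_ATTR = "data-slot-id";
const BLOCK_ATTR = "data-block-id";

interface TrackedTarget {
  slotId: string;
  blockId: string;
}

// Walks up from the event target: finds the nearest slot, then the nearest
// block at or above it. Returns null unless both IDs are present.
function resolveTrackedTarget(start: ElementLike | null): TrackedTarget | null {
  let slotId: string | null = null;
  for (let el = start; el !== null; el = el.parentElement) {
    if (slotId === null) {
      slotId = el.getAttribute(SLOT_ATTR);
    }
    if (slotId !== null) {
      const blockId = el.getAttribute(BLOCK_ATTR);
      if (blockId !== null) return { slotId, blockId };
    }
  }
  return null;
}

// In a browser, a single delegated listener could use it like this
// (`report` is a hypothetical reporting function):
// document.addEventListener("click", (e) => {
//   const hit = resolveTrackedTarget(e.target as Element | null);
//   if (hit) report(hit);
// }, true);
```

Because the listener sits on the document, newly rendered elements are covered without rebinding, and events from elements without both IDs are simply dropped.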
To ensure browser compatibility, data is sent with sendBeacon where it is available and falls back to alternative methods where it is not. Requests are transmitted as CORS POSTs to avoid payload size limits.
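A minimal sketch of the sendBeacon-with-fallback pattern follows. The `Transports` interface and `sendEvents` function are hypothetical names introduced for illustration; the transports are injected so the selection logic can be exercised outside a browser.

```typescript
interface Transports {
  // Mirrors navigator.sendBeacon, which returns false when the browser
  // could not queue the data for delivery.
  beacon?: (url: string, body: string) => boolean;
  // Fallback path, for example a CORS POST issued with fetch or XMLHttpRequest.
  post: (url: string, body: string) => void;
}

// Tries the beacon first and falls back to POST when the beacon is missing
// or refuses the payload. Returns the transport that was actually used.
function sendEvents(url: string, events: object[], transports: Transports): "beacon" | "post" {
  const body = JSON.stringify(events);
  if (transports.beacon && transports.beacon(url, body)) {
    return "beacon";
  }
  transports.post(url, body);
  return "post";
}

// Possible browser wiring (the endpoint is an assumption):
// const transports: Transports = {
//   beacon: typeof navigator.sendBeacon === "function"
//     ? (u, b) => navigator.sendBeacon(u, b)
//     : undefined,
//   post: (u, b) => { void fetch(u, { method: "POST", mode: "cors", body: b, keepalive: true }); },
// };
```

Checking the boolean returned by the beacon matters because a browser may decline to queue a payload, in which case the POST path still delivers it.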
On the processing side, Alibaba Cloud Log Service provides real‑time consumption, indexing, and querying capabilities, allowing the team to compute basic metrics (PV, UV, clicks, exposures) and construct funnel models for conversion analysis.
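To make the metrics concrete, the sketch below computes PV, UV, and a simple ordered funnel over an in-memory event list. It is an illustration only: in the system described, these computations run as queries in Alibaba Cloud Log Service, and the `TrackedEvent` shape and the event name `page_view` are assumptions.

```typescript
interface TrackedEvent {
  userId: string;
  name: string; // hypothetical event names, e.g. "page_view", "click", "order"
  timestamp: number;
}

// PV counts every page view; UV counts distinct users with at least one page view.
function basicMetrics(events: TrackedEvent[]): { pv: number; uv: number } {
  const views = events.filter((e) => e.name === "page_view");
  return { pv: views.length, uv: new Set(views.map((e) => e.userId)).size };
}

// For each funnel step, counts the users who reached it after completing
// every earlier step in chronological order.
function funnel(events: TrackedEvent[], steps: string[]): number[] {
  const progress = new Map<string, number>(); // userId -> steps completed so far
  const sorted = [...events].sort((a, b) => a.timestamp - b.timestamp);
  for (const e of sorted) {
    const done = progress.get(e.userId) ?? 0;
    if (done < steps.length && e.name === steps[done]) {
      progress.set(e.userId, done + 1);
    }
  }
  const completed = Array.from(progress.values());
  return steps.map((_step, i) => completed.filter((done) => done > i).length);
}
```

Dividing each funnel count by the previous one gives the step-to-step conversion rate used in conversion analysis.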
The analytics layer aggregates events into dashboards that display heatmaps, user paths, and confidence‑interval‑adjusted session durations, helping stakeholders identify performance bottlenecks and user behavior patterns.
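The article does not say how the confidence interval is applied to session durations. One plausible reading, sketched below under that assumption, is to discard durations that fall outside mean ± 1.96 standard deviations (roughly a 95% range under a normal assumption) before averaging, so that a tab left open for hours does not distort the figure. The function name `trimmedMeanDuration` is hypothetical.

```typescript
function average(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Averages only the durations within mean ± z * (population standard deviation).
// z = 1.96 is an assumed default, not a value taken from the article.
function trimmedMeanDuration(durations: number[], z: number = 1.96): number {
  if (durations.length === 0) return 0;
  const mean = average(durations);
  const variance = average(durations.map((d) => (d - mean) ** 2));
  const halfWidth = z * Math.sqrt(variance);
  const kept = durations.filter((d) => Math.abs(d - mean) <= halfWidth);
  // With a small z every value could be excluded; fall back to the raw mean.
  return kept.length > 0 ? average(kept) : mean;
}
```

For example, nine sessions of about 10 seconds plus one of 1000 seconds have a raw mean above 100 seconds, while the trimmed mean stays close to 10.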
Finally, the platform offers four major analysis dimensions—event analysis, page analysis, conversion analysis, and user analysis—each with customizable metrics and visualizations, supporting both internal decision‑making and external product improvements.
The article concludes with a brief note that the system is extensible, suggests third‑party alternatives for teams that cannot build their own, and includes a recruitment call for frontend engineers to join the ZooTeam.
ZCY Technology (政采云技术)
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.