
Design and Implementation of Bilibili's Event Tracking (埋点) Analysis Platform

Bilibili's unified event-tracking platform now manages over 120,000 event definitions and ingests more than a hundred billion events daily, guiding product, operation, and marketing decisions through a full-lifecycle framework covering design, collection, testing, storage, and analysis. Key building blocks include the Spmid naming scheme, protobuf event models, ClickHouse as the query engine, and visual dashboards; future plans cover dynamic TTL, automated DWD tables, and deeper AB-testing integration.

Bilibili Tech

Bilibili's product iteration heavily relies on data-driven decisions, and event tracking (埋点) serves as a critical source for algorithmic recommendation, channel placement, and business analysis. This article shares Bilibili's experience in designing a standardized event tracking system and building an end‑to‑end analysis platform.

The internal "North Star" tracking platform now manages over 120,000 event definitions and ingests more than a hundred billion events daily, providing a unified foundation for product, operation, and marketing teams to monitor key metrics such as acquisition, activation, retention, and conversion.

Platform framework: The platform covers the full lifecycle of an event, from design and management through collection, testing, storage, query, and analysis to deprecation. The architecture diagram (see image) illustrates this seamless flow.

Two development stages:

Stage 1 – Event definitions were stored in Excel or Info documents, logs were sent to separate Hive tables, and downstream analysts used generic BI tools or ad‑hoc SQL for analysis.

Stage 2 – A unified reporting model and SDK were introduced, a dedicated visualization platform was built, and ClickHouse replaced Hive as the query engine, dramatically improving query speed.

Design and implementation: The platform abstracts the data flow into several core modules (see architecture image). Key components include:

Event design specification & management: Adoption of the Spmid (Super Position Model) naming convention (business.page.module.position.type) and structured management of event names, common attributes, type-specific attributes, and private attributes.
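As a rough illustration of the convention, a five-segment Spmid string can be validated and split mechanically. The `Spmid` class, the `parse_spmid` helper, and the example string below are hypothetical illustrations, not Bilibili's actual code:

```python
from dataclasses import dataclass


@dataclass
class Spmid:
    """Parsed Spmid (Super Position Model): business.page.module.position.type."""
    business: str
    page: str
    module: str
    position: str
    type: str


def parse_spmid(spmid: str) -> Spmid:
    """Split a dotted Spmid string into its five segments, rejecting malformed ids."""
    parts = spmid.split(".")
    if len(parts) != 5:
        raise ValueError(f"expected 5 segments, got {len(parts)}: {spmid!r}")
    return Spmid(*parts)


# Hypothetical example: a click on the related-videos module of a video page
event_position = parse_spmid("main.video-detail.related.0.click")
```

A parser like this lets the management side reject event registrations that do not follow the naming convention before they ever ship.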

Event model: An Event-User-Session model serialized via Protocol Buffers, capturing who, when, where, how, and what.
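The article does not publish the actual protobuf schema. As a sketch, the who/when/where/how/what dimensions could be modeled as below; `buvid` and `ctimes` appear in the queries later in this article, while the remaining field names and default values are assumptions:

```python
from dataclasses import dataclass, field
import time


@dataclass
class TrackingEvent:
    """Sketch of an Event-User-Session record: who, when, where, how, what."""
    buvid: str                                   # who: device id (used in the queries below)
    mid: int = 0                                 # who: member/user id, 0 if anonymous (hypothetical)
    ctimes: int = field(                         # when: client timestamp in milliseconds
        default_factory=lambda: int(time.time() * 1000))
    spmid: str = ""                              # where: business.page.module.position.type
    event_type: str = "click"                    # how: click / exposure / pageview (hypothetical values)
    event_id: str = ""                           # what: registered event id
    extended_fields: dict = field(default_factory=dict)  # what: private business attributes
```

In the real system this shape would live in a .proto definition so that client SDKs on iOS, Android, and web serialize events identically.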

Naming rules: names use underscore (snake_case) or camelCase style and should be nouns; reuse common fields where possible (e.g., topic_id, order_id) and avoid special characters.

Reporting protocol: Common parameters, type-specific parameters, and private parameters are wrapped in a JSON field for downstream parsing.
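A minimal sketch of that wrapping step, assuming a flat record whose private parameters travel as one JSON string (the `build_report` helper and the `extended_fields` key are hypothetical names, not the platform's actual protocol):

```python
import json


def build_report(common: dict, type_params: dict, private: dict) -> dict:
    """Merge common and type-specific parameters into the record, and pack the
    private (business-defined) parameters into a single JSON string field so
    the downstream parser can stay schema-agnostic."""
    record = {**common, **type_params}
    record["extended_fields"] = json.dumps(private, ensure_ascii=False)
    return record


report = build_report(
    common={"buvid": "dev-1", "ctimes": 1660000000000},
    type_params={"event_type": "click"},
    private={"topic_id": 42, "order_id": "o-1001"},
)
```

Keeping private parameters opaque at ingestion time is what lets hundreds of businesses add attributes without changing the shared pipeline schema.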

Metadata management: Supports sampling ratios, core-event flags, custom routing, and attribute-group templates, enabling dynamic sampling control and high-priority queues for critical events.
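One common way to realize per-event sampling ratios with stable per-device behavior is hash-based bucketing, so a given device either always or never reports a sampled event. This is a sketch of the general technique, not Bilibili's actual implementation:

```python
import hashlib


def should_report(event_id: str, buvid: str, sample_ratio: float, is_core: bool) -> bool:
    """Decide client-side whether to report an event.

    Core-flagged events bypass sampling entirely; other events are mapped to a
    stable bucket in [0, 1] by hashing (event_id, device id), so sampling is
    deterministic per device rather than random per event occurrence."""
    if is_core:
        return True
    digest = hashlib.md5(f"{event_id}:{buvid}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF
    return bucket < sample_ratio
```

Because the metadata (sample_ratio, is_core) is served by the platform, operators can raise or lower the ratio dynamically without a client release.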

Testing module: Mobile devices (iOS, Android, iPad) scan a QR code to connect; data is forwarded via Nginx to a Lancer gateway, buffered in Kafka, and visualized in real time. The testing UI links reported events with their metadata for quick validation.
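The metadata-linked validation could, in spirit, look like the following minimal sketch; `validate_event` and the metadata keys (`required_fields`, `event_type`) are assumptions for illustration:

```python
def validate_event(reported: dict, meta: dict) -> list:
    """Compare a reported test event against its registered metadata and
    return a list of human-readable problems (empty list = pass)."""
    problems = []
    # Every field the metadata declares as required must be present.
    for field_name in meta.get("required_fields", []):
        if field_name not in reported:
            problems.append(f"missing required field: {field_name}")
    # The reported event type must match the registered one.
    expected = meta.get("event_type")
    if expected and reported.get("event_type") != expected:
        problems.append(
            f"event_type {reported.get('event_type')!r} != expected {expected!r}")
    return problems
```

Surfacing these problems next to the live event stream is what makes QR-code testing faster than digging through raw logs.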

Analysis modules:

Event analysis – includes event, funnel, retention, path, single‑user drill‑down, user segmentation, and custom SQL.

Data dashboard – stores analysis results for reuse, with caching and nightly refresh strategies to reduce load.
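A dashboard cache of this kind can be sketched as a map keyed by chart and logical date, with access counts driving which charts the nightly job pre-warms first. The class and method names below are hypothetical, assuming the access-log-driven refresh strategy the article describes:

```python
import datetime
from collections import Counter


class DashboardCache:
    """Cache analysis results keyed by (chart id, logical date).

    Access counts are recorded so a nightly refresh job can pre-warm the
    most-viewed charts first, sparing the query engine at peak hours."""

    def __init__(self):
        self._store = {}
        self._hits = Counter()

    def get(self, chart_id: str, date: datetime.date):
        """Return a cached result, or None on a miss; record the access."""
        self._hits[chart_id] += 1
        return self._store.get((chart_id, date))

    def put(self, chart_id: str, date: datetime.date, result) -> None:
        self._store[(chart_id, date)] = result

    def refresh_order(self) -> list:
        """Chart ids sorted most-viewed first, for the nightly pre-warm job."""
        return [cid for cid, _ in self._hits.most_common()]
```

With daily-partitioned event data, keying by logical date means yesterday's results never need recomputation once written.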

Query engine migration: Early queries on Hive often exceeded 10 minutes. After switching to ClickHouse, the same analyses return in seconds rather than minutes, with more than 85% of queries finishing within 30 seconds.

Example of a ClickHouse query used for event analysis:

-- sensitive information filtered out (已过滤敏感信息)
select
  AA.logDate,
  AA.flag,
  CAST(AA.indicator AS String) AS indicator,
  AA.private_source
from (
    select
      log_date AS logDate,
      'A' AS flag,
      CAST(SUM(pv) AS Float64) AS indicator,
      extended_field_values[indexOf(extended_field_keys, 'source')] AS private_source
    from event_table_name
    where log_date BETWEEN '20211209' AND '20211215'
      AND event_id = 'event_id'
      AND app_id = xx
    group by log_date, extended_field_values[indexOf(extended_field_keys, 'source')]
    order by SUM(pv) desc
    limit 5000
    -- UNION ALL
    -- subquery B (same shape as subquery A; omitted in the source)
) AA settings max_execution_time = 150;

Funnel analysis also uses ClickHouse's windowFunnel function. Example:

-- sensitive information filtered out (已过滤敏感信息)
SELECT level,
       uniq(buvid) AS cnt
FROM (
    SELECT buvid,
           windowFunnel(86400)(ctimes,
               event_id = 'event_id123',
               event_id = 'event_id456',
               event_id = 'event_id789') AS level
    FROM (
        SELECT buvid, event_id, ctimes
        FROM event_table_name1
        WHERE log_date BETWEEN '20220915' AND '20220915'
          AND event_id = 'event_id123'
          AND arrayExists(x -> splitByChar('`', x)[indexOf(extended_field_key, 'parent_area_id')] IN ('1'), extended_fields_value_1)
        UNION ALL
        SELECT buvid, event_id, ctimes
        FROM event_table_name1
        WHERE log_date BETWEEN '20220915' AND '20220915'
          AND event_id = 'event_id456'
        UNION ALL
        SELECT buvid, event_id, ctimes
        FROM event_table_name1
        WHERE log_date BETWEEN '20220915' AND '20220915'
          AND event_id = 'event_id789'
    )
    GROUP BY buvid
    SETTINGS distributed_group_by_no_merge = 1
)
GROUP BY level;
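To make the windowFunnel semantics concrete: for each user it finds the deepest funnel step reachable where every step occurs, in order, within the window (here 86400 seconds) of the first-step event. A rough single-user Python analogue of that behavior, not ClickHouse's actual implementation:

```python
def window_funnel(window: int, events: list, steps: list) -> int:
    """Rough analogue of ClickHouse windowFunnel for one user.

    events: list of (timestamp, event_id) pairs for a single user.
    steps:  ordered list of event_ids forming the funnel.
    Returns the deepest step level reached, where each subsequent step must
    occur within `window` seconds of the matched step-1 event."""
    best = 0
    events = sorted(events)  # process in time order
    for i, (t0, eid) in enumerate(events):
        if eid != steps[0]:
            continue  # a funnel chain can only start at step 1
        level, deadline = 1, t0 + window
        j = i + 1
        while j < len(events) and level < len(steps):
            t, e = events[j]
            if t > deadline:
                break  # window from the step-1 event has closed
            if e == steps[level]:
                level += 1  # advance to the next funnel step
            j += 1
        best = max(best, level)
    return best
```

The SQL above then groups these per-user levels and counts distinct devices (`uniq(buvid)`) at each level to draw the funnel.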

Beyond event and funnel analysis, the platform provides drag‑and‑drop visual query building, data dashboards with caching and incremental refresh, and optimization of refresh strategies based on access logs to reduce server pressure.

Summary and future outlook: The platform now supports hundreds of Bilibili products, adding hundreds of new events weekly. Future work includes dynamic TTL lifecycle management for each event_id, automated generation of intermediate DWD tables, and deeper integration with AB testing and user-tag systems to further unlock data value.

Written by Bilibili Tech

Provides introductions and tutorials on Bilibili-related technologies.
