Microservice Tracing with Zipkin and StarRocks: Architecture and Practice
This article describes how Sohu Intelligent Media built a microservice tracing system using Zipkin for data collection and StarRocks for storage and analysis, covering architecture, data model, ingestion pipeline, SQL analytics, performance monitoring, and future improvements.
In a microservice environment, Sohu Intelligent Media uses Zipkin to automatically instrument services and collect tracing spans, which are sent via HTTP or Kafka.
Collected spans are stored in StarRocks, leveraging its high‑performance SQL engine for real‑time analytics, replacing traditional back‑ends such as MySQL, Cassandra or Elasticsearch.
The data model extracts fields like traceId, spanId, parentId, timestamps and tags, and adds time dimensions (dt, hr, min) for efficient aggregation.
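As a rough illustration of this flattening step, the sketch below turns one span in Zipkin's v2 JSON format into a flat row and derives the dt, hr and min columns from the span's start time. The output key names (`service`, `duration_us`, `error`), the choice of UTC, and the rule that a span counts as an error when it carries an `error` tag are assumptions made for this example, not details taken from the article.

```python
from datetime import datetime, timezone


def flatten_span(span: dict) -> dict:
    """Flatten one Zipkin v2 span into a row with derived time columns."""
    ts_us = span["timestamp"]  # Zipkin v2 reports epoch microseconds
    # Assumption: time dimensions are computed in UTC.
    start = datetime.fromtimestamp(ts_us // 1_000_000, tz=timezone.utc)
    tags = span.get("tags", {})
    return {
        "traceId": span["traceId"],
        "spanId": span["id"],
        "parentId": span.get("parentId"),  # absent on root spans
        "service": span.get("localEndpoint", {}).get("serviceName"),
        "name": span.get("name"),
        "duration_us": span.get("duration"),
        # Assumption: presence of an "error" tag marks a failed span.
        "error": "error" in tags,
        "dt": start.strftime("%Y-%m-%d"),
        "hr": start.strftime("%H"),
        "min": start.strftime("%M"),
    }


example = {
    "traceId": "5af7183fb1d4cf5f",
    "id": "352bff9a74ca9ad2",
    "parentId": "6b221d5bc9e6496c",
    "name": "get /orders",
    "timestamp": 1_600_000_000_000_000,
    "duration": 1500,
    "localEndpoint": {"serviceName": "orders"},
    "tags": {"http.method": "GET"},
}
row = flatten_span(example)
print(row["dt"], row["hr"], row["min"])
```

Keeping dt, hr and min as plain columns means later aggregations can group by them directly instead of recomputing them from the raw timestamp on every query.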
For ingestion, the target table is defined with a StarRocks CREATE TABLE statement, and a ROUTINE LOAD job continuously consumes JSON messages from Kafka, mapping JSON fields to columns and generating the derived columns during the load.
Key analytics are expressed as SQL queries over the trace data: per‑service request counts, latency percentiles, error rates, service topology, and performance bottlenecks, the last of which rely on window functions such as ROW_NUMBER().
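To make the per‑service metrics concrete, here is a small Python sketch of the same aggregation logic on in‑memory rows. It is not the article's SQL: the row keys (`service`, `duration_us`, `error`) and the nearest‑rank percentile definition are assumptions chosen for illustration, and a SQL engine may use an approximate percentile instead.

```python
import math
from collections import defaultdict


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list, for p in (0, 100]."""
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


def service_stats(rows):
    """Group spans by service and compute count, error rate and latency percentiles."""
    by_service = defaultdict(list)
    for r in rows:
        by_service[r["service"]].append(r)

    stats = {}
    for svc, spans in by_service.items():
        durations = sorted(s["duration_us"] for s in spans)
        errors = sum(1 for s in spans if s["error"])
        stats[svc] = {
            "requests": len(spans),
            "error_rate": errors / len(spans),
            "p50_us": percentile(durations, 50),
            "p99_us": percentile(durations, 99),
        }
    return stats


rows = [
    {"service": "api", "duration_us": 100, "error": False},
    {"service": "api", "duration_us": 200, "error": False},
    {"service": "api", "duration_us": 300, "error": True},
    {"service": "api", "duration_us": 400, "error": False},
]
print(service_stats(rows)["api"])
```

In SQL, the grouping loop corresponds to a GROUP BY on the service column, optionally combined with dt, hr or min to produce a time series.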
Flink can be used to resolve parent‑child relationships between spans and write the results back to StarRocks; future work aims to replace this Flink step with StarRocks UDAFs.
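The parent‑child resolution can be pictured as a self‑join of spans on (traceId, parentId). The sketch below does this in plain Python to count caller‑to‑callee service pairs, which is one way a service topology could be derived. It is a simplified stand‑in, not the article's Flink job: the row keys are assumed, and it ignores cases such as client and server spans sharing one span ID.

```python
from collections import Counter


def service_edges(rows):
    """Count (caller service, callee service) pairs by joining each span to its parent."""
    # Index every span by (traceId, spanId) so parents can be looked up directly.
    service_of = {(r["traceId"], r["spanId"]): r["service"] for r in rows}

    edges = Counter()
    for r in rows:
        if r.get("parentId") is None:
            continue  # root span: no caller
        parent_service = service_of.get((r["traceId"], r["parentId"]))
        if parent_service is None:
            continue  # parent span not seen (late or dropped)
        if parent_service == r["service"]:
            continue  # local call inside one service, not a topology edge
        edges[(parent_service, r["service"])] += 1
    return edges


rows = [
    {"traceId": "t1", "spanId": "a", "parentId": None, "service": "gateway"},
    {"traceId": "t1", "spanId": "b", "parentId": "a", "service": "orders"},
    {"traceId": "t1", "spanId": "c", "parentId": "b", "service": "orders"},
    {"traceId": "t1", "spanId": "d", "parentId": "b", "service": "inventory"},
]
for (caller, callee), calls in sorted(service_edges(rows).items()):
    print(caller, "->", callee, calls)
```

Because the lookup needs both sides of the relationship, spans whose parents arrive late are skipped here; a streaming job would typically buffer them for some time instead.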
Production results show that over 30 services generate billions of trace rows daily, providing detailed monitoring, alerting and root‑cause analysis while simplifying deployment to a single SDK and Kafka configuration.
The system improves both observability and engineering efficiency, and future enhancements include native JSON support, richer tag handling, and UI improvements.
Sohu Tech Products
A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.