
Design and Architecture of Youzan Unified Log Platform

The article details the design, components, and operational challenges of Youzan's unified log platform, describing its multi‑layer architecture, ingestion methods using rsyslog/logstash and Flume‑NG, Kafka‑based log center, processing pipelines with Storm/Spark, and storage in HDFS and Elasticsearch.

Architecture Digest

Since 2015, the author has been developing backend services at Youzan and recently took over the Track log platform. This article takes that opportunity to share the architecture of Youzan's unified log platform.

Youzan's services have grown rapidly and now generate a large volume of logs: about 11,000 entries per second on average and 15,000 at peak, totaling about 900 million entries per day (≈2.4 TB). To enable effective monitoring, maintenance, and analysis, a unified platform was built to collect, process, and store logs centrally.
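As a rough sanity check on these figures, the sketch below converts the average rate into a daily count and derives an average entry size from the stated 2.4 TB. It assumes decimal terabytes (10^12 bytes) and a constant average rate across the day, neither of which the source specifies.

```python
SECONDS_PER_DAY = 24 * 60 * 60

avg_rate = 11_000      # entries per second (average, from the article)
peak_rate = 15_000     # entries per second (peak, from the article)
daily_bytes = 2.4e12   # 2.4 TB, assumed to mean decimal terabytes

# Daily count implied by the average rate alone.
daily_entries = avg_rate * SECONDS_PER_DAY

# Average size of one entry if the daily volume is spread evenly.
avg_entry_bytes = daily_bytes / daily_entries

# Headroom the pipeline needs over the average to absorb the peak.
peak_to_avg = peak_rate / avg_rate

print(f"{daily_entries:,} entries/day")
print(f"{avg_entry_bytes:.0f} bytes/entry on average")
print(f"peak is {peak_to_avg:.2f}x the average rate")
```

Under these assumptions the average rate implies a little under one billion entries per day, at roughly 2.5 KB per entry.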

The overall architecture consists of four layers: a log ingestion layer, a log center, a processing layer, and a storage layer. Logs are collected via two ingestion methods—(1) rsyslog plus Logstash, and (2) Flume‑NG—and are streamed into a Kafka cluster that serves as the log center. Downstream systems such as Track, Storm, and Spark consume the streams for real‑time analysis, while logs are persisted to HDFS for offline processing, indexed in Elasticsearch for query, and sent to Hawk for alerting and metric monitoring.

Log Ingestion Layer: Method 1 uses rsyslog to write logs to a local directory (local0), which Logstash reads and forwards to Kafka topics. Method 2 employs Flume‑NG, a distributed, highly‑available system with a three‑tier architecture (Agent, Collector, Store). The Agent comprises Source, Channel, and Sink components. In Youzan's deployment only the Agent layer is used.
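The Source, Channel, and Sink roles inside an Agent can be illustrated with a toy in-process model. This is not Flume's API or configuration: the `Agent` class, its method names, and the capacity and batch-size defaults are all hypothetical, chosen only to show how a bounded channel decouples the component that receives events from the component that delivers them.

```python
import queue


class Agent:
    """Toy stand-in for an agent: source -> bounded channel -> sink."""

    def __init__(self, sink, capacity=1000):
        # The channel buffers events between source and sink.
        self.channel = queue.Queue(maxsize=capacity)
        # The sink is any callable that accepts a list of events.
        self.sink = sink

    def source_receive(self, event):
        """Source side: enqueue one event; return False if the channel is full."""
        try:
            self.channel.put_nowait(event)
            return True
        except queue.Full:
            return False

    def drain(self, batch_size=100):
        """Sink side: deliver up to batch_size buffered events in one batch."""
        batch = []
        while len(batch) < batch_size:
            try:
                batch.append(self.channel.get_nowait())
            except queue.Empty:
                break
        if batch:
            self.sink(batch)
        return len(batch)


delivered = []
agent = Agent(sink=delivered.extend, capacity=2)
agent.source_receive("log line 1")
agent.source_receive("log line 2")
agent.drain()
```

In a deployment like the one described, the sink would forward each batch to the Kafka log center rather than append to a list.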

Example log format (used by all sources): <158>yyyy-MM-dd HH:mm:ss host/ip level[pid]: topic=track.**** {"type":"error","tag":"redis connection refused","platform":"java/go/php","level":"info/warn/error","app":"appName","module":"com.youzan.somemodule","detail":"any things you want here"}

For PHP services a custom SDK wraps the log format and forwards logs to Flume; Java services use a Logback‑based TrackAppender; other languages (Go, Node.js, Python, etc.) can assemble logs in the same format and push them to Flume.
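For a language without an SDK, assembling a line in this format might look like the sketch below. The function names and parameters are hypothetical, and how the line is then delivered to Flume is deliberately left out because the source does not describe the transport. The leading `<158>` is treated as a syslog priority value, which by the standard syslog rule (priority = facility × 8 + severity) corresponds to facility 19 and severity 6.

```python
import json
from datetime import datetime


def syslog_pri(facility, severity):
    """Standard syslog priority value: facility * 8 + severity."""
    return facility * 8 + severity


def format_track_log(topic, payload, host, level, pid, when, pri=158):
    """Build one log line in the '<pri>timestamp host level[pid]: topic=... {json}' shape.

    All parameter names are illustrative; only the output layout follows
    the example format quoted in the article.
    """
    timestamp = when.strftime("%Y-%m-%d %H:%M:%S")
    # Compact separators keep the JSON body free of extra whitespace.
    body = json.dumps(payload, separators=(",", ":"))
    return f"<{pri}>{timestamp} {host} {level}[{pid}]: topic={topic} {body}"


line = format_track_log(
    topic="track.demo",  # hypothetical topic name
    payload={"type": "error", "tag": "redis connection refused", "app": "appName"},
    host="10.0.0.1",
    level="info",
    pid=1234,
    when=datetime(2016, 5, 1, 12, 0, 0),
)
print(line)
```

Keeping the formatter separate from the transport means the same function can be reused whichever way the line is eventually pushed to the collector.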

Log Center: The log center is a Kafka cluster, chosen for its distributed nature, high throughput, message persistence, O(1) disk access, and configurable data retention. Kafka retains only recent logs (24 h); longer‑term data is persisted to HDFS.
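The combination of persistence, time-based retention, and independent consumers can be pictured with a toy append-only log. This is a conceptual model only, not Kafka's implementation or client API: the `RetainedLog` class and its methods are hypothetical, and it assumes records arrive with non-decreasing timestamps.

```python
from bisect import bisect_left


class RetainedLog:
    """Toy append-only log with offsets and time-based retention."""

    def __init__(self, retention_seconds=24 * 3600):
        self.retention_seconds = retention_seconds
        self.base_offset = 0   # offset of the oldest record still retained
        self.timestamps = []   # parallel to records, assumed non-decreasing
        self.records = []

    def append(self, timestamp, record):
        """Append a record and return the offset assigned to it."""
        self.timestamps.append(timestamp)
        self.records.append(record)
        return self.base_offset + len(self.records) - 1

    def expire(self, now):
        """Drop records older than the retention window; return how many."""
        cutoff = now - self.retention_seconds
        drop = bisect_left(self.timestamps, cutoff)
        del self.timestamps[:drop]
        del self.records[:drop]
        self.base_offset += drop
        return drop

    def read(self, offset, max_records=100):
        """Return (records, next_offset); expired offsets skip to the oldest record."""
        start = max(offset, self.base_offset) - self.base_offset
        chunk = self.records[start:start + max_records]
        return chunk, self.base_offset + start + len(chunk)


log = RetainedLog(retention_seconds=10)
for ts, msg in [(0, "a"), (5, "b"), (12, "c")]:
    log.append(ts, msg)

before, _ = log.read(0)        # a consumer starting from the beginning
log.expire(now=14)             # records with timestamp < 4 fall out of retention
after, next_offset = log.read(0)
```

Because each reader tracks its own offset, real-time consumers and the job that copies data to long-term storage can progress at different speeds without affecting one another, as long as they stay within the retention window.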

Processing and Storage Layers: The processing layer consumes Kafka topics to perform tasks such as aggregating logs into Elasticsearch indices, detecting anomalies for alerting, computing metrics for monitoring, building call‑chains for performance analysis, and enabling user‑behavior analytics. The storage layer holds processed results in HDFS, Elasticsearch, and Hawk.
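A consumer in the processing layer first has to split each line back into its header fields and JSON body. The sketch below shows one way to do that with a regular expression derived from the example log format. The field names, the `route` function, and its rule (index everything, additionally alert on `"type": "error"`) are hypothetical illustrations, not the platform's actual logic.

```python
import json
import re

# Mirrors '<pri>yyyy-MM-dd HH:mm:ss host level[pid]: topic=... {json}'.
LINE_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<level>\w+)\[(?P<pid>\d+)\]: "
    r"topic=(?P<topic>\S+) "
    r"(?P<body>\{.*\})$"
)


def parse_track_log(line):
    """Parse one log line into a dict, or return None if it is malformed."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    try:
        payload = json.loads(match.group("body"))
    except json.JSONDecodeError:
        return None
    record = match.groupdict()
    del record["body"]
    record["pri"] = int(record["pri"])
    record["pid"] = int(record["pid"])
    record["payload"] = payload
    return record


def route(record):
    """Choose downstream destinations for a parsed record (hypothetical rule)."""
    destinations = ["elasticsearch"]         # index every record for query
    if record["payload"].get("type") == "error":
        destinations.append("hawk")          # also raise it for alerting
    return destinations


sample = (
    '<158>2016-05-01 12:00:00 10.0.0.1 info[1234]: topic=track.demo '
    '{"type":"error","tag":"redis connection refused"}'
)
record = parse_track_log(sample)
targets = route(record)
```

Returning `None` for malformed lines lets the caller count and skip bad input instead of failing the whole stream.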

The author also discusses challenges faced when taking over the platform, including the need to develop log‑consumption SDKs, lack of documentation, learning unfamiliar components (Logstash, Flume, Kafka, Elasticsearch), environment setup, high Elasticsearch memory usage, and plans for future improvements like UDP support, HDFS integration, and log mining.

In conclusion, the article aims to help others understand the architecture of a large‑scale, real‑time log platform at Youzan and the lessons learned from building it.
