Exploring and Practicing Apache Pulsar at vivo: Cluster Management, Monitoring, and Optimization
The vivo big‑data team details how they migrated massive real‑time workloads from Kafka to Apache Pulsar, describing cluster‑level bundle and ledger management, retention policies, a Prometheus‑Kafka‑Druid monitoring pipeline, load‑balancing tweaks, client tuning, rapid broker‑failure recovery, and future cloud‑native tracing and migration plans.
This article is based on a talk given by the vivo Internet Big Data team at an Apache Pulsar meetup, titled "Apache Pulsar in vivo: Exploration and Practice". It describes how vivo applies Pulsar for cluster management and monitoring in a production environment serving over 400 million smartphone users.
vivo’s distributed messaging middleware team provides high‑throughput, low‑latency data ingestion and queuing services for real‑time computing workloads such as app store, short video, and advertising. The daily data volume reaches the hundred‑trillion‑level scale.
System Architecture
The overall architecture consists of a data‑ingress layer (SDK direct connections), a messaging layer where Kafka and Pulsar coexist (Pulsar handles traffic at the trillion‑level), and a processing layer using Flink, Spark, etc. Kafka is deployed in multiple clusters (billing, search, log) with physical isolation of topics per business importance.
Why Pulsar?
Facing rapid traffic growth, Kafka showed limitations: increasing topic/partition count, difficulty in dynamic scaling, high metadata growth, and high hardware failure impact. Pulsar’s compute‑storage separation, stateless brokers, and BookKeeper storage address these issues, offering independent scaling of compute and storage.
Pulsar provides four subscription models (Exclusive, Failover, Shared, Key_Shared) and supports massive partition counts, real‑time load balancing, rapid scaling, fault tolerance, tiered storage, and cloud‑native deployment.
Cluster Management Practices
1. Bundle Management
Bundles control traffic distribution to brokers. Topics are hashed to bundles, and bundle metadata is stored in ZooKeeper. Key points:
The number of bundles affects load‑balancing granularity; more bundles give finer control.
Bundle metadata must be planned to avoid excessive ZooKeeper load.
Operations include unloading bundles (instead of whole namespaces) and splitting large bundles.
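The topic-to-bundle mapping described above can be sketched as a hash lookup over bundle boundaries. This is a hypothetical simplification for illustration only; Pulsar's actual hash function and the boundary metadata it stores in ZooKeeper differ:

```python
import bisect
import hashlib

def bundle_boundaries(num_bundles: int, hash_space: int = 2**32):
    """Evenly split the hash space into num_bundles ranges."""
    step = hash_space // num_bundles
    return [i * step for i in range(num_bundles)] + [hash_space]

def bundle_for_topic(topic: str, boundaries) -> int:
    """Hash the topic name and find the bundle range containing it."""
    h = int(hashlib.md5(topic.encode()).hexdigest(), 16) % boundaries[-1]
    # bisect_right gives the first boundary greater than h; the bundle
    # index is one position before that.
    return bisect.bisect_right(boundaries, h) - 1

boundaries = bundle_boundaries(16)
idx = bundle_for_topic("persistent://tenant/ns/topic-partition-0", boundaries)
assert 0 <= idx < 16
```

Because the mapping is a pure function of the topic name, every broker resolves the same topic to the same bundle without coordination; only bundle ownership (which broker serves which range) lives in metadata.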
loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.ThresholdShedder
loadBalancerAutoBundleSplitEnabled=false
loadBalancerAutoUnloadSplitBundlesEnabled=false
loadBalancerCPUResourceWeight=0.0
loadBalancerMemoryResourceWeight=0.0
loadBalancerDirectMemoryResourceWeight=0.0

2. Ledger Rotation
Each topic partition writes to a ledger for a limited period; the broker rolls over to a new ledger when any of the following conditions is met:
Minimum rotation time elapsed.
Maximum rotation time elapsed.
Maximum number of entries per ledger.
Maximum ledger size.
managedLedgerMinLedgerRolloverTimeMinutes=10
managedLedgerMaxLedgerRolloverTimeMinutes=240
managedLedgerMaxEntriesPerLedger=50000
managedLedgerMaxSizePerLedgerMbytes=2048

Proper tuning prevents oversized ledgers and uneven disk usage.
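The rotation rule can be expressed as a small predicate. This is an illustrative sketch whose parameter names mirror the broker settings above, not Pulsar's internal code: the minimum rollover time gates everything, and past it, any of the other limits triggers rotation.

```python
def should_rollover(age_minutes, entries, size_mb,
                    min_minutes=10, max_minutes=240,
                    max_entries=50_000, max_size_mb=2048):
    """A ledger may only roll after the minimum age; beyond that, any
    limit (maximum age, entry count, or size) triggers rotation."""
    if age_minutes < min_minutes:
        return False
    return (age_minutes >= max_minutes
            or entries >= max_entries
            or size_mb >= max_size_mb)

# Too young to roll, even though entry and size limits are exceeded:
assert not should_rollover(age_minutes=5, entries=60_000, size_mb=4096)
# Old enough, and the entry limit is reached:
assert should_rollover(age_minutes=15, entries=50_000, size_mb=10)
# Maximum age alone is sufficient:
assert should_rollover(age_minutes=240, entries=10, size_mb=1)
```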
3. Data Retention & Deletion
Four stages are described: un‑acked messages, acked messages (with TTL), retention policy, and physical deletion via BookKeeper’s GC. Configuration examples:
minorCompactionInterval=3600
minorCompactionThreshold=0.2
majorCompactionInterval=86400
majorCompactionThreshold=0.8

Best practice: align the TTL with the retention period to simplify management.
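The four stages can be illustrated with a toy eligibility check. This is a hypothetical simplification, not BookKeeper's actual GC logic: a message only becomes physically removable once it is consumed (acknowledged, or force-expired by TTL) and has also aged past the retention window.

```python
def deletable(acked: bool, age_sec: int, ttl_sec: int, retention_sec: int) -> bool:
    """Un-acked messages are kept until TTL expires them; acked (or
    expired) messages are still held for the retention period."""
    expired = age_sec > ttl_sec
    consumed = acked or expired  # TTL moves stuck un-acked data into "consumed"
    return consumed and age_sec > retention_sec

# With TTL == retention (the recommended alignment), the behaviour is simple:
assert not deletable(acked=False, age_sec=100, ttl_sec=3600, retention_sec=3600)
assert deletable(acked=True, age_sec=4000, ttl_sec=3600, retention_sec=3600)
```

Aligning TTL with retention means both timers expire together, so operators reason about a single horizon rather than two interacting ones.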
Monitoring Stack
The monitoring pipeline works as follows: Prometheus scrapes Pulsar metrics, a remote-write hop forwards formatted samples to Kafka, and Druid ingests the Kafka stream to serve as a Grafana data source. Key metric categories include client-side metrics, broker-side metrics (topic traffic, broker load), and bookie-side metrics (read/write latency). Additional custom metrics cover bundle distribution, P95/P99 broker latency, and network load per request.
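A sketch of the remote-write hop: each scraped sample is reshaped into a flat JSON record before being produced to the Kafka topic that Druid ingests. The field names here are hypothetical, chosen only to illustrate the shape of such a record:

```python
import json
import time

def to_kafka_record(metric: str, labels: dict, value: float, ts=None) -> str:
    """Flatten a Prometheus-style sample (name, labels, value, timestamp)
    into one JSON message per sample for Kafka -> Druid ingestion."""
    record = {"metric": metric, "value": value,
              "timestamp": ts or int(time.time() * 1000)}
    record.update(labels)  # e.g. cluster, broker, and topic labels
    return json.dumps(record, sort_keys=True)

msg = to_kafka_record("pulsar_throughput_in",
                      {"cluster": "pulsar-cluster-1", "topic": "tp1"},
                      1024.0, ts=1)
assert json.loads(msg)["cluster"] == "pulsar-cluster-1"
```

Flattening labels into top-level fields keeps the Druid ingestion spec simple, since each label becomes a queryable dimension.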
Load‑Balancing Optimizations
Case study: 1 topic, 30 partitions, 180 bundles caused high traffic imbalance and frequent bundle unloads. By increasing partitions to 120, traffic became more evenly distributed, unload frequency dropped from >200 per day to ~10, and client connection churn reduced.
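Why more partitions helped can be seen with a quick simulation (toy hashing, not Pulsar's): with far fewer partitions than bundles, most bundles carry no traffic at all, so whichever brokers own the few loaded bundles run hot; at 120 partitions the per-bundle spread evens out.

```python
import hashlib

def spread(num_partitions: int, num_bundles: int):
    """Count how many partitions land in each bundle under a toy hash."""
    counts = [0] * num_bundles
    for p in range(num_partitions):
        h = int(hashlib.md5(f"topic-partition-{p}".encode()).hexdigest(), 16)
        counts[h % num_bundles] += 1
    return counts

few = spread(30, 180)    # the original setup: 30 partitions, 180 bundles
many = spread(120, 180)  # after repartitioning to 120

# With 30 partitions, at least 150 of the 180 bundles hold no traffic.
assert few.count(0) >= 150
# With 120 partitions, far more bundles (hence brokers) share the load.
assert many.count(0) < few.count(0)
```

Since a bundle is the unit the load balancer moves, empty bundles give the shedder nothing useful to unload, which is why imbalance also drove the frequent unload churn.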
Client Configuration & Performance
Important client parameters:
memoryLimitBytes=209715200
maxPendingMessages=2000
maxPendingMessagesAcrossPartitions=40000
batchingMaxPublishDelayMicros=50
batchingMaxMessages=2000
batchingMaxBytes=5242880
After increasing partitions, the effective maxPendingMessages per partition dropped (40000/120 ≈ 333), causing a noticeable throughput reduction until the client was restarted. Reducing batchingMaxMessages further improved performance by up to tenfold.
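The quota arithmetic behind that drop, in code: the client divides maxPendingMessagesAcrossPartitions evenly across partitions, so the effective per-partition limit is the smaller of that share and maxPendingMessages (a sketch of the behavior described above, not the client's actual source):

```python
def effective_pending_per_partition(max_pending: int,
                                    max_pending_across: int,
                                    partitions: int) -> int:
    """Per-partition pending budget: the across-partitions quota is split
    evenly, then capped by the per-producer maxPendingMessages."""
    return min(max_pending, max_pending_across // partitions)

# Before repartitioning: 40000 across 30 partitions leaves ample headroom.
assert effective_pending_per_partition(2000, 40_000, 30) == 1333
# After moving to 120 partitions, the per-partition budget collapses.
assert effective_pending_per_partition(2000, 40_000, 120) == 333
```

The lesson: when raising the partition count, revisit maxPendingMessagesAcrossPartitions too, or the extra partitions quietly starve each producer queue.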
Handling Broker Failures
When a broker crashes, producers on the surviving brokers can stall because send threads block on the dead broker's partitions, causing an overall traffic drop. Optimizations include moving the blocking point from ProducerImpl up to PartitionedProducerImpl, maintaining separate lists of usable and unusable producers, and quickly re-routing traffic to healthy brokers. This approach restored traffic flow rapidly after a `kill -9` test on a broker.
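The routing idea can be sketched as follows. This is a hypothetical simplification of the described optimization, not Pulsar's client code: the partitioned producer tracks which per-partition producers are healthy and round-robins only over those, instead of blocking on a partition whose broker is down.

```python
class PartitionedProducerSketch:
    """Toy model: keep usable/unusable partition lists and route
    messages around partitions whose broker is unreachable."""

    def __init__(self, partitions):
        self.usable = list(partitions)
        self.unusable = []
        self._next = 0

    def mark_down(self, partition):
        """Move a partition to the unusable list when its broker fails."""
        if partition in self.usable:
            self.usable.remove(partition)
            self.unusable.append(partition)

    def mark_up(self, partition):
        """Restore a partition once its broker is reachable again."""
        if partition in self.unusable:
            self.unusable.remove(partition)
            self.usable.append(partition)

    def route(self):
        """Round-robin over healthy partitions only."""
        if not self.usable:
            raise RuntimeError("no healthy partitions")
        self._next = (self._next + 1) % len(self.usable)
        return self.usable[self._next]

p = PartitionedProducerSketch(range(4))
p.mark_down(2)
assert all(p.route() != 2 for _ in range(10))
```

Keeping the failure bookkeeping at the partitioned-producer level means one dead partition degrades only its own share of traffic rather than blocking the shared send path.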
Future Outlook
vivo plans to build end‑to‑end tracing from producers to consumers, integrate big‑data components (Flink, Spark, Druid) more tightly, migrate remaining Kafka traffic to Pulsar via KoP, and adopt container‑native deployments to fully leverage Pulsar’s cloud‑native capabilities.
vivo Internet Technology