Memory Optimization for Core Services: JVM Tuning and Large Object Management
This article details the memory optimization process for a core music service at vivo, addressing JVM memory issues that were causing frequent garbage collection, long GC pauses, and RPC timeouts during peak periods. The service provides metadata and user asset queries for songs and artists.
Initial analysis revealed young GC (YGC) running 12 times per minute (peaking at 24) with a 327 ms average pause, and full GC (FGC) roughly every 10 minutes (peaking at 1) with a 30-second average pause. During problem periods, heap usage spiked abnormally, FGC grew more frequent, and each collection released progressively less memory.
The optimization process involved four key steps:
Step 1: JVM Optimization. The default JVM configuration (Parallel Scavenge + Parallel Old) was replaced with the ParNew + CMS collector pair, a better fit for a service dominated by many short-lived objects with modest throughput requirements. Key parameters were also adjusted: the young generation was enlarged to 1.5x its original size, and CMSScavengeBeforeRemark was enabled to shorten remark pauses. This significantly reduced heap memory usage.
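As an illustration, a ParNew + CMS setup along these lines could be used; the flags are standard HotSpot options, but the heap sizes here are placeholders, not the article's production values:

```
# Illustrative flags only; sizes are placeholders
-Xms4g -Xmx4g
-Xmn1536m                      # young generation enlarged ~1.5x
-XX:+UseParNewGC               # ParNew for the young generation
-XX:+UseConcMarkSweepGC        # CMS for the old generation
-XX:+CMSScavengeBeforeRemark   # young GC before remark to shorten the remark pause
-XX:+PrintGCDetails            # keep GC logs for before/after comparison
```

Note that ParNew + CMS is deprecated in JDK 9+ and removed in JDK 14, so this configuration applies to the JDK 8-era JVMs the article targets.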
Step 2: Fault Transfer Strategy A monitoring-based fault transfer mechanism was implemented. When API services encountered exceptions calling core services, problematic machine IPs were reported to the monitoring platform. Alert rules and callback functions were configured to automatically remove problematic IPs from the provider cluster, preventing further calls to faulty machines.
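The fault-transfer flow above can be sketched as follows. This is a minimal standalone illustration, not vivo's actual platform API: the class names, the `onAlert` callback, and the threshold logic it implies are all hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of the fault-transfer idea: API services report faulty
// provider IPs to the monitoring platform, and the platform's alert callback
// removes them from the live provider set so no further calls reach them.
public class FaultTransfer {
    private final Set<String> providers = ConcurrentHashMap.newKeySet();

    public FaultTransfer(Set<String> initialProviders) {
        providers.addAll(initialProviders);
    }

    // Invoked by the monitoring platform's alert callback once a reported
    // IP exceeds the configured exception-rate threshold.
    public void onAlert(String faultyIp) {
        if (providers.remove(faultyIp)) {
            System.out.println("Removed faulty provider: " + faultyIp);
        }
    }

    public Set<String> liveProviders() {
        return providers;
    }
}
```

A thread-safe set stands in for the provider cluster here; in the real system the removal would go through the RPC registry.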
Step 3: Large Object Optimization. Analysis of thread dumps revealed many threads blocked in Dubbo's encoding path, suggesting oversized response objects. Heap dumps confirmed this, showing a 258 MB Netty taskQueue and individual 9 MB responses; the offending interface was returning far more information than callers needed. After the fix, total YGC count dropped by 76.5%, cumulative peak-period YGC time dropped by 75.5%, FGC fell to roughly once every three days, and cumulative peak-period FGC time dropped by 90.1%.
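The shape of the fix can be illustrated as below; the entity and field names are hypothetical, since the article does not show the interface's actual schema:

```java
// Hypothetical illustration of the fix: instead of serializing the full song
// entity (including large, rarely needed fields) into every RPC response, the
// interface returns a slim summary object with only the fields callers use,
// shrinking each response and the objects queued in Netty for encoding.
public class SongQuery {

    /** Full entity as stored internally (fields abbreviated). */
    public static class Song {
        public final long id;
        public final String name;
        public final String artist;
        public final String lyrics; // large field most callers never need

        public Song(long id, String name, String artist, String lyrics) {
            this.id = id;
            this.name = name;
            this.artist = artist;
            this.lyrics = lyrics;
        }
    }

    /** Slim response object exposed over RPC. */
    public static class SongSummary {
        public final long id;
        public final String name;
        public final String artist;

        public SongSummary(long id, String name, String artist) {
            this.id = id;
            this.name = name;
            this.artist = artist;
        }
    }

    public static SongSummary toSummary(Song song) {
        return new SongSummary(song.id, song.name, song.artist);
    }
}
```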
Step 4: Non-invasive Memory Object Monitoring. Inspired by Dubbo's payload checking mechanism, a custom codec was developed to monitor objects exceeding thresholds without impacting performance. The solution leveraged buffer position changes before and after encoding to detect oversized objects. For objects between the alert threshold and the payload limit, direct size judgment and key-information logging were performed. For objects exceeding the payload limit, IDs were cached during retransmission and logged during the original transmission process.
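The buffer-position trick can be demonstrated standalone; this sketch substitutes plain Java serialization for Dubbo's codec, and the class name and 4 MB threshold are illustrative, not the article's values:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Sketch of the buffer-position technique: record the buffer position before
// and after encoding, and the delta is the encoded size of the object, so
// oversized payloads can be detected without serializing anything twice.
public class PayloadMonitor {
    static final int ALERT_THRESHOLD = 4 * 1024 * 1024; // 4 MB, illustrative

    // Encodes obj into out and returns the encoded size in bytes, logging
    // a warning when the size crosses the alert threshold.
    public static int encodeAndMeasure(ByteArrayOutputStream out, Serializable obj)
            throws IOException {
        int before = out.size();              // buffer position before encoding
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(obj);
        }
        int encoded = out.size() - before;    // position delta = encoded size
        if (encoded > ALERT_THRESHOLD) {
            System.err.println("Oversized payload: " + encoded + " bytes");
        }
        return encoded;
    }
}
```

In a real Dubbo codec the same measurement would read the `ChannelBuffer` write index before and after the encode call, which is what makes the check effectively free.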
The article concludes that memory optimization is an ongoing comprehensive process involving log analysis, monitoring tools, stack information, and various optimization strategies including scheduled tasks, code refactoring, and caching.
vivo Internet Technology
Sharing practical vivo Internet technology insights and salon events, plus the latest industry news and hot conferences.