
Diagnosing Thread Blocking in a Spring Boot Service Caused by Logback Configuration Errors

This article details a step-by-step investigation of a Java Spring Boot service that suffered nightly response-time alerts. The root cause turned out to be a misconfigured Logback file path that forced cross-volume log rotation, blocking Tomcat threads and ultimately causing a production outage. A gray deployment and an environment fix resolved the issue.


The investigation began when a colleague reported frequent midnight gateway alerts for a Java service using Spring and Tomcat, but snapshots and logs showed no obvious errors.

Metrics collection revealed that all service instances exhibited similar high Tomcat thread‑pool usage and response times, pointing to a systemic issue.

Further analysis of container metrics showed normal CPU, memory, and network usage; only disk read activity stood out.

Clues were grouped into four categories: service‑specific, instance‑wide, time‑correlated, and operations involving locks or heavy disk reads. This narrowed the focus to log handling.

Examination of the Logback configuration revealed an undefined ${log.dir} variable, which Logback expands to the literal text log.dir_IS_UNDEFINED. As a result, active log files were written to /data/logs while rotated backups were attempted in /opt/log.dir_IS_UNDEFINED. Because these directories reside on different storage volumes inside the Kubernetes pod, Logback's RollingFileAppender fell back to a costly file copy during rotation, blocking threads.
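A minimal sketch of this kind of misconfiguration (file names, window sizes, and paths are illustrative, not the production values): the appender's active file uses a literal path on one volume, while the rolling pattern references a ${log.dir} property that is never defined, so rotated backups land under a different mount.

```xml
<configuration>
  <!-- ${log.dir} is never defined, so Logback expands it to the
       literal "log.dir_IS_UNDEFINED" in the pattern below. -->
  <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <!-- Active log file lands on one volume... -->
    <file>/data/logs/app.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
      <!-- ...while rotated backups are sent to another volume entirely. -->
      <fileNamePattern>/opt/${log.dir}/app.%i.log</fileNamePattern>
      <minIndex>1</minIndex>
      <maxIndex>5</maxIndex>
    </rollingPolicy>
    <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
      <maxFileSize>100MB</maxFileSize>
    </triggeringPolicy>
    <encoder>
      <pattern>%d %level %logger - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO"><appender-ref ref="FILE"/></root>
</configuration>
```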

The relevant Logback source shows a synchronized block around the triggering policy and a rename-by-copy fallback in ch.qos.logback.core.rolling.helper.RenameUtil.rename, which kicks in when the source and target files are on different volumes and is what blocks the logging threads.
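The rename-by-copy fallback can be sketched as follows. This is a simplified illustration of the pattern, not Logback's actual source; the class name RenameSketch and the paths in main are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Simplified sketch of the rename-with-copy-fallback pattern used by
// Logback's RenameUtil (not the library's actual code).
public class RenameSketch {
    // Try an atomic rename first; fall back to copy-then-delete when
    // renameTo fails, e.g. because source and target sit on different volumes.
    static void rename(File src, File target) throws IOException {
        if (src.renameTo(target)) {
            return; // cheap, atomic, same-volume case
        }
        // Cross-volume fallback: byte-by-byte copy, then delete.
        // This is the expensive path that blocks the logging thread.
        Files.copy(src.toPath(), target.toPath(),
                   StandardCopyOption.REPLACE_EXISTING);
        if (!src.delete()) {
            throw new IOException("could not delete " + src);
        }
    }

    public static void main(String[] args) throws IOException {
        // Same-volume demo: renameTo succeeds and no copy happens.
        File src = File.createTempFile("app", ".log");
        Files.writeString(src.toPath(), "log line\n");
        File target = new File(src.getParentFile(), src.getName() + ".1");
        rename(src, target);
        System.out.println(target.exists() && !src.exists());
    }
}
```

Within one volume, renameTo is a metadata-only operation; across volumes it fails and the copy rewrites every byte of the (potentially large) log file while the appender's lock is held.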

A gray deployment was performed by adding two new pods with corrected log configuration while keeping the original pods unchanged. The new pods did not experience thread blocking, confirming the root cause.

Reproduction steps included creating a minimal Spring Boot web MVC project, applying the faulty Logback settings, simulating heavy logging, and pre-filling log files to large sizes to trigger the issue.
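One of the steps above, pre-filling the log file, can be sketched like this. The helper class PrefillLog, the file path, and the 100 MB size are assumptions for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical reproduction helper: grow the active log file to the
// rotation threshold so the very next burst of logging forces a rollover.
public class PrefillLog {
    static void prefill(File log, long bytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(log, "rw")) {
            raf.setLength(bytes); // sparse on most filesystems, so this is fast
        }
    }

    public static void main(String[] args) throws IOException {
        File log = new File(System.getProperty("java.io.tmpdir"), "app.log");
        prefill(log, 100L * 1024 * 1024); // match an assumed 100 MB maxFileSize
        System.out.println(log.length()); // prints 104857600
    }
}
```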

Fixing the configuration by defining ${log.dir} to point to a valid directory on the same volume eliminated the copy operation, restored normal thread behavior, and resolved the production incident.
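A sketch of the kind of fix described, again with illustrative names and paths: defining the log.dir property so that both the active file and the rolling pattern resolve to the same directory, on the same volume.

```xml
<configuration>
  <!-- Defining log.dir keeps rotated backups on the same volume as the
       active file, so renameTo() succeeds and no copy is needed. -->
  <property name="log.dir" value="/data/logs"/>
  <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${log.dir}/app.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
      <fileNamePattern>${log.dir}/app.%i.log</fileNamePattern>
      <minIndex>1</minIndex>
      <maxIndex>5</maxIndex>
    </rollingPolicy>
    <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
      <maxFileSize>100MB</maxFileSize>
    </triggeringPolicy>
    <encoder>
      <pattern>%d %level %logger - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO"><appender-ref ref="FILE"/></root>
</configuration>
```

The same effect can be achieved by supplying the property externally, for example via a JVM flag (-Dlog.dir=/data/logs) or an environment-derived property, as long as every resolved path stays on one volume.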

Additional observations explain why automatic JVM dump snapshots were ineffective: the blocked threads never reached the dump logic because they were waiting on the log rotation lock.

In summary, the outage was caused by an incorrect Logback environment variable leading to cross‑volume log rotation, which introduced thread blocking; correcting the variable and redeploying the service fixed the problem.

Java · Kubernetes · Spring Boot · Logback · performance debugging · thread blocking
Written by

Sohu Tech Products

A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.
