Debugging Supervisor Log Rotation Issues on Linux
This article explains how to troubleshoot fragmented log files caused by the Supervisor process manager on Linux, covering configuration details, log‑rotation constraints, diagnostic commands, and the eventual resolution by restarting the supervisord service.
Supervisor is a Python‑based process management tool for Linux/Unix systems that can automatically restart processes when they unexpectedly terminate, eliminating the need for custom shell scripts.
The author encountered a situation where a service's logs were scattered across ten rotated files, making troubleshooting difficult. Example file listings were shown using ls -lh app-web-stderr* and ls -lh app-web-stdout* commands.
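To see at a glance how a service's output is spread over rotated files, the listing step can be scripted. The following is a minimal sketch, not the author's method: `group_rotated_logs` is a hypothetical helper, and it assumes the backups follow Supervisor's convention of appending `.1`, `.2`, and so on to the log file name.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: rotated backups are named <logfile>.1, <logfile>.2, ...
ROTATED_NAME = re.compile(r"^(?P<base>.+?)(?:\.(?P<index>\d+))?$")


def group_rotated_logs(log_dir):
    """Map each base log name to its files, ordered from current file to oldest backup."""
    groups = defaultdict(list)
    for path in Path(log_dir).iterdir():
        if not path.is_file():
            continue
        match = ROTATED_NAME.match(path.name)
        # The live log file has no numeric suffix; treat it as index 0.
        index = int(match["index"]) if match["index"] else 0
        groups[match["base"]].append((index, path))
    return {
        base: [path for _, path in sorted(files, key=lambda item: item[0])]
        for base, files in groups.items()
    }
```

Calling `group_rotated_logs` on the log directory and printing each group with `path.stat().st_size` gives roughly the same picture as the two `ls -lh` commands, with stdout and stderr files separated.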
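The relevant Supervisor configuration was examined. The parameters that matter for rotation are the log file path (stdout_logfile, stderr_logfile), the size limit that triggers rotation (the *_logfile_maxbytes options), and the number of rotated backups to keep (the *_logfile_backups options):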
<code>[program:app]
directory=/opt/app
command=/usr/bin/python3 -m run_web
process_name=%(program_name)s_%(process_num)02d
user=chrism
numprocs=5
autostart=true
autorestart=unexpected
startsecs=1
startretries=3
stopasgroup=false
killasgroup=false
redirect_stderr=true
stdout_logfile=/data/logs
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=10
stderr_logfile=/data/logs
stderr_logfile_maxbytes=50MB
stderr_logfile_backups=10</code>
The Supervisor documentation warns that two processes cannot share a single log file when rotation is enabled, because this corrupts the file. Since numprocs=5 starts five processes for this program, the author considered whether several processes writing to the same log file might be the cause.
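The shared-log-file hypothesis can be checked mechanically. The sketch below is an illustration under stated assumptions, not part of Supervisor: `find_shared_logfiles` is a hypothetical helper that parses the configuration with Python's `configparser` and flags programs where `numprocs` is greater than 1, rotation is enabled, and the log path does not contain `%(process_num)`. It assumes `numprocs` is a plain integer and treats a `*_maxbytes` value of `0` as rotation disabled.

```python
import configparser


def find_shared_logfiles(config_text):
    """Return (section, option, value) for log paths that several processes would share."""
    # interpolation=None keeps Supervisor's %(process_num)02d expansions as literal text.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)

    findings = []
    for section in parser.sections():
        if not section.startswith("program:"):
            continue
        program = parser[section]
        # Assumption: numprocs is a literal integer, not an expansion.
        if int(program.get("numprocs", fallback="1")) <= 1:
            continue
        for option in ("stdout_logfile", "stderr_logfile"):
            value = program.get(option)
            if not value or value.upper() in ("AUTO", "NONE"):
                continue
            # A maxbytes value of 0 turns rotation off for that log file.
            if program.get(f"{option}_maxbytes", fallback="50MB").strip() == "0":
                continue
            if "%(process_num)" not in value:
                findings.append((section, option, value))
    return findings
```

Run against the configuration above, this flags both `stdout_logfile` and `stderr_logfile`, since neither path as shown includes the process number. A path such as `/data/logs/app_%(process_num)02d.log` (an illustrative name, not from the article) would give each process its own file.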
After checking the Supervisor version (not the latest) and reviewing CHANGES.rst and related issue reports without finding a matching bug, the author used lsof to identify which processes had the log files open. The output showed that the supervisor process itself was holding the log files:
<code>$ sudo lsof app-web-stderr.log
COMMAND     PID   USER  FD   TYPE  DEVICE  SIZE/OFF  NODE      NAME
supervisor  2xx8  root  xxw  REG   x,xx    2431xx20  1416xx18  app-web-stderr.log</code>
From this evidence, the author concluded that a bug in Supervisor caused it to mismanage the log files, resulting in duplicate log entries across rotated files. Restarting the supervisord service resolved the issue, and the problem has not reappeared.
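The `lsof` check above can also be approximated from a script when `lsof` is not installed. This is a rough sketch that assumes a Linux `/proc` filesystem; `pids_holding` is a hypothetical helper, and like `lsof` it needs root privileges to inspect processes owned by other users.

```python
import os


def pids_holding(path):
    """Return {pid: command name} for processes that have `path` open (Linux /proc only)."""
    if not os.path.isdir("/proc"):
        return {}
    target = os.path.realpath(path)
    holders = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        fd_dir = f"/proc/{entry}/fd"
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            # The process exited meanwhile, or we lack permission to inspect it.
            continue
        for fd in fds:
            try:
                # Each entry in /proc/<pid>/fd is a symlink to the open file.
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    with open(f"/proc/{entry}/comm") as comm:
                        holders[int(entry)] = comm.read().strip()
                    break
            except OSError:
                continue
    return holders
```

If only the supervisord PID appears for a given log file, the result matches the `lsof` output shown above.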
Restart command used:
<code># Restart the service
$ systemctl restart supervisor.service</code>
Original article: https://www.escapelife.site/posts/ff8a0822.html