10 Logging Rules Every Backend Engineer Should Follow
This article shares ten practical rules for producing high‑quality logs in Java backend systems, covering unified formatting, stack traces, log levels, complete parameters, data masking, asynchronous logging, traceability, dynamic level adjustment, structured storage, and intelligent monitoring to help developers quickly diagnose issues and improve system reliability.
Preface
During last year's Double‑11 promotion I stared at a monitoring screen filled with red alerts, opened the server logs and saw messages like "User login failed", "Order creation error null", and "ERROR illegal argument". I realized that a programmer who writes poor logs is like a doctor who cannot write a medical record.
This article shares ten rules for writing high‑quality logs.
Rule 1: Unified Format
Bad example (management will deduct money):
<code>log.info("start process");
log.error("error happen");</code>No timestamp, no context.
Correct configuration (logback.xml core pattern):
<code><!-- logback.xml core pattern -->
<pattern>%d{yy-MM-dd HH:mm:ss.SSS} |%X{traceId:-NO_ID} |%thread |%-5level |%logger{36} |%msg%n</pattern></code>The unified pattern includes time, traceId, thread, level, logger, and message.
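To make the layout concrete, the following sketch renders one line in the same field order using only the JDK. It is an illustration rather than Logback's own implementation: the class `LogLineLayout` and its `render` method are hypothetical, and the logger name is printed in full instead of being shortened the way `%logger{36}` would.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class LogLineLayout {
    // Same date pattern as the %d{...} conversion word above.
    private static final DateTimeFormatter TIME =
            DateTimeFormatter.ofPattern("yy-MM-dd HH:mm:ss.SSS");

    // Field order: time |traceId |thread |level |logger |message.
    static String render(LocalDateTime time, String traceId, String thread,
                         String level, String logger, String message) {
        // Fall back to NO_ID when no trace id is set, like %X{traceId:-NO_ID}.
        String id = (traceId == null) ? "NO_ID" : traceId;
        // %-5s left-aligns the level in a 5-character column, like %-5level.
        return String.format("%s |%s |%s |%-5s |%s |%s",
                TIME.format(time), id, thread, level, logger, message);
    }

    public static void main(String[] args) {
        System.out.println(render(LocalDateTime.now(), null, "main",
                "INFO", "com.example.OrderService", "order created"));
    }
}
```

Because every line has the same columns in the same order, tools such as `grep` or a log shipper can split on ` |` without per-service parsing rules.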
Rule 2: Include Stack Trace
Bad example (colleagues want to hit you):
<code>try {
    processOrder();
} catch (Exception e) {
    log.error("Processing failed"); // the exception e is silently dropped
}</code>No stack trace, making debugging extremely hard.
Correct usage:
<code>log.error("Order processing failed orderId={}", orderId, e); // pass e as the last argument so the stack trace is logged</code>The log now records the orderId and the exception stack trace.
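The difference is easy to see without any logging framework. The sketch below uses only the JDK and hypothetical names (`StackTraceDemo`, `processOrder`); it compares what a message-only line contains with what is available once the exception itself is handed over. An exception created without a message produces exactly the kind of "error null" line quoted in the preface.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class StackTraceDemo {
    // Hypothetical failing operation: throws an exception that carries no message.
    static void processOrder() {
        throw new IllegalStateException();
    }

    // What ends up in the log when only e.getMessage() is concatenated.
    static String messageOnly(Exception e) {
        return "Order creation error " + e.getMessage();
    }

    // Roughly the information a logger can render when it receives the exception object.
    static String withStackTrace(Exception e) {
        StringWriter buffer = new StringWriter();
        e.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    public static void main(String[] args) {
        try {
            processOrder();
        } catch (Exception e) {
            System.out.println(messageOnly(e));    // no type, no location
            System.out.println(withStackTrace(e)); // exception type plus the failing frames
        }
    }
}
```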
Rule 3: Reasonable Levels
Bad example:
<code>log.debug("Insufficient user balance userId={}", userId); // an expected business exception should be WARN
log.error("API response slightly slow"); // a slightly slow response is not an error; INFO is enough</code>Both statements use the wrong level.
Typical level usage:
FATAL : System about to crash (OOM, disk full); available in Log4j, whereas SLF4J and Logback have no FATAL level and use ERROR for these cases
ERROR : Core business failure (payment failure, order creation error)
WARN : Recoverable exception (retry succeeded, degradation triggered)
INFO : Key process nodes (order status change)
DEBUG : Debug information (parameter flow, intermediate results)
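One way to keep level choices consistent is to decide them in a single place instead of at every call site. The sketch below is a minimal illustration under stated assumptions: `BusinessException` is a hypothetical type for expected rule violations (such as insufficient balance), and `LogLevel` is a local enum, not a type from any logging library.

```java
final class LevelPolicy {
    enum LogLevel { ERROR, WARN, INFO, DEBUG }

    // Hypothetical marker type for expected, recoverable business rule violations.
    static class BusinessException extends RuntimeException {
        BusinessException(String message) {
            super(message);
        }
    }

    // Expected business failures are WARN; anything unexpected is ERROR.
    static LogLevel levelFor(Throwable failure) {
        if (failure instanceof BusinessException) {
            return LogLevel.WARN;
        }
        return LogLevel.ERROR;
    }

    public static void main(String[] args) {
        System.out.println(levelFor(new BusinessException("insufficient balance")));
        System.out.println(levelFor(new IllegalStateException("payment gateway down")));
    }
}
```

A global exception handler could call such a policy so that a burst of user mistakes does not trip ERROR-based alerts.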
Rule 4: Complete Parameters
Bad example (the operations team will be furious):
<code>log.info("User login failed");</code>Only the bare text "User login failed" is printed.
Detective-grade log:
<code>log.warn("User login failed username={}, clientIP={}, failReason={}", username, clientIP, "password retry limit exceeded");</code>The log now records who failed to log in, from where, and why.
Rule 5: Data Masking
A painful lesson: a colleague once leaked a user's phone number in the logs.
Use a masking utility:
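<code>// Masking utility
public class LogMasker {
    // Keeps the first 3 and last 4 digits of an 11-digit mobile number
    public static String maskMobile(String mobile) {
        if (mobile == null) {
            return null; // avoid a NullPointerException on missing input
        }
        return mobile.replaceAll("(\\d{3})\\d{4}(\\d{4})", "$1****$2");
    }
}
// Usage: logs mobile=138****5678
log.info("User registration mobile={}", LogMasker.maskMobile("13812345678"));</code>
Rule 6: Asynchronous for Performance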
Problem reproduction: Synchronous logging during a flash‑sale caused many threads to block.
<code>log.info("Flash-sale request userId={}, itemId={}", userId, itemId);</code>
Analysis:
Synchronous writes cause frequent context switches.
Disk I/O becomes the bottleneck.
Logging can consume up to 25% of total response time under high load.
Correct demo (three‑step configuration):
Step 1: Async appender configuration
<code><!-- AsyncAppender core config -->
<appender name="ASYNC" class="ch.qos.logback.classic.AsyncAppender">
    <!-- 0 = never discard events, even when the queue is nearly full -->
    <discardingThreshold>0</discardingThreshold>
    <!-- Queue capacity; the Logback default is 256 -->
    <queueSize>4096</queueSize>
    <appender-ref ref="FILE"/>
</appender></code>
Step 2: Optimized logging code
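<code>// Parameterized logging skips message formatting when DEBUG is disabled,
// so no isDebugEnabled() pre-check is needed for cheap arguments
log.debug("Received MQ message: {}", msg.toSimpleString());
// Arguments are still evaluated eagerly, so avoid heavy computation in them
// Wrong: log.debug("Details: {}", computeExpensiveLog());</code>
Step 3: Key sizing formulas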
<code>Max memory ≈ queue length × avg log size
Recommended queue depth = peak TPS × tolerated delay (seconds)
Example: 10000 TPS × 0.5 s → queue size of 5000</code>
Risk mitigation strategies:
Prevent queue buildup: monitor usage and trigger alerts at 80%.
Prevent OOM: restrict large toString() calls.
Emergency escape: expose a JMX interface to switch back to synchronous mode.
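The two sizing formulas from Step 3 can be written as a small calculator. This is a sketch under stated assumptions: the class and method names are hypothetical, and the 512-byte average log size is an illustrative figure that should be replaced with a measured value.

```java
final class AsyncQueueSizing {
    // Recommended queue depth = peak TPS × tolerated delay (seconds), rounded up.
    static int recommendedQueueDepth(int peakTps, double toleratedDelaySeconds) {
        return (int) Math.ceil(peakTps * toleratedDelaySeconds);
    }

    // Max memory ≈ queue length × average log size, in bytes.
    static long estimatedMaxBytes(int queueDepth, int avgLogSizeBytes) {
        return (long) queueDepth * avgLogSizeBytes;
    }

    public static void main(String[] args) {
        int depth = recommendedQueueDepth(10_000, 0.5);
        System.out.println(depth);                         // 5000
        System.out.println(estimatedMaxBytes(depth, 512)); // 2560000 bytes, assuming 512-byte entries
    }
}
```

With these assumed numbers the queue would hold about 2.5 MB at most, which is the figure to compare against the heap headroom when choosing `queueSize`.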
Rule 7: Traceability
Chaos scenario: Cross‑service calls cannot be linked.
Inject a traceId into MDC:
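<code>// In an interceptor: inject a traceId at the start of each request
MDC.put("traceId", UUID.randomUUID().toString().substring(0, 8));
// Call MDC.remove("traceId") when the request completes so pooled threads do not leak it
// Log pattern includes traceId
<pattern>%d{HH:mm:ss} |%X{traceId}| %msg%n</pattern></code>
Rule 8: Dynamic Log Level Adjustment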
Sometimes we need to turn on DEBUG logs without restarting the service.
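<code>// Logger and Level here are the ch.qos.logback.classic types, not the SLF4J interfaces.
// Restrict access to this endpoint: it changes runtime behavior.
@GetMapping("/logLevel")
public String changeLogLevel(@RequestParam String loggerName,
                             @RequestParam String level) {
    Level target = Level.toLevel(level, null); // null when the name is not a valid level
    if (target == null) {
        return "Unknown level";
    }
    Logger logger = (Logger) LoggerFactory.getLogger(loggerName);
    logger.setLevel(target); // takes effect immediately
    return "OK";
}</code>Dynamic adjustment avoids service restarts.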
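<code>journey
title Dynamic log level adjustment
section Old approach
Find the problem -> Edit the config -> Restart the application -> Lose the failure scene
section New approach
Find the problem -> Adjust dynamically -> Takes effect immediately -> Keep the failure scene</code>
Rule 9: Structured Storage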
Unstructured log strings are hard to query.
Store logs in JSON format:
<code>{
"event": "ORDER_CREATE",
"orderId": 1001,
"amount": 8999,
"products": [{"name":"iPhone","sku":"A123"}]
}</code>This makes the data machine‑friendly and easy to search.
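To show what producing such a line involves, here is a minimal JDK-only sketch that renders a flat map as one JSON object. The class `JsonLogLine` is hypothetical and handles only strings, numbers, booleans, and null; in a real service a JSON encoder library would normally do this work, including nested values such as the `products` array.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class JsonLogLine {
    // Escapes the characters that must not appear raw inside a JSON string.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) out.append(String.format("\\u%04x", (int) c));
                    else out.append(c);
            }
        }
        return out.toString();
    }

    // Renders a flat map as one JSON object; numbers and booleans stay unquoted.
    static String toJson(Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!first) sb.append(',');
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v instanceof Number || v instanceof Boolean) sb.append(v);
            else if (v == null) sb.append("null");
            else sb.append('"').append(escape(String.valueOf(v))).append('"');
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>(); // keeps insertion order
        fields.put("event", "ORDER_CREATE");
        fields.put("orderId", 1001);
        fields.put("amount", 8999);
        System.out.println(toJson(fields));
    }
}
```

Emitting one object per line keeps each event independently parseable, which is what log shippers and search indexes generally expect.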
Rule 10: Intelligent Monitoring
Failure case: error logs piled up for three days before anyone noticed.
Introduce an ELK monitoring solution.
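In such a setup the alerting layer, not application code, evaluates the rules. Still, the logic behind a threshold rule of the form "more than N ERROR events within a time window" is simple enough to sketch. The class `ErrorRateAlert` below is hypothetical and only illustrates the sliding-window count; it is not part of any ELK component.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class ErrorRateAlert {
    private final int threshold;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    ErrorRateAlert(int threshold, long windowMillis) {
        this.threshold = threshold;
        this.windowMillis = windowMillis;
    }

    // Records one ERROR event and reports whether the number of events
    // inside the window now exceeds the threshold.
    synchronized boolean recordError(long nowMillis) {
        timestamps.addLast(nowMillis);
        // Drop events that have fallen out of the window.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= nowMillis - windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size() > threshold;
    }

    public static void main(String[] args) {
        ErrorRateAlert alert = new ErrorRateAlert(100, 5 * 60 * 1000L);
        boolean fired = false;
        for (int i = 0; i <= 100; i++) {
            fired = alert.recordError(i); // 101 errors within about 100 ms
        }
        System.out.println(fired);
    }
}
```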
Example alert rules:
<code>More than 100 ERROR logs within 5 minutes → phone alert
WARN logs persisting for 1 hour → email notification</code>
Conclusion
Three tiers of engineers:
Bronze : System.out.println("error!")
Diamond : Standardized logs + ELK monitoring
King : Log‑driven code optimization, anomaly prediction system, root‑cause analysis AI model
One final question: the next time a production incident occurs, can your logs help a newcomer locate the issue within five minutes?
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.