
Business Monitoring Practices and Log Configuration for KA Merchant Services

This article details the correlation between system and business metrics, introduces three generic business‑monitoring platforms (UMP, PFinder, Taishan), defines a unified log format, provides Log4j and Java logging code, and explains alert rule configurations, visualizations, and real‑world incident case studies to improve operational reliability.

JD Tech

Correlation of System‑Level and Business‑Level Metrics

In routine operations, an anomaly in system-level metrics usually shows up in business-level metrics as well, but the reverse does not always hold: business metrics can degrade while system metrics look normal. Monitoring system metrics alone therefore delays detection of business issues and lets their impact grow.

Generic Business Monitoring Platforms

Three internal DevOps platforms—UMP, PFinder, and Taishan—are used for business monitoring. UMP is now offline but still referenced; PFinder tracks package volume thresholds and triggers alerts; Taishan is the most widely used, covering unified log format, coding practice, data visualization, alerting, and best practices.

Unified Log Format

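Each business log message is a pipe-delimited record of 14 fields: business domain, business sub-domain, business scenario, channel source, merchant code, density, result (Y/N), result code and its description, result sub-code and its description, merchant order number, order number, and waybill number.

|Business domain|Business sub-domain|Business scenario|Channel source|Merchant code|Density|Result (Y/N)|Result code|Result code description|Result sub-code|Result sub-code description|Merchant order number|Order number|Waybill number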
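
To make the field order concrete, the following is a minimal sketch of a parser for one message in this format. The class name `BizLogParser` and the English field keys are hypothetical and chosen only for illustration; they are not part of Taishan or any JD library. The sketch assumes field values never contain a `|` character.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: splits one unified business log message into named fields. */
final class BizLogParser {
    // Keys are illustrative English names, listed in the order of the unified format.
    static final String[] FIELDS = {
        "domain", "subDomain", "scenario", "channel", "merchantCode", "density",
        "result", "resultCode", "resultDesc", "subCode", "subDesc",
        "merchantOrderNo", "orderNo", "waybillNo"
    };

    static Map<String, String> parse(String message) {
        if (message == null || !message.startsWith("|")) {
            throw new IllegalArgumentException("message must start with '|'");
        }
        // A limit of -1 keeps trailing empty fields, e.g. a missing waybill number.
        String[] parts = message.substring(1).split("\\|", -1);
        if (parts.length != FIELDS.length) {
            throw new IllegalArgumentException(
                "expected " + FIELDS.length + " fields but got " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < FIELDS.length; i++) {
            fields.put(FIELDS[i], parts[i]);
        }
        return fields;
    }
}
```

Rejecting messages with the wrong field count, rather than padding them, makes a shifted column visible at parse time instead of silently corrupting success-rate statistics.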

Log4j Configuration

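A typical Log4j 2 XML configuration defines a shared pattern property and writes business logs through a dedicated RollingRandomAccessFile appender, attached to an asynchronous logger named BusinessLogger. In the fragment below the file rolls over at 1 GB, at most five rolled files are kept, and additivity is disabled so business records do not also reach the root logger.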

<property name="patternLayout">%d{yyyy-MM-dd HH:mm:ss.SSS}-%X{PFTID}-%-5p - [%t] %c -%m%n</property>
<RollingRandomAccessFile name="businessFile" fileName="${log_path}/eclp-biz-eclp-isv-business.log" filePattern="${log_path}/eclp-biz-eclp-isv-business-%i.log">
    <PatternLayout charset="UTF-8" pattern="${patternLayout}"/>
    <Policies>
        <SizeBasedTriggeringPolicy size="1GB"/>
    </Policies>
    <DefaultRolloverStrategy max="5"/>
</RollingRandomAccessFile>
<AsyncLogger name="BusinessLogger" level="INFO" additivity="false" includeLocation="false">
    <AppenderRef ref="businessFile"/>
</AsyncLogger>

Business Logging Code

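import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Business log; the name must match the AsyncLogger "BusinessLogger" in the Log4j configuration. */
private static final Logger blogger = LoggerFactory.getLogger("BusinessLogger");

// Literal fields: 订单域 = order domain, 销售出 = sales outbound, 下单 = order placement.
// Note: 3 literals + 12 arguments yield 15 fields, one more than the 14-field unified format.
blogger.info("|订单域|销售出|下单|{}|{}|{}|{}|{}|{}|{}|{}|{}|{}|{}|{}",
    order.getSourceChannel(), order.getShopNo(), order.getDepartmentNo(), 1,
    result, code, message, subCode, subMessage,
    order.getIsvUUID(), context.getPin(), soNo);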
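
Because the record is positional, a free-text value that contains `|` or a line break would shift every later column. One way to guard against that is to build the line through a small helper, sketched below. `BizLogFormatter` is a hypothetical name, and replacing `|` with `/` is an assumption made for illustration; the original article does not say how such characters are handled.

```java
/** Hypothetical helper: builds a pipe-delimited business log line from field values. */
final class BizLogFormatter {

    /** Prefixes every value with '|'; null values become empty fields. */
    static String format(Object... values) {
        StringBuilder line = new StringBuilder();
        for (Object value : values) {
            line.append('|').append(sanitize(value));
        }
        return line.toString();
    }

    /** Replaces characters that would break the one-record-per-line, pipe-delimited layout. */
    static String sanitize(Object value) {
        if (value == null) {
            return "";
        }
        return String.valueOf(value)
            .replace('|', '/')   // assumption: '/' is an acceptable stand-in for '|'
            .replace('\n', ' ')
            .replace('\r', ' ');
    }
}
```

With this helper the call site could become `blogger.info("{}", BizLogFormatter.format(...))`, which passes the finished line as an argument so that any `{}` inside a value is not treated as a placeholder.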

Alert Rule Configurations

Example rules include success-rate alerts (triggered when the success rate stays below 50% for several consecutive intervals), alerts on sudden volume spikes or drops, and custom thresholds based on comparison with historical data.
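
The logic behind the first two rule types can be sketched in a few lines. This is only an illustration of the rules as described, not how Taishan evaluates them: the class `AlertRules`, its method names, and the choice to treat an interval with no traffic as healthy are all assumptions.

```java
import java.util.List;

/** Hypothetical sketch of two alert rules evaluated over per-interval counters. */
final class AlertRules {

    /** One monitoring interval: total calls and calls whose result was Y. */
    static final class Interval {
        final long total;
        final long success;

        Interval(long total, long success) {
            this.total = total;
            this.success = success;
        }

        double successRate() {
            // Assumption: an interval with no traffic counts as healthy;
            // missing traffic is left to the volume rule below.
            return total == 0 ? 1.0 : (double) success / total;
        }
    }

    /** True when each of the last n intervals has a success rate below the threshold. */
    static boolean successRateAlert(List<Interval> history, int n, double threshold) {
        if (n <= 0 || history.size() < n) {
            return false; // not enough data to judge
        }
        for (Interval interval : history.subList(history.size() - n, history.size())) {
            if (interval.successRate() >= threshold) {
                return false;
            }
        }
        return true;
    }

    /** True when volume differs from the historical baseline by more than maxChange (0.5 = 50%). */
    static boolean volumeChangeAlert(long current, long baseline, double maxChange) {
        if (baseline <= 0) {
            return false; // no usable baseline to compare against
        }
        double change = Math.abs(current - baseline) / (double) baseline;
        return change > maxChange;
    }
}
```

Requiring several consecutive low intervals trades a little detection latency for fewer false alarms from a single noisy interval, while the volume rule catches both spikes and drops because it compares the absolute relative change.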

Case Studies

Various merchant incidents—such as low success rates due to warehouse switches, duplicate submissions, inventory shortages, department changes, product‑level adjustments, external API timeouts, OAID verification failures, and upstream traffic anomalies—are analyzed to illustrate how the monitoring system detects and helps resolve issues.

Best Practices and Conclusion

Continuous optimization of alert thresholds, real‑time dashboards, and close collaboration between R&D, testing, and operations teams enable rapid detection and remediation, ultimately improving system availability and merchant experience.
