
Ensuring Frontend System Stability through Monitoring and Automated Inspection

This article explains how modern front‑end teams ensure system stability and high‑quality operation by implementing comprehensive monitoring and automated inspection, covering background, significance, architecture, real‑time and scheduled checks, performance metrics, alert strategies, error handling, custom reporting, and future improvement plans.

JD Retail Technology

Modern front‑end applications have complex scenarios and user experience is critical; ensuring stability and sustainability is a key challenge.

Background

Complexity increases with SPA and PWA adoption, higher performance expectations, diverse devices and networks, and agile CI/CD practices that require rapid issue response.

Significance

Performance monitoring improves user experience.

Error tracking enables quick resolution.

User behavior analysis guides product iteration.

Business metric monitoring ensures core flow stability.

Alert systems allow rapid response to anomalies.

Monitoring Categories

Two parts: real‑time monitoring (integrated with the SGM platform, covering 100+ applications with alert mechanisms) and scheduled task inspection (automated cron jobs that report results and trigger alerts).

Overall Architecture

Real‑time Monitoring

Integration with the SGM monitoring platform provides multi‑channel alerts (e.g., DingTalk, email, phone). The alert strategy balances precision against sensitivity and relies on hierarchical severity levels, regular rule optimization, clear responsibility assignment, and team training.

Web‑end performance metrics include LCP, CLS, FCP, FID, and TTFB with thresholds (e.g., LCP ≤ 2.5 s). Alerts were tuned by temporarily raising the LCP threshold to 5 s to reduce noise while still tracking performance impact.
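As a sketch, the thresholds above can be encoded as a small classifier that alert rules reuse. The good/needs‑improvement cutoffs below follow Google's published Web Vitals values; the function and constant names are illustrative, not the SGM SDK's API:

```javascript
// Published Web Vitals thresholds (milliseconds, except CLS, which is unitless).
const THRESHOLDS = {
  LCP:  { good: 2500, poor: 4000 },
  FCP:  { good: 1800, poor: 3000 },
  FID:  { good: 100,  poor: 300 },
  TTFB: { good: 800,  poor: 1800 },
  CLS:  { good: 0.1,  poor: 0.25 },
};

// Classify a measured value so an alert rule can, for example, fire only on
// 'poor'. A temporarily raised alert cutoff (such as the 5 s LCP value used
// to reduce noise) would be applied on top of this rating.
function rateMetric(name, value) {
  const t = THRESHOLDS[name];
  if (!t) throw new Error(`unknown metric: ${name}`);
  if (value <= t.good) return 'good';
  if (value <= t.poor) return 'needs-improvement';
  return 'poor';
}
```

Keeping the rating separate from the alert cutoff lets the team track real performance impact even while the alert threshold is loosened.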

Page Performance

Monitoring covers the full page‑load lifecycle, white‑screen detection, and URL configuration (with regex support) to capture first‑contentful‑paint times. Metrics such as white‑screen time help identify and optimise the pages that shape a user's first impression.
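White‑screen detection is commonly implemented by sampling points across the viewport and checking whether every sample hits only an empty container element. The sketch below illustrates that approach; it is an assumption about the technique, not the SGM SDK's actual implementation, and all names are illustrative:

```javascript
// Tags that count as "empty" containers: if every sampled point resolves to
// one of these, the page is judged blank.
const CONTAINER_TAGS = ['HTML', 'BODY', 'DIV', 'SECTION'];

// Pure decision step: given the tag names found at the sampled points,
// decide whether the screen looks blank.
function looksBlank(sampledTags) {
  return sampledTags.length > 0 &&
         sampledTags.every(tag => CONTAINER_TAGS.includes(tag.toUpperCase()));
}

// Browser side: sample points along the viewport diagonal with
// document.elementFromPoint, then apply the decision above.
function detectWhiteScreen(samples = 9) {
  const tags = [];
  for (let i = 1; i <= samples; i++) {
    const x = (window.innerWidth * i) / (samples + 1);
    const y = (window.innerHeight * i) / (samples + 1);
    const el = document.elementFromPoint(x, y);
    if (el) tags.push(el.tagName);
  }
  return looksBlank(tags);
}
```

Splitting the sampling from the decision keeps the judgment logic testable outside a browser.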

JSError Monitoring

Error keywords are configured to match console messages, and thresholds are set relative to QPS, since appropriate degradation strategies can keep pages functional despite errors. Cross‑origin “Script error” messages are handled by enabling CORS on the script tag and registering a Vue error handler.

<script src="http://xxxdomain.com/home.js" crossorigin></script>

// Global Vue error handler: log the error and forward it to the SGM SDK.
// Reporting is wrapped in try/catch so the handler itself can never break
// the page.
Vue.config.errorHandler = (err, vm, info) => {
  if (err) {
    try {
      console.error(err);
      window.__sgm__.error(err);
    } catch (e) {
      // swallow reporting failures
    }
  }
};

API Request Monitoring

Alerts focus on HTTP status codes and business error codes. Data‑collection parameters are configured, and a standardized mapping of error codes is applied across services.

{
  "50000X": "Program exception, internal",
  "500001": "Program exception, upstream",
  "500002": "Program exception, xx",
  "...": "..."
}
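A minimal sketch of applying this mapping on the client side, assuming “X” acts as a single‑character wildcard in code patterns (an assumption; the source does not define it) and that any non‑2xx HTTP status or mapped business code should alert. Names are illustrative:

```javascript
// Standardized business error codes (from the mapping above).
const ERROR_CODES = {
  '50000X': 'Program exception, internal',
  '500001': 'Program exception, upstream',
  '500002': 'Program exception, xx',
};

// Resolve a business code: exact match first, then wildcard patterns where
// 'X' matches any single character.
function describeErrorCode(code) {
  if (ERROR_CODES[code]) return ERROR_CODES[code];
  for (const [pattern, message] of Object.entries(ERROR_CODES)) {
    if (pattern.includes('X')) {
      const re = new RegExp('^' + pattern.replace(/X/g, '.') + '$');
      if (re.test(code)) return message;
    }
  }
  return 'Unknown error code';
}

// Alert on any non-2xx HTTP status, or on any recognized business code.
function shouldAlert(httpStatus, bizCode) {
  if (httpStatus < 200 || httpStatus >= 300) return true;
  return describeErrorCode(String(bizCode)) !== 'Unknown error code';
}
```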

Resource Errors

Monitoring includes loading failures of CSS, JS, and images. Degradation strategies are applied for image errors, and non‑essential image‑error collection can be disabled.
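Resource load failures do not bubble, so the standard way to observe them is a capture‑phase error listener. The sketch below shows that pattern, with a `skipImages` option mirroring the ability to disable non‑essential image‑error collection; the handler and option names are illustrative, not the SGM SDK's:

```javascript
// Build a handler that reports resource-loading failures. `report` is the
// caller's transport function.
function createResourceErrorHandler(report, { skipImages = false } = {}) {
  return function onError(event) {
    const target = event.target || {};
    const tag = (target.tagName || '').toUpperCase();
    // Only element load failures; runtime JS errors carry no element target.
    if (!['SCRIPT', 'LINK', 'IMG'].includes(tag)) return;
    if (skipImages && tag === 'IMG') return;
    report({ tag, url: target.src || target.href || '' });
  };
}

// Browser wiring: the third argument (true) requests the capture phase,
// which is required because resource 'error' events do not bubble.
// window.addEventListener('error', createResourceErrorHandler(send), true);
```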

Custom Reporting

Key business nodes report detailed request/response data, user‑behavior traces, and specific failure cases such as address‑selection errors in H5 pages embedded within apps. Custom logs are used to locate and resolve issues quickly.

Mini‑Program Monitoring

Combines SGM monitoring with official mini‑program analysis tools to capture performance, JavaScript errors, and resource‑loading issues specific to mini‑programs.

Native App Monitoring

Basic monitoring via mPaaS (crash), Zhulong (startup time, first‑screen, stutter), and SGM (network, WebView, native pages). Business monitoring is applied to login, product‑detail, and order pages, with custom SDKs reporting abnormal flows.

Scheduled Inspection

Implemented through the UI “Woodpecker” platform and custom Node.js scripts. The tool checks link validity, hover and click interactions, and reports results. Example configuration snippets are shown below.

{
  "cookieThor": "",
  "urlPattern": "pro\\.jd\\.com",
  "urls": ["https://b.jd.com/s?entry=newuser"]
}

{
  "cookieThor": "",
  "urlPattern": "pro\\.jd\\.com",
  "urls": [{
    "url": "https://b.jd.com/s?entry=newuser",
    "clickElements": [{
      "item": ".recommendation-product-wrapper .jdb-sku-wrapper"
    }]
  }]
}
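As a hypothetical sketch of how such a config might be consumed, the helper below scopes link‑validity checks to links matching the configured urlPattern regex. The field name follows the snippets above, but the helper is illustrative and not the Woodpecker platform's actual API:

```javascript
// Given links collected from an inspected page, keep only those matching the
// config's urlPattern (e.g. "pro\\.jd\\.com"), so out-of-scope links are not
// validity-checked.
function linksToCheck(config, foundLinks) {
  const pattern = new RegExp(config.urlPattern);
  return foundLinks.filter(href => pattern.test(href));
}
```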

Problems Discovered

JS errors in closed environments, resource‑loading failures, and business‑flow exceptions were identified. Mitigations include try‑catch wrappers, queue mechanisms for deferred reporting, and refined alert thresholds.
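The queue mechanism for deferred reporting can be sketched as follows: when the reporting endpoint is unreachable (as in closed environments), failed reports are queued and retried later rather than lost. `send` is an assumed transport returning a success flag; all names are illustrative:

```javascript
// Deferred reporting queue. `send(entry)` is the transport; a falsy return
// or a thrown error both cause the entry to be queued for a later retry.
function createReportQueue(send) {
  const queue = [];
  function report(entry) {
    try {
      if (!send(entry)) queue.push(entry); // transport refused: defer
    } catch (e) {
      queue.push(entry);                   // transport threw: defer
    }
  }
  function flush() {
    const pending = queue.splice(0, queue.length);
    pending.forEach(report);               // re-attempt; failures re-queue
  }
  return { report, flush, size: () => queue.length };
}
```

A flush would typically be triggered on a timer or on a connectivity event, so no report is permanently dropped by a transient outage.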

Summary

Before monitoring, issues were discovered reactively via user feedback; after integration, proactive detection, performance optimisation, error‑rate reduction, and stable releases were achieved.

Future Planning

Goals: raise >90 % of applications above a 90‑point performance score, deepen custom exception reporting (e.g., button‑visibility errors), and upgrade inspection tools for broader coverage and smarter automation.

Tags: frontend, monitoring, performance, web, automation, DevOps, alerting
Written by JD Retail Technology

Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.