Qunar Ticket Test‑Environment Governance and Automated Monitoring Framework

This article describes Qunar Ticket’s comprehensive test‑environment governance framework, including the “Mirror‑Inspect” monitoring service, configuration and data synchronization strategies, and automated allocation management, highlighting how these practices reduced environment‑related project delays from up to 20% to below 8%.

Qunar Tech Salon

Background

During projects, test‑environment problems are a common cause of delays, missed test cases, and incorrect test results. Teams universally expect test environments to be available, usable, and standardized.

Container‑based frameworks can speed up environment construction, but limited hardware resources, build performance, complex call chains, and large data volumes make it difficult to achieve on‑demand, rapid, disposable environments for complex transaction systems.

Therefore, beyond fast environment provisioning, it is necessary to improve the stability and availability of existing environments.

Environment Governance Approach

Governance can be divided into six elements:

Environment availability

Proactive repair

Data standardization

Configuration standardization

Test code control

Reasonable resource allocation

Environment Availability

Availability covers readiness at delivery, component and machine health, unobstructed business links, continuous availability during use, and visual feedback through the platform and reports.

During environment creation, monitoring whether servers and applications start successfully provides basic availability feedback. The remaining concerns are addressed by the “Mirror‑Inspect” service.

Mirror‑Inspect Service

The service provides two core functions: (1) surfacing (“illuminating”) anomalies and the basic status of the environment, and (2) proactive repair.

Implementation: a Mirror‑Inspect server is deployed on each test machine and managed via the Salt API. It starts a monitor that collects information such as VM status, service health, disk, load, memory, and deployment and version details, and persists it. Issues that can be fixed automatically are repaired.
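The collect‑and‑persist step can be illustrated with a minimal sketch. It uses only the Python standard library; the function names, the snapshot fields, and the JSON‑lines file used as a store are assumptions made for illustration, not the actual Mirror‑Inspect implementation.

```python
import json
import os
import shutil
import time


def collect_snapshot(path="/"):
    """Gather a few basic health figures for the machine this runs on."""
    usage = shutil.disk_usage(path)
    try:
        load_1m = os.getloadavg()[0]  # not available on every platform
    except (AttributeError, OSError):
        load_1m = None
    return {
        "collected_at": time.time(),
        "disk_used_ratio": usage.used / usage.total,
        "load_1m": load_1m,
    }


def persist(snapshot, out_path):
    """Append one snapshot as a JSON line (a stand-in for the real store)."""
    with open(out_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(snapshot) + "\n")
```

A real monitor would run this on a schedule and add service, memory, and deployment checks; the sketch only shows the shape of one collection cycle.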

Examples: automatically clean Tomcat logs when disk usage exceeds a threshold; restart Tomcat if the service is down and alert if restart fails; collect non‑master branch information and display it on the platform; perform business‑link validation and aggregate results for dashboards.
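The first two examples (log cleanup on high disk usage, restart with alert on failure) might be expressed as rules like the following. This is a sketch under assumptions: the 90% threshold, the `Actions` container, and all function names are hypothetical, since the article does not state the real threshold or interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List

DISK_THRESHOLD = 0.90  # assumed value; the article does not state the threshold


@dataclass
class Actions:
    """Side effects are injected so the rules can run without a real host."""
    clean_tomcat_logs: Callable[[], None]
    restart_tomcat: Callable[[], bool]  # returns True if Tomcat came back up
    alert: Callable[[str], None]


def inspect_and_repair(disk_used_ratio: float, tomcat_up: bool,
                       actions: Actions) -> List[str]:
    """Apply the two repair rules and return a log of what was done."""
    log = []
    if disk_used_ratio > DISK_THRESHOLD:
        actions.clean_tomcat_logs()
        log.append("cleaned_logs")
    if not tomcat_up:
        if actions.restart_tomcat():
            log.append("restarted")
        else:
            # Repair failed, so escalate to a human.
            actions.alert("Tomcat restart failed")
            log.append("alerted")
    return log
```

Passing the side effects in as callables keeps the decision logic separate from host commands, so the rules can be checked without touching a machine.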

[Figure: Mirror‑Inspect architecture diagram]

[Figure: server deployment example]

Note: the HTTP endpoint provides detailed server and application information for the test environment.

[Figure: basic checks and repairs]

[Figure: check results displayed on the platform]

Configuration Synchronization Strategy

Hot‑config systems are increasingly used for lightweight business logic and feature toggles. Discrepancies between online and test configurations cause missed or erroneous tests.

Automatic synchronization leverages the fallback node of the hot‑config system. The steps are:

Move all files from the online node to the fallback node, making the online node empty so the system reads from the fallback node.

Delete all files under the test node, causing it to read the fallback node as well, achieving unified configuration.

For configurations that must differ, a second‑stage replacement is performed before Tomcat starts.

To modify test configurations, either copy files to the test node (not recommended) or use the Mirror‑Inspect “business configuration” page, which records project and environment information, applies changes automatically, and manages their lifecycle.
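The fallback behaviour that the first two steps rely on can be modelled in a few lines. In this sketch each node is a plain dictionary of file name to content; the node model and the function names are illustrative assumptions, not the hot‑config system's real API.

```python
from typing import Dict, Optional

# Each node maps a config file name to its content (illustrative model only).
Node = Dict[str, str]


def read_config(name: str, node: Node, fallback: Node) -> Optional[str]:
    """Return the file from the environment's own node, else from the fallback node."""
    if name in node:
        return node[name]
    return fallback.get(name)


def unify(online: Node, test: Node, fallback: Node) -> None:
    """Move online files to the fallback node, then empty the test node."""
    fallback.update(online)
    online.clear()
    test.clear()
```

After `unify`, both environments resolve every file through the fallback node, and a file placed on the test node again overrides the shared value for that environment only.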

Second‑stage replacement implementation:

Salt scripts are used for scheduling the replacements.
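As a rough illustration of what one replacement might do, the sketch below rewrites selected keys in a properties‑style file while leaving every other line untouched. The `key=value` file format, the function name, and the example keys are assumptions; the article does not show the actual scripts.

```python
from typing import Dict


def apply_overrides(properties_text: str, overrides: Dict[str, str]) -> str:
    """Rewrite `key=value` lines whose key is in `overrides`; keep the rest."""
    out = []
    for line in properties_text.splitlines():
        key, sep, _ = line.partition("=")
        is_comment = line.lstrip().startswith("#")
        if sep and not is_comment and key.strip() in overrides:
            out.append(f"{key.strip()}={overrides[key.strip()]}")
        else:
            out.append(line)
    return "\n".join(out)
```

A scheduler would run a step like this on the synchronized files before Tomcat starts, so that values such as database addresses point at test resources.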

Data Synchronization Scheme

Database synchronization follows four principles:

When creating an environment, copy schema and data from production.

When SQL changes are applied in production, proactively sync them to test.

During sync, replace data in the test copy with predefined masked values.

Account for data volume: for large data sets, sync asynchronously by copying MySQL data files.
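The masking principle can be sketched as a row filter applied while data is copied. The mask table, its column names, and the replacement values below are hypothetical; which columns Qunar masks, and with what values, is not stated in the article.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical mask table: column name -> fixed replacement value.
MASKS = {"phone": "13800000000", "id_card": "000000000000000000"}


def mask_rows(rows: Iterable[Dict[str, object]],
              masks: Dict[str, str] = MASKS) -> Iterator[Dict[str, object]]:
    """Yield copies of the rows with masked columns replaced; input is not mutated."""
    for row in rows:
        yield {col: masks.get(col, value) for col, value in row.items()}
```

Because it is a generator, the filter can sit between the extraction and load steps without holding a whole table in memory.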

DBA assistance is required to provide safe interfaces for data extraction and change notifications.

Code and Environment Allocation Management

Qunar’s Odin Desktop integrates JIRA, code projects, users, and deployment operations, consolidating environment allocation and Mirror‑Inspect information.

When a branch is released and its JIRA issue is closed, Odin Desktop detects the closure, calls the Mirror‑Inspect configuration management and deployment interfaces, restores master code to that test environment, and also synchronizes the other environments that are not occupied.
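The release flow described above could look roughly like this. The `Environment` record, the `on_issue_closed` handler, and the `deploy` callback are hypothetical names introduced for illustration; they do not reflect Odin Desktop's or Mirror‑Inspect's real interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Environment:
    name: str
    branch: str
    occupied_by: Optional[str] = None  # JIRA issue key currently using the environment


def on_issue_closed(issue_key: str, envs: List[Environment],
                    deploy: Callable[[str, str], None]) -> List[str]:
    """Release the closed issue's environment, then redeploy master to every free one."""
    redeployed = []
    for env in envs:
        if env.occupied_by == issue_key:
            env.occupied_by = None  # the project is done, so free the environment
        if env.occupied_by is None:
            env.branch = "master"
            deploy(env.name, "master")
            redeployed.append(env.name)
    return redeployed
```

Environments still occupied by other issues are skipped, so in‑progress projects keep their branch until their own issue closes.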

Conclusion

Two results demonstrate the impact of test‑environment governance at Qunar Ticket:

Support for dozens of development and test environments with fully unattended operation, automatic repair, allocation, isolation, integration, and proactive alerts.

Project delay caused by environment issues dropped from 15‑20% before 2018 to below 8% in 2019.

The core ideas are: one‑click Mirror‑Inspect integration, end‑to‑end unattended environment, reliable metrics, closed‑loop strategies, and platform‑based project cycle management.

Colleagues with similar needs are encouraged to share experiences.
