
Comprehensive Dependency Governance for High‑Availability Backend Systems

This article outlines a systematic approach to dependency governance in high‑traffic backend services, covering service classification, rate limiting, Dubbo, HTTP, database, and message‑queue management to enhance availability, reduce failure impact, and improve overall system stability.

Qunar Tech Salon

Jan 7, 2020


Background

Having previously shared their cache governance practice, the authors now extend stability governance to system‑level dependencies: external components, interfaces, and the services they expose (Dubbo, HTTP, DB, MQ, etc.).

Governance Plan

Service Classification and Dependency Governance

1) Applications are graded P1, P2, or P3 according to how critical they are to the core business and how large the impact of a failure would be, and their dependencies are mapped accordingly.

2) P1 services must be deployed across multiple data centers, ensuring that no single data center holds more than half of the online instances, thereby reducing the impact of a single‑site failure.

3) Strong dependencies are weakened to enable degradation; weak dependencies are made asynchronous to allow circuit‑breaking. Critical‑to‑critical calls receive pre‑planned fallback strategies, while non‑critical calls are isolated to prevent cascading failures.
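The degradation and circuit‑breaking idea in point 3 can be illustrated with a minimal sketch. It uses only the JDK and is not the authors' implementation: the class name `SimpleCircuitBreaker`, the consecutive‑failure threshold, and the injected clock are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical minimal circuit breaker; not taken from the original article. */
class SimpleCircuitBreaker {
    private final int failureThreshold;  // consecutive failures before opening
    private final long openMillis;       // how long to stay open before a trial call
    private final LongSupplier clock;    // injected millisecond clock (assumption, eases testing)
    private int consecutiveFailures = 0;
    private boolean open = false;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** True while calls should skip the dependency and go straight to the fallback. */
    synchronized boolean isOpen() {
        return open && clock.getAsLong() - openedAt < openMillis;
    }

    /** Calls the dependency, or degrades to the fallback when it is failing. */
    <T> T call(Supplier<T> dependency, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get();       // degrade without touching the dependency
        }
        try {
            T result = dependency.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();       // a single failure also degrades
        }
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        open = false;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            open = true;
            openedAt = clock.getAsLong();
        }
    }
}
```

A caller would wrap each weak dependency as `breaker.call(() -> remoteCall(), () -> cachedDefault())`, where both lambdas are placeholders for application code. Once the open period has elapsed, the next call acts as a trial: success closes the circuit, and failure reopens it.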

Rate Limiting

The team adopts a unified Sentinel component for traffic control, providing dynamic rate limiting for Dubbo and HTTP interfaces, business‑level throttling based on request parameters, and optional cluster‑wide limits. Thresholds are chosen carefully so that ordinary traffic spikes are not throttled and user experience does not suffer.
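To make parameter‑based throttling concrete, here is a small JDK‑only sketch of a fixed‑window limiter keyed by a request parameter. It is a stand‑in for the concept only, not Sentinel's API; the class name `KeyedRateLimiter`, the fixed‑window algorithm, and the injected clock are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical fixed-window limiter keyed by a request parameter; not Sentinel's API. */
class KeyedRateLimiter {
    /** Immutable snapshot of one key's current window. */
    private static final class Window {
        final long start;
        final int count;
        Window(long start, int count) { this.start = start; this.count = count; }
    }

    private final int limitPerWindow;
    private final long windowMillis;
    private final LongSupplier clock;   // injected millisecond clock (assumption)
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    KeyedRateLimiter(int limitPerWindow, long windowMillis, LongSupplier clock) {
        this.limitPerWindow = limitPerWindow;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    /** Returns true if the request carrying this key may proceed. */
    boolean tryAcquire(String key) {
        long now = clock.getAsLong();
        Window updated = windows.compute(key, (k, w) -> {
            if (w == null || now - w.start >= windowMillis) {
                return new Window(now, 1);            // first request of a fresh window
            }
            return new Window(w.start, w.count + 1);  // same window, one more request
        });
        return updated.count <= limitPerWindow;
    }
}
```

Each key (for example a hypothetical merchant ID) gets its own budget, so one hot parameter value cannot exhaust the quota of the others. A production setup would more likely delegate this to the rate‑limiting component itself.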

Dubbo Governance

Key measures include monitoring Dubbo thread pools, isolating core and non‑core interfaces into separate thread pools, and configuring reasonable timeout values on both provider and consumer sides.
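Thread‑pool isolation can be sketched with plain JDK executors. This is not Dubbo configuration; the class name `IsolatedPools`, the pool sizes, and the reject‑on‑saturation policy are assumptions chosen to show why a saturated non‑core pool leaves core calls unaffected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical bulkhead: separate bounded pools for core and non-core interfaces. */
class IsolatedPools {
    final ExecutorService core;
    final ExecutorService nonCore;

    IsolatedPools(int coreThreads, int nonCoreThreads, int queueSize) {
        this.core = newBoundedPool(coreThreads, queueSize);
        this.nonCore = newBoundedPool(nonCoreThreads, queueSize);
    }

    /** Fixed-size pool with a bounded queue that rejects work instead of piling it up. */
    private static ExecutorService newBoundedPool(int threads, int queueSize) {
        return new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),
                new ThreadPoolExecutor.AbortPolicy());
    }

    void shutdown() {
        core.shutdownNow();
        nonCore.shutdownNow();
    }
}
```

When every non‑core thread is busy and its queue is full, further non‑core submissions fail fast with `RejectedExecutionException`, while tasks submitted to `core` still run on their own threads.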

HTTP Governance

Practices involve setting appropriate timeout thresholds, encouraging asynchronous calls, implementing controlled retries, and isolating thread pools and clients to prevent cross‑interference.
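As a rough illustration of explicit timeouts and controlled retries, the sketch below uses the JDK's `java.net.http` client (Java 11 or later). The class name `HttpCallPolicy`, the helper names, and the fixed backoff are assumptions; the article does not say which HTTP client or retry policy the team uses.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical helpers for timeouts and bounded retries; values are illustrative. */
class HttpCallPolicy {
    /** A client dedicated to one downstream, with its own connect timeout. */
    static HttpClient newClient(Duration connectTimeout) {
        return HttpClient.newBuilder().connectTimeout(connectTimeout).build();
    }

    /** A GET request that fails with a timeout instead of waiting indefinitely. */
    static HttpRequest newGet(String url, Duration requestTimeout) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(requestTimeout)
                .GET()
                .build();
    }

    /** Runs the call at most maxAttempts times, sleeping backoffMillis between tries. */
    static <T> T withRetry(Callable<T> call, int maxAttempts, long backoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // never retry an interrupt
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis);
                }
            }
        }
        throw last;  // all attempts failed; surface the last error
    }
}
```

Creating one client per downstream keeps connection pools separate, and capping `maxAttempts` keeps retries from multiplying load on a dependency that is already struggling. Retries of this kind are only safe for idempotent requests.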

Database Governance

High availability is ensured through multi‑replica storage, rapid recovery mechanisms, and removal of unnecessary data. Query‑performance monitoring and MyBatis interceptors are used to detect problems early.
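The timing logic that a query interceptor might wrap around each statement can be sketched without any framework. This is not MyBatis code: the class name `SlowQueryMonitor`, the threshold, and the in‑memory list of slow statements are assumptions, and a real setup would report to the monitoring system instead.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical slow-query recorder; the kind of logic an interceptor could wrap. */
class SlowQueryMonitor {
    private final long thresholdMillis;
    private final LongSupplier nanoClock;  // injected nanosecond clock (assumption)
    private final List<String> slowStatements = new CopyOnWriteArrayList<>();

    SlowQueryMonitor(long thresholdMillis, LongSupplier nanoClock) {
        this.thresholdMillis = thresholdMillis;
        this.nanoClock = nanoClock;
    }

    /** Runs the query and records its statement id if it meets or exceeds the threshold. */
    <T> T timed(String statementId, Supplier<T> query) {
        long start = nanoClock.getAsLong();
        try {
            return query.get();
        } finally {
            long elapsedMillis = (nanoClock.getAsLong() - start) / 1_000_000L;
            if (elapsedMillis >= thresholdMillis) {
                slowStatements.add(statementId + " took " + elapsedMillis + " ms");
            }
        }
    }

    /** Snapshot of the slow statements recorded so far. */
    List<String> slowStatements() {
        return List.copyOf(slowStatements);
    }
}
```

In production the clock would typically be `System::nanoTime`. Because the measurement sits in a `finally` block, queries that fail with an exception are timed as well.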

MQ Governance

The approach handles single‑MQ failures and message backlogs by enabling fast failover to alternative channels, using multiple topics or MQ clusters, and guaranteeing idempotent consumption, so that messages can be redelivered after a failover without data loss or duplicate side effects.
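Idempotent consumption is commonly achieved by de‑duplicating on a message identifier. The sketch below shows the idea with an in‑memory set. The class name `IdempotentConsumer` and the use of a string message ID are assumptions, and a real consumer would need a durable store shared across instances rather than process memory.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical idempotent consumer that de-duplicates by message id. */
class IdempotentConsumer {
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final Consumer<String> handler;

    IdempotentConsumer(Consumer<String> handler) {
        this.handler = handler;
    }

    /** Returns true if the payload was handled, false if the id was seen before. */
    boolean onMessage(String messageId, String payload) {
        if (!processedIds.add(messageId)) {
            return false;                    // duplicate delivery, e.g. a replay after failover
        }
        try {
            handler.accept(payload);
            return true;
        } catch (RuntimeException e) {
            processedIds.remove(messageId);  // let a redelivery retry the failed message
            throw e;
        }
    }
}
```

With this guard in place, the same message can arrive through a backup topic or cluster as well as the primary one, and its business effect is still applied only once.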

Additional Practices

Monitoring is enhanced for Dubbo, HTTP, and DB operations; dashboards include app‑code dimensions for quick inspection; and timeout configurations are regularly reviewed for optimal values.

Governance Process

The workflow mirrors previous cache governance: identify scenarios, define solutions, develop and test, deploy, and conduct online drills with iterative improvements. Deployment is staged, first adding rate‑limiting components and monitoring, then optimizing based on observed metrics.

Summary

Post‑incident reviews drive proactive measures that reduce failure frequency, duration, and impact. Dependency governance is an ongoing effort, with future plans to automate dependency tagging and integrate with dedicated service‑governance platforms for dynamic detection and rapid response.

