Scaling the Health Code: Tencent Cloud Elasticsearch at Billion-User Scale
Built on Tencent Cloud Elasticsearch, the nationwide COVID-19 health code platform handled over 1.6 billion scans for more than 900 million users. The service delivered millisecond-level search, horizontal scaling, multi-zone high availability, and layered security, while its RESTful APIs and rich UI tools kept development simple.
Elasticsearch is a popular open-source distributed search and analytics engine. It is simple to deploy and supports real-time log analysis, full-text search, and structured data analytics, which substantially lowers the cost of extracting value from data. This article describes how Tencent Cloud Elasticsearch Service is used in the "Tencent Epidemic Prevention Health Code" application, covering the challenges, the optimization ideas, and the results.
On February 9, 2020, Shenzhen became the first city in China to launch a health code. Since then, the health code has spread nationwide and is now deployed in 20 provincial regions covering over 300 cities. It has recorded more than 1.6 billion scans by over 900 million people, with total visits exceeding 6 billion. Architects and developers therefore face trillion-level data access challenges and need a platform that supports rapid iteration.
Choosing Elasticsearch and Technical Considerations
The health code workload placed the following requirements on the storage engine:
Support for structured queries on fields such as passage time and vehicle details.
Support for long-text fields such as street, community, and residential area names.
Ability to add or remove fields quickly as epidemic control requirements change.
Support for keyword search, aggregation over massive data sets, and geographic region calculations.
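As a rough illustration of how these requirements could map onto Elasticsearch, the sketch below builds an index mapping and a search request body as plain Python dictionaries. Every field name (passage_time, plate_number, community, location) is hypothetical and chosen only to mirror the requirements above; none of them comes from the actual health code schema.

```python
import json


def build_mapping():
    """Return an index mapping with one hypothetical field per requirement."""
    return {
        "mappings": {
            # Unknown fields are mapped automatically, so new fields can be
            # introduced without a schema migration.
            "dynamic": True,
            "properties": {
                "passage_time": {"type": "date"},      # structured query
                "plate_number": {"type": "keyword"},   # exact-match vehicle detail
                "community": {"type": "text"},         # analyzed long-text field
                "location": {"type": "geo_point"},     # geographic calculations
            },
        }
    }


def build_search(keyword, since):
    """Full-text match on the community name, filtered by passage time."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"community": keyword}}],
                "filter": [{"range": {"passage_time": {"gte": since}}}],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_mapping(), indent=2))
    print(json.dumps(build_search("Keyuan", "2020-02-09"), indent=2))
```

The dictionaries would be sent as JSON bodies to the index creation and `_search` endpoints. The `filter` clause does not contribute to relevance scoring, which is usually the right choice for a plain time-range restriction.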
Comparison of traditional relational database MySQL and Tencent Cloud Elasticsearch:
MySQL excels in transactional applications and multi-table joins but struggles with complex data types and text keyword search: a LIKE query with a leading wildcard cannot use an ordinary B-tree index and falls back to scanning rows. Tencent Cloud Elasticsearch, built on the Lucene query engine with an inverted index, delivers millisecond-level query responses even at trillion-scale data, a nearly hundred-fold improvement over MySQL's LIKE queries.
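The difference between the two lookup strategies can be sketched in a few lines of Python. This is a toy model, not Lucene: it tokenizes on whitespace only, and the sample addresses are invented for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents containing every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # One dictionary lookup per term, then an intersection of posting sets.
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


def like_scan(docs, needle):
    """LIKE '%needle%' style search: every document must be inspected."""
    needle = needle.lower()
    return {doc_id for doc_id, text in docs.items() if needle in text.lower()}


if __name__ == "__main__":
    docs = {
        1: "Nanshan District Keyuan Road",
        2: "Futian District Shennan Road",
        3: "Nanshan District Shennan Road",
    }
    index = build_inverted_index(docs)
    print(search(index, "nanshan road"))
    print(like_scan(docs, "shennan"))
```

The inverted index pays its cost once at write time. A query then touches only the posting sets of its terms, whereas the scan's cost grows with the total number of documents.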
Comparison of MongoDB and Tencent Cloud Elasticsearch:
MongoDB, a popular NoSQL product, supports flexible schemas and field changes but lacks robust text search and large-scale aggregation capabilities. Tencent Cloud Elasticsearch provides the doc_values column-store structure and an aggregation framework with up to 60 operators, enabling keyword bucketing, time bucketing, distance bucketing, averages, sums, and geographic boundary calculations.
Kibana, the graphical front end used with Elasticsearch, allows visual analysis of massive data and simplifies operational analytics through configurable dashboards, giving epidemic control teams clear insight into the situation.
Tencent Cloud Elasticsearch is simple to use: it offers a RESTful API, more than 10 official SDKs, and over 20 community SDKs covering most mainstream programming languages. This rich ecosystem reduces coding effort and shortens development time for urgent business needs.
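To make the bucketing types above concrete, here is a minimal sketch of an aggregation request body built in Python. The field names (city, scan_time, location, body_temperature) are assumptions made for illustration and are not from the health code schema. The aggregation types themselves (terms, date_histogram, geo_distance, geo_bounds, avg) are standard Elasticsearch aggregations.

```python
import json


def build_scan_aggs(origin_lat, origin_lon):
    """Build a request body that combines several bucket and metric aggregations."""
    return {
        "size": 0,  # return aggregation results only, no individual hits
        "aggs": {
            # Keyword bucketing: top 10 cities by document count.
            "by_city": {"terms": {"field": "city", "size": 10}},
            # Time bucketing: one bucket per calendar day.
            # (Releases before 7.2 use "interval" instead of "calendar_interval".)
            "per_day": {
                "date_histogram": {"field": "scan_time", "calendar_interval": "day"}
            },
            # Distance bucketing around a point of interest.
            "by_distance": {
                "geo_distance": {
                    "field": "location",
                    "origin": {"lat": origin_lat, "lon": origin_lon},
                    "unit": "km",
                    "ranges": [{"to": 1}, {"from": 1, "to": 5}, {"from": 5}],
                }
            },
            # Geographic boundary: bounding box of all matching points.
            "scan_area": {"geo_bounds": {"field": "location"}},
            # Metric aggregation on a hypothetical numeric field.
            "avg_temperature": {"avg": {"field": "body_temperature"}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_scan_aggs(22.54, 114.06), indent=2))
```

Aggregations of this kind read field values from doc_values rather than from the inverted index, which is why the column-store structure matters for large-scale analytics.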
Rapid Data Growth and Fast Scaling
Within about a month, the health code system covered 900 million users with 1.6 billion scans. Handling such explosive query growth is a major challenge for storage systems.
Tencent Cloud Elasticsearch adopts a distributed architecture where index data is partitioned into multiple shards distributed across cluster nodes. This enables linear scaling of write and query throughput, a capability not found in single‑instance databases.
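The idea behind shard-based partitioning can be sketched as follows. Elasticsearch routes a document by hashing its routing value (by default the document ID) modulo the number of primary shards. The sketch uses MD5 as a stand-in hash purely for illustration, and the round-robin node assignment is a simplification of the real allocator.

```python
import hashlib


def route_to_shard(doc_id, num_primary_shards):
    """Pick a primary shard for a document from a hash of its ID."""
    # Stand-in hash: MD5 is used here only because it is stable across runs.
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_primary_shards


def assign_shards_to_nodes(num_shards, nodes):
    """Spread shards over nodes round-robin (simplified allocation)."""
    return {shard: nodes[shard % len(nodes)] for shard in range(num_shards)}


if __name__ == "__main__":
    shards = 6
    print(route_to_shard("user-10001", shards))
    # Adding nodes lowers the number of shards each node has to serve.
    print(assign_shards_to_nodes(shards, ["node-1", "node-2", "node-3"]))
    print(assign_shards_to_nodes(shards, [f"node-{i}" for i in range(1, 7)]))
```

Because the same ID always maps to the same shard, reads and writes for one document go to one place, while unrelated documents spread across the cluster. Throughput can then grow by adding nodes and rebalancing shards onto them.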
Cluster configuration is flexible and fast. Because the service runs on Tencent Cloud IaaS with CVM and CBS disks, storage can be expanded dynamically without affecting node operation. For health‑code‑scale data, expanding storage now takes seconds instead of hours or days, greatly reducing operational complexity.
High‑Availability Architecture
Tencent Cloud Elasticsearch supports multi‑availability‑zone (AZ) clusters. If one AZ becomes unavailable due to power or network issues, nodes in other AZs continue to provide uninterrupted service. Data is sharded with replicas; replicas are evenly distributed across AZs, ensuring automatic failover and continuous write/query capability.
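A minimal sketch of zone-aware placement shows why losing one AZ does not lose any shard. The zone and node names are hypothetical, and the placement rule (one copy per zone, rotating the starting zone per shard) is a simplification and not the actual Elasticsearch allocator.

```python
def place_copies(num_shards, replicas, nodes_by_zone):
    """Place the primary and each replica of every shard in a different zone."""
    zones = sorted(nodes_by_zone)
    copies_per_shard = replicas + 1
    if copies_per_shard > len(zones):
        # This sketch only handles the case of at most one copy per zone.
        raise ValueError("need at least as many zones as shard copies")
    placement = {}
    for shard in range(num_shards):
        copies = []
        for copy in range(copies_per_shard):
            # Rotate the starting zone so primaries are spread evenly too.
            zone = zones[(shard + copy) % len(zones)]
            nodes = nodes_by_zone[zone]
            copies.append((zone, nodes[shard % len(nodes)]))
        placement[shard] = copies
    return placement


def surviving_shards(placement, failed_zone):
    """Return the shards that still have a copy outside the failed zone."""
    return {
        shard
        for shard, copies in placement.items()
        if any(zone != failed_zone for zone, _ in copies)
    }


if __name__ == "__main__":
    nodes_by_zone = {
        "az-1": ["node-1", "node-2"],
        "az-2": ["node-3", "node-4"],
        "az-3": ["node-5", "node-6"],
    }
    placement = place_copies(6, 1, nodes_by_zone)
    for zone in nodes_by_zone:
        print(zone, "down ->", len(surviving_shards(placement, zone)), "of 6 shards available")
```

In a real cluster, behavior like this is typically obtained by tagging nodes with a zone attribute and enabling shard allocation awareness on it. A surviving replica is then promoted to primary when its zone peer disappears.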
To prevent out-of-memory (OOM) crashes during massive aggregations, Elasticsearch implements a memory-based circuit breaker. As JVM heap usage climbs past a configured threshold, the service degrades gracefully by rejecting new queries at the REST layer, and it stops accepting requests altogether once usage exceeds 95%, which protects the cluster from collapse.
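The circuit-breaker idea can be sketched as a small accounting object: each request reserves its estimated memory before it runs and is rejected up front if the reservation would cross the limit. The class and method names are hypothetical, and the single 95% limit is taken from the description above. The real mechanism has several breakers and tracks actual heap usage.

```python
class CircuitBreakingError(Exception):
    """Raised when a request would push memory usage past the limit."""


class MemoryCircuitBreaker:
    def __init__(self, heap_bytes, limit_percent=95):
        # Integer arithmetic avoids floating-point rounding on the limit.
        self.limit = heap_bytes * limit_percent // 100
        self.used = 0

    def reserve(self, nbytes, label):
        """Account for nbytes before the work starts, or reject the request."""
        if self.used + nbytes > self.limit:
            raise CircuitBreakingError(
                f"[{label}] would use {self.used + nbytes} bytes, limit is {self.limit}"
            )
        self.used += nbytes

    def release(self, nbytes):
        """Return memory to the budget once the request has finished."""
        self.used = max(0, self.used - nbytes)


if __name__ == "__main__":
    breaker = MemoryCircuitBreaker(heap_bytes=1000)
    breaker.reserve(900, "aggregation-1")
    try:
        breaker.reserve(100, "aggregation-2")
    except CircuitBreakingError as err:
        print("rejected:", err)
    breaker.release(900)
    breaker.reserve(100, "aggregation-2")
    print("accepted after release, used =", breaker.used)
```

Rejecting one expensive request with an error is far cheaper than letting it exhaust the heap, which would take the node down along with every other in-flight request.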
Data Security
The health-code system uses the highest-level security features of Tencent Cloud Elasticsearch, including inbound/outbound IP whitelists, cluster permission authentication, and VPC network isolation, to guard against data leakage and prevent unauthorized access.
Backup and Recovery
Incremental data backups are supported through Elasticsearch's native snapshot mechanism, which can regularly store snapshot files in Tencent Cloud Object Storage (COS). Backups can be restored to any cluster on demand, ensuring reliable data protection.
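Elasticsearch snapshots are incremental at the level of index segment files: a segment already present in the repository is not copied again. The sketch below models that behavior with plain dictionaries. The repository layout and function names are hypothetical and stand in for an object store such as COS.

```python
def take_snapshot(repository, snapshot_name, live_segments):
    """Upload only the segments the repository does not already hold."""
    blobs = repository.setdefault("blobs", {})
    uploaded = []
    for name, data in live_segments.items():
        if name not in blobs:
            blobs[name] = data
            uploaded.append(name)
    # Each snapshot records the full list of segments it references.
    repository.setdefault("snapshots", {})[snapshot_name] = sorted(live_segments)
    return uploaded


def restore(repository, snapshot_name):
    """Rebuild the segment set exactly as it was when the snapshot was taken."""
    names = repository["snapshots"][snapshot_name]
    return {name: repository["blobs"][name] for name in names}


if __name__ == "__main__":
    repo = {}
    segments = {"seg_1": b"aaa", "seg_2": b"bbb"}
    print(take_snapshot(repo, "monday", segments))   # both segments uploaded
    segments["seg_3"] = b"ccc"                       # new data arrives
    print(take_snapshot(repo, "tuesday", segments))  # only the new segment uploaded
    print(sorted(restore(repo, "monday")))
```

Because every snapshot keeps its own complete segment list, any snapshot can be restored independently even though only the differences were uploaded.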
Conclusion
The Tencent epidemic‑prevention health code is a vital credential for personnel movement and pandemic control. Its widespread adoption is closely tied to the technical capabilities of Tencent Cloud Elasticsearch in search, high concurrency, elastic scaling, and security. Going forward, Tencent Cloud Elasticsearch will continue to iterate, delivering stable and reliable Elasticsearch services to meet evolving user needs.
Tencent Tech
Tencent's official tech account. Delivering quality technical content to serve developers.