
ECP (Elasticsearch Chain Planning) System: Design, Features, and Implementation for Efficient Index Management

The article introduces the ECP system, a backend platform built on Elasticsearch that standardizes, automates, and visualizes index refresh workflows, addressing manual bottlenecks, data cleaning challenges, and coupling issues while providing task management, permission control, and environment isolation for high‑efficiency index operations.

Zhuanzhuan Tech

1. Business Background

Zhuanzhuan, a leading domestic circular-economy company, uses a middle-platform architecture: the middle platform provides generic transaction capabilities, while front-end business teams explore new products on top of it. The transaction middle platform includes services such as order, promotion, and payment, each with dozens of Elasticsearch indices holding billions of records.

Rapid business growth made manual support for Elasticsearch (ES) requirements untenable, leading to the creation of the Elasticsearch Chain Planning (ECP) system.

2. Current Situation and Problems

2.1 Current Overview

Index rebuilding traditionally requires a lengthy 12‑step manual process, including identifying the index, editing templates, creating new indices, updating write handlers, configuring dual‑write, exporting IDs via Shell/Python scripts, uploading data, and finally switching aliases.
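
The final step, switching aliases, is the one where a mistake is visible to readers immediately. A minimal sketch of the request body follows, assuming the switch goes through the Elasticsearch `_aliases` endpoint, which applies all listed actions in one atomic request. The alias and index names are hypothetical, and the helper is illustrative rather than ECP's own code.

```python
def build_alias_switch(alias: str, old_index: str, new_index: str) -> dict:
    """Build the body for POST /_aliases that moves `alias` to a new index.

    Both actions travel in one request, so queries against the alias
    never hit a moment where it points at no index.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: the rebuilt index order_v2 replaces order_v1.
body = build_alias_switch("order", "order_v1", "order_v2")
```

Sending the remove and add as two separate requests would leave a short window in which the alias resolves to nothing, which is why a single combined body is used here.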

2.2 Existing Problems

Hand-written ID export scripts run into memory and disk limits, and the sandbox environment disables MySQL commands for security reasons.

Index rebuilding is costly and slow: rebuilding an order index takes 5 to 7 days.

Cleaning progress is not visible, so completion time cannot be estimated.

There is no checkpoint-resume, so every failure requires manual intervention.

Mixing bulk cleaning and incremental data in the same queue can impact online services.

3. Solution Idea

Abstract Process Steps: Standardize, automate, and visualize the index refresh workflow to improve efficiency and accuracy.

System Empowerment: Provide task management features such as interruption recovery, progress visualization, QPS throttling, and heartbeat detection.

Isolation of Bulk and Incremental Data: Use tag-based traffic routing to separate cleaning of historic data from live incremental data.

Permission Control and Data Consolidation: Integrate with the company's unified permission system for managing data sources, scripts, ES clusters, templates, tasks, and operation logs.

4. ECP in Practice

4.1 What Is the ECP System?

ECP (Elasticsearch Chain Planning) is a platform for managing Elasticsearch data‑transfer chains, helping developers efficiently handle index creation, data cleaning, and index rebuilding tasks.

4.2 ECP System Functions

4.2.1 Task Management

Supports ES index creation, data cleaning, and index rebuilding tasks with modules for alias switching, dual‑write management, progress visualization, pause/resume, QPS limiting, and automatic recovery.
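
To make pause/resume, QPS limiting, and progress tracking concrete, here is a minimal sketch of a resumable cleaning loop. Everything in it is an assumption for illustration: the `CleaningTask` class, its fields, and the choice to count the limit in batches per second are hypothetical, and a real system would persist the checkpoint rather than keep it in memory.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class CleaningTask:
    """Hypothetical in-memory task state (not ECP's actual model)."""
    ids: Sequence[int]
    batch_size: int = 100
    max_qps: float = 50.0   # assumed to mean batches per second
    checkpoint: int = 0     # offset of the next unprocessed ID
    paused: bool = False

    def progress(self) -> float:
        """Fraction of IDs processed so far, between 0.0 and 1.0."""
        return self.checkpoint / len(self.ids) if self.ids else 1.0

    def run(self, handle_batch: Callable[[Sequence[int]], None]) -> None:
        min_interval = 1.0 / self.max_qps
        while self.checkpoint < len(self.ids) and not self.paused:
            started = time.monotonic()
            batch = self.ids[self.checkpoint:self.checkpoint + self.batch_size]
            handle_batch(batch)
            # Advance only after the batch succeeded, so a failure
            # leaves the checkpoint at the start of the failed batch.
            self.checkpoint += len(batch)
            elapsed = time.monotonic() - started
            if elapsed < min_interval:
                time.sleep(min_interval - elapsed)  # throttle


seen: List[int] = []
calls = {"n": 0}


def flaky_handler(batch: Sequence[int]) -> None:
    """Stand-in for the real write path; fails once on the third call."""
    calls["n"] += 1
    if calls["n"] == 3:
        raise RuntimeError("simulated failure")
    seen.extend(batch)


task = CleaningTask(ids=list(range(10)), batch_size=2, max_qps=1000.0)
try:
    task.run(flaky_handler)
except RuntimeError:
    pass
checkpoint_after_failure = task.checkpoint  # two batches done, third failed

task.run(flaky_handler)  # resumes from the checkpoint, not from zero
```

Because the checkpoint moves only after a batch succeeds, the second `run` call retries the failed batch and nothing is skipped or written twice in this sketch.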

4.2.2 Data Source and Script Management

Manages database connection info and SQL scripts for source data extraction, offering connection testing and syntax validation.

4.2.3 Cluster and Index Management

Provides an overview of each index: name, alias, disk usage, cluster, shard count, health status, and owning department.

4.2.4 Index Template Management

Centralizes management of index templates used during index creation.

4.3 Problems Solved by ECP

4.3.1 Eliminated Manual Bottlenecks in ES Index Rebuilding

ECP automates ID export, script execution, and RPC triggering, so rebuilds no longer depend on manual monitoring and retries. This makes the process both faster and more standardized.
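
One way to export IDs without loading a whole table into memory is keyset pagination: read a bounded batch, remember the last ID, and continue from there. The sketch below is an assumption about how such an export could work, not a description of ECP's implementation. It uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs anywhere, and the `orders` table is hypothetical.

```python
import sqlite3
from typing import Iterator, List


def export_ids(conn: sqlite3.Connection, table: str,
               batch_size: int = 1000) -> Iterator[List[int]]:
    """Yield primary keys in ascending batches using keyset pagination.

    Assumes an integer `id` column with positive values. `table` is
    interpolated into the SQL, so it must come from trusted config.
    """
    last_id = 0
    while True:
        rows = conn.execute(
            f"SELECT id FROM {table} WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        batch = [row[0] for row in rows]
        yield batch
        last_id = batch[-1]  # the next query starts after this key


# Demo with an in-memory table standing in for a MySQL source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO orders (id) VALUES (?)",
                 [(i,) for i in range(1, 8)])

batches = list(export_ids(conn, "orders", batch_size=3))
```

Only one batch is held in memory at a time, and `last_id` doubles as a natural checkpoint if the export is interrupted.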

4.3.2 Isolated Bulk Cleaning from Live Traffic

ECP uses tag-based routing to separate historic data cleaning from incremental data, so spikes in cleaning traffic do not affect user-facing services.
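
The article does not describe the routing mechanism itself, so the following is only a sketch of the general idea: each message carries a tag, and the tag decides which queue receives it. The tag value and queue names are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List

BULK_TAG = "bulk_clean"       # hypothetical tag set by cleaning tasks
BULK_QUEUE = "es-sync-bulk"   # hypothetical queue names
LIVE_QUEUE = "es-sync-live"


def route(message: dict) -> str:
    """Choose a queue from the message tag.

    Untagged or unknown traffic is treated as live, so only messages
    explicitly marked as bulk cleaning are diverted.
    """
    return BULK_QUEUE if message.get("tag") == BULK_TAG else LIVE_QUEUE


queues: DefaultDict[str, List[int]] = defaultdict(list)
messages = [
    {"id": 1, "tag": "bulk_clean"},
    {"id": 2},                       # live update without a tag
    {"id": 3, "tag": "bulk_clean"},
    {"id": 4, "tag": "incremental"},
]
for message in messages:
    queues[route(message)].append(message["id"])
```

With separate queues, consumers for bulk traffic can be throttled or paused without delaying the live path.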

4.3.3 Consolidated Scattered Indexes, Templates, and Scripts

ECP centralizes these assets, which cuts the time spent hunting for earlier scripts and templates and improves both response speed and knowledge retention.

4.4 Terminology

4.4.1 Task

A defined activity with clear goals, time limits, and progress tracking, such as bulk ID cleaning or index building.

4.4.2 Index Cluster and Index

An index cluster is an abstract definition and the indices are its concrete instances, much as an interface relates to its implementing classes in Java.

4.4.3 Data Source

Sources include ID source, text source, and MySQL source.

4.4.4 Script

A script pairs a MySQL source with the SQL statement used to read source data.
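
The relationship between data sources and scripts can be sketched as a small data model. The class and field names below are hypothetical and chosen only to mirror the terminology above. The one rule encoded is that a script must be bound to a MySQL source.

```python
from dataclasses import dataclass
from enum import Enum


class SourceKind(Enum):
    """The three source types named in the terminology."""
    ID = "id"
    TEXT = "text"
    MYSQL = "mysql"


@dataclass(frozen=True)
class DataSource:
    name: str
    kind: SourceKind


@dataclass(frozen=True)
class Script:
    """A SQL statement bound to the MySQL source it reads from."""
    source: DataSource
    sql: str

    def __post_init__(self) -> None:
        if self.source.kind is not SourceKind.MYSQL:
            raise ValueError("a script must be bound to a MySQL source")


# Hypothetical source and query.
orders_db = DataSource(name="orders_db", kind=SourceKind.MYSQL)
script = Script(source=orders_db, sql="SELECT id FROM orders")

try:
    Script(source=DataSource("id_file", SourceKind.ID), sql="SELECT 1")
    rejected = False
except ValueError:
    rejected = True  # non-MySQL sources cannot carry a script
```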

4.5 Overall Design

4.6 System Demonstration

4.6.1 Create Task

4.6.2 Execute Task

5. Conclusion

5.1 Summary

ECP is a platform for managing Elasticsearch data‑transfer chains, offering a more efficient and convenient data‑cleaning solution that will continue to evolve with business needs.

5.2 Roadmap

Version 1.0 is in internal testing; future plans include scheduled cleaning tasks, reindex support, alias rollback, and data consistency checks.

5.3 Acknowledgements

Thanks to teammate Yan Zhan and the transaction team for their contributions and to the low‑code platform from Zhuanzhuan FE.
