Milvus Vector Database Performance Testing and Architecture Analysis
The author stress‑tested Milvus 2.1.4’s cloud‑native, micro‑service architecture—detailing its write and search paths, evaluating FLAT index performance across 100 K to 10 M 512‑dim vectors, uncovering scaling, scheduler, segment‑rebalance, and upgrade issues, and concluding the system is robust but benefits from graph‑based indexes and Helm‑driven scaling.
The author recently conducted stress testing on Milvus, a cloud‑native vector database, and performed a detailed walkthrough of its source code and documentation.
Vector search, also known as Approximate Nearest Neighbor Search (ANNS), aims to find the N nearest vectors to a target vector. Common index families include tree‑based, graph‑based, hash‑based, and quantization‑based, with graph‑based indexes offering high recall and performance.
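Before introducing approximate indexes, it helps to see what the exact baseline looks like. The following is a minimal NumPy sketch of exhaustive (FLAT-style) search by L2 distance, the same brute-force strategy the FLAT index uses; the function name and toy data are illustrative, not Milvus code.

```python
import numpy as np

def flat_search(base: np.ndarray, query: np.ndarray, top_k: int):
    """Exhaustive (FLAT-style) nearest-neighbor search by L2 distance."""
    # Squared L2 distance from the query to every base vector.
    dists = np.sum((base - query) ** 2, axis=1)
    # Indices of the top_k smallest distances, ascending.
    idx = np.argsort(dists)[:top_k]
    return idx, dists[idx]

# Toy example: 1,000 base vectors of dim 512, one slightly perturbed query.
rng = np.random.default_rng(0)
base = rng.standard_normal((1000, 512)).astype(np.float32)
query = base[42] + 0.01 * rng.standard_normal(512).astype(np.float32)
ids, dists = flat_search(base, query, top_k=5)
```

The cost is linear in the number of base vectors, which is why FLAT gives perfect recall but degrades as collections grow; graph-based indexes trade a little recall for sublinear search.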
Milvus (v2.0+) is a cloud‑native vector database that supports vector insertion and ANNS. It integrates popular similarity libraries such as Faiss and SPTAG, and orchestrates data and compute resources to achieve optimal search performance.
The system architecture consists of many micro‑services: Proxy (client entry), RootCoord (global metadata and timestamp allocation), DataCoord & DataNode (segment management and data persistence), IndexCoord & IndexNode (index creation), QueryCoord & QueryNode (search execution), MetaStore (metadata stored in etcd), Log Broker (Pulsar‑based message queue), and Object Store (MinIO for data and index files). Communication among services follows the patterns shown in the architecture diagram.
Write path: the client sends data to Proxy, which produces messages to the physical channel of the Message Broker; DataNode consumes, serializes, and stores the data to the Object Store, then notifies DataCoord. Search path: Proxy forwards ANNS requests to the shard-leader QueryNode, which distributes the request to the relevant QueryNodes based on segment distribution; each node's partial results are then reduced into a global top-k and returned to the client.
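The reduce step at the end of the search path can be sketched as a k-way merge of per-node partial results. This is an illustrative simplification, not Milvus's actual reduce code; the two "nodes" and their (distance, id) lists are hypothetical.

```python
import heapq

def reduce_topk(partial_results, top_k):
    """Merge per-QueryNode partial top-k lists of (distance, id) tuples
    into a global top-k. Each partial list is assumed sorted ascending."""
    merged = heapq.merge(*partial_results)  # lazy k-way merge
    seen, out = set(), []
    for dist, vec_id in merged:
        if vec_id in seen:  # the same segment may be answered by two replicas
            continue
        seen.add(vec_id)
        out.append((dist, vec_id))
        if len(out) == top_k:
            break
    return out

# Partial results from two hypothetical QueryNodes:
node_a = [(0.10, 7), (0.40, 3), (0.90, 9)]
node_b = [(0.20, 5), (0.40, 3), (0.80, 1)]
top = reduce_topk([node_a, node_b], top_k=3)
```

Because each partial list is already sorted, the merge touches only as many entries as it needs to fill the global top-k.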
Performance tests were run on Milvus 2.1.4 with 512-dim vectors using the FLAT index. Test configurations covered 100 K, 1 M, and 10 M vectors on clusters of varying CPU and memory. Sample results: with 100 K vectors the system achieved up to 1,489 QPS (62 ms 99th-percentile latency) on two nodes of 16 CPU and 16 Gi each; with 10 M vectors it reached only 20 QPS at 1.98 s latency on two nodes of 16 CPU and 32 Gi each.
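For readers reproducing such runs, the reported figures combine two measurements: throughput (completed requests over wall time) and tail latency (a high percentile of per-request latency). A small helper, with entirely hypothetical sample data, might look like:

```python
import numpy as np

def summarize_latencies(latencies_ms, wall_time_s):
    """Summarize a load-test run: throughput (QPS) and p99 latency (ms)."""
    lat = np.asarray(latencies_ms, dtype=np.float64)
    qps = len(lat) / wall_time_s          # completed requests per second
    p99 = float(np.percentile(lat, 99))   # 99th-percentile latency
    return qps, p99

# Hypothetical run: 1,000 requests completed in 10 s,
# most around 20 ms with a 2% slower tail.
rng = np.random.default_rng(1)
lats = np.concatenate([rng.uniform(15, 25, 980), rng.uniform(50, 70, 20)])
qps, p99 = summarize_latencies(lats, wall_time_s=10.0)
```

Reporting p99 alongside QPS matters because averages hide exactly the tail behavior that degraded in the 10 M-vector runs.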
Key issues identified:
1. QPS plateaued because the `scheduler.cpuRatio` parameter caps the CPU share each search task may use.
2. After scaling out query nodes, existing segments were not rebalanced onto the new nodes automatically.
3. Prolonged high-rate inserts left many segments in the growing state, which are searched by brute force and slowed queries.
4. Data incompatibility appeared after upgrading between Milvus versions.
5. FLAT index performance degraded at tens of millions of vectors, suggesting a switch to graph-based indexes.
6. Scaling by editing Deployments directly caused configuration drift; Helm-based scaling is recommended instead.
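For issue (6), the idea behind Helm-driven scaling is to record the replica count in the release's values rather than patching Deployments by hand. A sketch of such a command follows; the release name, chart reference, and values key are illustrative and should be checked against your own chart's `values.yaml`.

```shell
# Scale QueryNodes through the Helm release so the change is captured in
# the chart values and survives future `helm upgrade` runs, instead of
# drifting when the Deployment is edited directly.
helm upgrade my-release milvus/milvus \
  --reuse-values \
  --set queryNode.replicas=4
```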
These problems are being tracked with the Milvus community, and workarounds or future releases are expected to address them.
Conclusion: Milvus satisfies the current business scenario, and its cloud‑native, distributed design (read/write separation, compute‑storage separation) is robust. Remaining edge‑case issues will be further investigated, and future tests will explore graph indexes for higher performance.
DeWu Technology