
Efficient High‑Concurrency Data Retrieval Using Inverted Index and Bitmap Techniques

This article explores how to achieve fast, scalable data retrieval in million‑level high‑concurrency scenarios by replacing naïve full‑combination rule matching with column‑wise inverted indexes and bitmap operations, dramatically reducing time complexity and improving stability while leveraging RoaringBitmap compression for space efficiency.

JD Tech

Background: The Promise order-control system serves multiple touchpoints before and after order placement and handles the highest concurrency in the system (over one million TPS at peak). The core technical challenge is matching each user request quickly and correctly against a rule library containing hundreds of thousands of rules.

Naïve Solution: Cache the rule library row by row in Redis (key = rule condition, value = rule result). For each request, generate every possible parameter combination (each of the n parameters is either its concrete value or a wildcard) and attempt to match them, giving O(2ⁿ) time complexity; with n ≈ 12 that is up to 4096 lookups per request.
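The naïve lookup can be sketched as follows. This is a minimal Python sketch: the `rule_cache` dict stands in for Redis, and the rule keys, `|` separator, and wildcard convention are all illustrative assumptions, not the production format.

```python
from itertools import combinations

# Hypothetical rule cache: key = "|"-joined rule condition, value = rule result.
# In production this would live in Redis; a dict stands in here.
rule_cache = {
    "cn|beijing|*": "same-day",
    "cn|*|vip": "priority",
    "*|*|*": "default",
}

def match_naive(params):
    """Try every combination of concrete values vs. wildcard '*'.

    With n parameters there are 2**n candidate keys, so n = 12 means
    up to 4096 lookups per request.
    """
    n = len(params)
    # Enumerate which positions keep their concrete value (the rest become '*'),
    # preferring more specific matches first.
    for keep in range(n, -1, -1):
        for idx in combinations(range(n), keep):
            key = "|".join(params[i] if i in idx else "*" for i in range(n))
            if key in rule_cache:
                return rule_cache[key]
    return None

print(match_naive(["cn", "beijing", "regular"]))  # prints "same-day"
```

Even with every key cached, the request still pays for the combinatorial enumeration itself, which is what the inverted-index approach below removes.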

Figure 1.

New Solution: Adopt column-wise inverted indexes combined with bitwise operations. Each column's values are mapped to posting lists (row IDs), and a bitmap is generated for each posting list. Query parameters are used as keys to retrieve the matching bitmaps. A column-level OR folds in rules whose value for that column is empty (i.e., matches anything), and inter-column AND operations yield the candidate rule set.

Figure 2.

Algorithm Details:

1. Pre-compute each column's inverted index and the bitmap for each posting list.

2. On each request, use the request parameters as keys to fetch the matching bitmaps.

3. For each column, OR the fetched bitmap with the column's empty-value bitmap; then AND across columns to obtain the candidate rule set.

4. Rank candidate rules by business priority and iteratively apply AND operations to find the optimal rule.
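The steps above can be sketched end to end in Python, using plain ints as bitmaps (bit i set means rule row i matches). The column names and rule rows are illustrative assumptions:

```python
# Minimal sketch of the column-wise inverted-index scheme.
rules = [
    # (region, city, user_level) -- None means "any value" for that column
    ("cn", "beijing", None),   # row 0
    ("cn", None, "vip"),       # row 1
    (None, None, None),        # row 2 (fallback)
]
columns = ["region", "city", "user_level"]

# Pre-compute: for each column, value -> bitmap of rows holding that value,
# plus a bitmap of rows where the column is empty (matches anything).
index = {col: {} for col in columns}
empty = {col: 0 for col in columns}
for row, rule in enumerate(rules):
    for col, value in zip(columns, rule):
        if value is None:
            empty[col] |= 1 << row
        else:
            index[col][value] = index[col].get(value, 0) | (1 << row)

def match(request):
    """OR within each column (value bitmap | empty bitmap), AND across columns."""
    candidates = (1 << len(rules)) - 1            # start with all rows
    for col in columns:
        candidates &= index[col].get(request[col], 0) | empty[col]
    # Candidate rows; here the lowest row id stands in for highest priority.
    return [row for row in range(len(rules)) if candidates >> row & 1]

print(match({"region": "cn", "city": "beijing", "user_level": "regular"}))  # [0, 2]
```

Each request now does one bitmap fetch, one OR, and one AND per column, independent of how many parameter combinations exist.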

Figure 3.

Complexity Analysis: The new approach reduces time complexity from exponential O(2ⁿ) to linear, roughly 3n ≈ O(n) (one bitmap fetch, one OR, and one AND per column). Space complexity increases because posting lists are stored for every column, but this is mitigated by compressing the bitmaps with RoaringBitmap, achieving up to 100× space savings.
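As a quick arithmetic check of the two operation counts at n = 12:

```python
n = 12
naive_ops = 2 ** n    # full combination enumeration: one lookup per key
indexed_ops = 3 * n   # per column: one fetch, one OR, one AND
print(naive_ops, indexed_ops)  # 4096 36
```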

Engineering Considerations – Bitmap Compression: Sparse bitmaps waste memory, since the bitmap must span the full row-id range regardless of how few bits are set. RoaringBitmap stores only populated blocks, drastically reducing usage (e.g., a sparse 10 million-bit bitmap drops from ~1.2 MB uncompressed to 144 bytes).
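The container idea behind RoaringBitmap can be illustrated with a toy sketch. This is not the real library's API: `TinyRoaring` is a hypothetical name, and while the 2¹⁶ bucketing and the 4096 array-container threshold mirror the published design, run containers and all real optimizations are omitted.

```python
# Sketch of RoaringBitmap's core idea: split the 32-bit value space into
# 2**16 buckets and store each populated bucket as either a sorted array of
# 16-bit offsets (sparse) or a packed 65536-bit bitmap (dense).
ARRAY_LIMIT = 4096  # above this, an array container costs more than a bitmap

class TinyRoaring:
    def __init__(self):
        self.containers = {}  # high 16 bits -> sorted list OR int bitmap

    def add(self, x):
        hi, lo = x >> 16, x & 0xFFFF
        c = self.containers.get(hi)
        if c is None:
            self.containers[hi] = [lo]
        elif isinstance(c, list):
            if lo not in c:
                c.append(lo)
                c.sort()
                if len(c) > ARRAY_LIMIT:      # promote to a bitmap container
                    bits = 0
                    for v in c:
                        bits |= 1 << v
                    self.containers[hi] = bits
        else:
            self.containers[hi] = c | (1 << lo)

    def __contains__(self, x):
        hi, lo = x >> 16, x & 0xFFFF
        c = self.containers.get(hi)
        if c is None:
            return False
        if isinstance(c, list):
            return lo in c
        return bool(c >> lo & 1)

b = TinyRoaring()
b.add(7)
b.add(10_000_000)  # only two tiny containers allocated, not a 10M-bit array
print(7 in b, 10_000_000 in b, 8 in b)  # True True False
```

The point of the sketch: membership in a bitmap spanning ten million ids costs two small containers here, which is why sparse rule bitmaps compress so well.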

Figure 4.

Applicability: The method works well for equality queries with simple posting lists, but it is unsuitable for range or LIKE queries, since those would require exhaustive enumeration of every matching key.

Other Optimizations: Structures such as skip lists or hash tables over posting lists (e.g., Elasticsearch's skip-list-based postings) can further accelerate intersection, advancing through lists in roughly logarithmic rather than linear steps.
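The skip-ahead idea can be sketched with binary search over sorted posting lists. This is illustrative only; Elasticsearch's actual implementation stores skip data inside its on-disk postings format rather than using `bisect`:

```python
from bisect import bisect_left

def intersect_sorted(a, b):
    """Intersect two sorted posting lists, jumping ahead with binary search
    instead of scanning element by element (the idea behind skip pointers)."""
    if len(a) > len(b):
        a, b = b, a           # iterate the shorter list, search in the longer
    out, lo = [], 0
    for x in a:
        lo = bisect_left(b, x, lo)   # skip ahead in O(log n) per step
        if lo == len(b):
            break
        if b[lo] == x:
            out.append(x)
    return out

print(intersect_sorted([2, 5, 9, 40], [1, 2, 3, 9, 10, 40, 41]))  # [2, 9, 40]
```

Iterating the shorter list and searching in the longer one keeps the cost near `len(short) * log(len(long))`, which matters when one column's posting list is tiny and another's is huge.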

Figure 5.

Tags: rule engine, performance optimization, high concurrency, Bitmap, RoaringBitmap, Inverted Index