Understanding Bloom Filters and Their Support in Redis

This article explains the Bloom filter, a probabilistic data structure: its characteristics, typical use cases such as preventing cache penetration, and the steps of a simple implementation. It then shows how to use a Redis-backed Bloom filter from Java through the Redisson client, and summarizes the practical benefits.

Full-Stack Internet Architecture

Dec 18, 2024

A Bloom filter is a space-efficient probabilistic data structure for testing set membership. A lookup has only two possible answers: the element is definitely not in the set, or it is possibly in the set.

Despite the seemingly vague "possibly present" result, this property is valuable in scenarios like checking if a username is already registered, whether a user has seen a specific news item, or if a key exists in cache.

The implementation is straightforward. Initialize a fixed-size bit array with every bit set to 0. To add an element, hash it with several hash functions, take each result modulo the array length, and set the bits at those positions to 1. To query an element, compute the same positions: if any of those bits is 0, the element is definitely absent; if all of them are 1, the element may be present. That second answer can be a false positive, because other elements may already have set the same bits through hash collisions.

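The illustrative example adds the elements "Tom" and "John" to an 8-bit array. Some of their hash positions overlap, which shows how different elements can end up sharing bits. Querying "Eric" finds at least one 0 bit, so "Eric" is definitely absent. Querying "Jack" finds only bits that other elements have already set to 1, so "Jack" is reported as possibly present even though it was never added: a false positive.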
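
The add and query steps described above can be sketched in a few lines of Java. This is a minimal in-memory illustration, not how Redisson or any other library implements it. The class name SimpleBloomFilter, the FNV-1a helper, and the choice to derive the bit positions from two base hashes (double hashing) are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical, minimal Bloom filter for illustration only.
class SimpleBloomFilter {
    private final BitSet bits;     // the bit array, initially all 0
    private final int size;        // number of bits in the array
    private final int hashCount;   // number of bit positions per element

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // FNV-1a over the UTF-8 bytes, used as a second base hash.
    private static int fnv1a(String s) {
        int h = 0x811C9DC5;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    // Position of the i-th hash: combine two base hashes, then take it modulo the array length.
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = fnv1a(item);
        return Math.floorMod(h1 + i * h2, size);
    }

    // Adding sets every position for the element to 1.
    void add(String item) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(item, i));
        }
    }

    // Returns false if the element is definitely absent, true if it is possibly present.
    boolean mightContain(String item) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(item, i))) {
                return false; // one 0 bit is enough to rule the element out
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);
        filter.add("Tom");
        filter.add("John");
        System.out.println(filter.mightContain("Tom"));  // true: added elements are never missed
        System.out.println(filter.mightContain("Eric")); // false unless its bits collide with others
    }
}
```

The method is named mightContain rather than contains to keep the "possibly present" meaning visible at the call site.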

Redisson, a Java client for Redis, provides a Bloom filter type, RBloomFilter, that keeps its state in Redis. The following code obtains a Bloom filter named userList, initializes it for an expected 1,000,000 elements with a 1% false-positive rate, adds an element, and checks for its existence:

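import org.redisson.Redisson;
import org.redisson.api.RBloomFilter;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

// Assumption: a Redis server is reachable at 127.0.0.1:6379; adjust as needed
Config config = new Config();
config.useSingleServer().setAddress("redis://127.0.0.1:6379");
RedissonClient redissonClient = Redisson.create(config);

// Get a handle to the Bloom filter stored under the name "userList"
RBloomFilter<String> bloomFilter = redissonClient.getBloomFilter("userList");
// Initialize: expected 1,000,000 elements, 1% false-positive rate
bloomFilter.tryInit(1000000L, 0.01);
// Add an element
bloomFilter.add("1234567");
// true: an added element is always reported as present
System.out.println(bloomFilter.contains("1234567"));
// false, unless this lookup happens to be a false positive
System.out.println(bloomFilter.contains("12345"));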
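
To get a feel for what the two initialization parameters imply, the textbook sizing formulas for a Bloom filter can be evaluated directly: for n expected elements and a target false-positive rate p, the bit count is m = -n ln(p) / (ln 2)^2 and the number of hash functions is k = (m / n) ln 2. The sketch below applies them to the values used above. The class and method names are hypothetical, and whether Redisson uses exactly these values internally is an assumption, not something the article states.

```java
// Hypothetical helper that evaluates the standard Bloom filter sizing formulas.
class BloomSizing {
    // Bits needed for n expected elements at false-positive rate p:
    // m = -n * ln(p) / (ln 2)^2, rounded up.
    static long optimalBits(long n, double p) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    // Number of hash functions that minimizes false positives:
    // k = (m / n) * ln 2, rounded to the nearest integer and at least 1.
    static int optimalHashCount(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 1_000_000L; // expected elements, as in tryInit above
        double p = 0.01;     // target false-positive rate, as in tryInit above
        long m = optimalBits(n, p);
        int k = optimalHashCount(n, m);
        System.out.println("bits = " + m + " (about " + (m / 8 / 1024) + " KiB)");
        System.out.println("hash functions = " + k);
    }
}
```

For 1,000,000 elements at a 1% rate this works out to roughly 9.6 million bits (about 1.2 MB) and 7 hash functions, which is why a Bloom filter is so much cheaper than storing the keys themselves.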

In practice, whenever a record is written to a persistent store (e.g., MySQL), its key is also added to the Bloom filter. Later requests check the filter first and skip the database or cache entirely when the filter reports the key as absent.

To mitigate cache penetration, the typical workflow is: (1) create a Bloom filter and add new data to it after persisting; (2) on a read request, query the Bloom filter first—if the element is absent, return immediately; (3) if the filter indicates possible presence, proceed to the cache/database as usual.
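
The three steps above can be sketched with in-memory stand-ins. Everything here is hypothetical: the class GuardedUserStore, the two HashMaps that play the roles of the cache and the database, and the deliberately tiny two-hash filter. A real service would use a shared filter such as Redisson's RBloomFilter instead.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical read/write path guarded by a Bloom filter.
class GuardedUserStore {
    // In-memory stand-ins for a real cache and a real database.
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> database = new HashMap<>();
    int databaseReads = 0; // how many reads actually reached the database

    // A deliberately small filter: 2 bit positions per key in a 4096-bit array.
    private static final int BITS = 4096;
    private final BitSet filter = new BitSet(BITS);

    private static int[] positions(String key) {
        int h = key.hashCode();
        return new int[] {
            Math.floorMod(h, BITS),
            Math.floorMod(h * 31 + 17, BITS)
        };
    }

    // Step 1: persist the record, then register its key in the filter.
    void save(String key, String value) {
        database.put(key, value);
        for (int p : positions(key)) {
            filter.set(p);
        }
    }

    String find(String key) {
        // Step 2: a 0 bit means the key was never saved, so return immediately.
        for (int p : positions(key)) {
            if (!filter.get(p)) {
                return null;
            }
        }
        // Step 3: possibly present, so fall through to the cache and then the database.
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        databaseReads++;
        String value = database.get(key);
        if (value != null) {
            cache.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        GuardedUserStore store = new GuardedUserStore();
        store.save("1234567", "Alice");
        System.out.println(store.find("1234567")); // Alice, read from the database once
        System.out.println(store.find("1234567")); // Alice, now served from the cache
        System.out.println(store.find("no-such-user")); // null
        System.out.println("database reads: " + store.databaseReads);
    }
}
```

A false positive in this design only costs one extra cache or database lookup that finds nothing; it never returns wrong data, which is why the "possibly present" answer is acceptable here.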

In summary, although Bloom filters give probabilistic answers, they are highly effective in specific scenarios such as preventing cache penetration, and Redisson's RBloomFilter makes a Redis-backed filter straightforward to use from Java.
