
Optimizing Lucene Stored Fields Access with a Custom Codec and In‑Memory Caching

This article describes how the Qunar hotel search team reduced Lucene stored‑fields deserialization overhead and GC pressure by implementing a custom Codec that caches stored fields in memory, redesigning the storage format, and evaluating the performance and space benefits of the approach.

Qunar Tech Salon

Background: Qunar hotel search and suggest services, built on Lucene, suffered from slow response times and frequent Young GC when retrieving large numbers of stored fields because each document’s stored fields required decompression and deserialization into Java objects.

The team analyzed the problem, noting that Lucene’s StoredFields are row‑oriented and heavily compressed, leading to high CPU and memory overhead during bulk retrieval. They considered alternatives such as disabling compression or using DocValues, but concluded that a custom in‑memory cache for stored fields would best meet their needs.

Lucene Custom Codec Mechanism: Lucene uses a codec API to read and write index files, separating the storage format from indexing and search logic. By extending FilterCodec, the team only had to override the StoredFieldsFormat implementation while delegating all other formats to the default Lucene80 codec.
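The wiring this describes can be sketched as follows. This is a minimal illustration assuming Lucene 8.x; CachedStoredFieldsCodec and CachedStoredFieldsFormat are hypothetical names standing in for the team's classes, which the article does not give:

```java
import org.apache.lucene.codecs.FilterCodec;
import org.apache.lucene.codecs.StoredFieldsFormat;
import org.apache.lucene.codecs.lucene80.Lucene80Codec;

// Delegates everything to the default Lucene80 codec except stored fields.
public class CachedStoredFieldsCodec extends FilterCodec {

    // Hypothetical format whose reader caches decoded stored fields in memory.
    private final StoredFieldsFormat cachedFormat = new CachedStoredFieldsFormat();

    public CachedStoredFieldsCodec() {
        // The codec name is recorded in the segment metadata, so the same
        // codec must be registered (via SPI) on every node that opens the index.
        super("CachedStoredFieldsCodec", new Lucene80Codec());
    }

    @Override
    public StoredFieldsFormat storedFieldsFormat() {
        return cachedFormat;
    }
}
```

FilterCodec is Lucene's intended extension point for exactly this pattern: one format is swapped out while the delegate codec handles postings, doc values, norms, and the rest.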

Custom StoredFieldsFormat Implementation: The solution loads all stored‑field data of a segment into memory once, avoiding repeated deserialization. The primary node builds the index normally with a custom codec; the replica node reads the segment, uses a custom StoredFieldsReader that caches data in memory, and serves subsequent document requests directly from the cache.
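A stripped-down, Lucene-free sketch of the replica-side idea: the decode step (standing in here for Lucene's decompression and deserialization) runs exactly once per document when the segment is opened, and every later request is a plain in-memory lookup. All names are illustrative, not the team's actual classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Simplified sketch: eagerly decode every document's stored fields when the
// segment is opened, then serve all subsequent lookups from memory.
final class CachedSegmentReader {
    private final List<List<String>> cache; // one decoded field list per docId

    CachedSegmentReader(int maxDoc, IntFunction<List<String>> decode) {
        cache = new ArrayList<>(maxDoc);
        for (int docId = 0; docId < maxDoc; docId++) {
            cache.add(decode.apply(docId)); // pay the decode cost exactly once
        }
    }

    // Subsequent requests never touch the on-disk stored fields again.
    List<String> document(int docId) {
        return cache.get(docId);
    }
}
```

Because the cache is built once per segment open, the repeated per-query allocation of intermediate Java objects disappears, which is what drives the Young GC improvement reported below.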

In‑Memory Storage Structure: To keep memory usage reasonable, the team chose a column‑oriented layout for cached fields. For multi‑valued fields they store values in a contiguous value array and use an offset array to locate each document’s slice. String values are interned per segment to eliminate duplicates, and a succinct rank/select bit‑vector further reduces the offset storage by ~20%.
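The layout described above can be illustrated with a small self-contained sketch (class and method names are invented for illustration): a flat value array shared by all documents, an offset array marking each document's slice, and per-segment interning so duplicate strings are stored once. The rank/select bit-vector refinement of the offset array is omitted for brevity:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Column-oriented cache for one multi-valued string field (illustrative sketch).
final class MultiValuedColumn {
    private final String[] values; // all documents' values, concatenated
    private final int[] offsets;   // offsets[d]..offsets[d+1] bounds doc d's slice

    MultiValuedColumn(List<List<String>> perDocValues) {
        Map<String, String> interned = new HashMap<>(); // per-segment dedup
        offsets = new int[perDocValues.size() + 1];
        List<String> flat = new ArrayList<>();
        int pos = 0;
        for (int d = 0; d < perDocValues.size(); d++) {
            offsets[d] = pos;
            for (String v : perDocValues.get(d)) {
                // Equal strings share one instance across the whole segment.
                flat.add(interned.computeIfAbsent(v, s -> s));
                pos++;
            }
        }
        offsets[perDocValues.size()] = pos;
        values = flat.toArray(new String[0]);
    }

    String[] get(int doc) {
        return Arrays.copyOfRange(values, offsets[doc], offsets[doc + 1]);
    }
}
```

Compared with one Java bean per document, this stores two flat arrays per field, so per-object header and reference overhead is amortized across the whole segment.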

Results: The optimization reduced Young GC frequency from 2‑3 times per second to once every 9‑10 seconds and cut response latency by over 80%. Memory consumption dropped to about 65% of a naïve Java‑bean approach while only incurring a ~10% speed penalty for access.

Future Work: The authors plan to explore off‑heap storage to further lower heap pressure and to implement per‑field caching so that only frequently accessed fields are kept in memory.

Tags: Java, performance, caching, Lucene, codec, stored fields
Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
