Memory Structure Selection and Optimization for Hotel Query Service
This article examines how Ctrip's hotel query service selects and optimizes in‑memory cache structures—covering Java object layout, HashMap overhead, alternative collections, and various encoding techniques—to achieve high‑performance reads and updates while drastically reducing memory consumption.
The Ctrip hotel query service is a core backend component that provides a unified interface for dynamic hotel data, handling billions of cached entries across local memory and Redis while maintaining data consistency and sub‑second update latency.
Memory Structure Selection
To meet high‑frequency read requirements (thousands of reads per request) and rapid update needs (especially for price data), the service evaluates data structures based on read speed, thread safety, and memory overhead. Arrays and hash tables offer O(1) lookups, while trees provide O(log N) performance but are less suitable for large scales.
Thread‑Safe Cache Requirements
Standard HashMap is not thread‑safe; concurrent modifications can cause incorrect reads or even infinite loops. Therefore, thread‑safe structures such as ConcurrentHashMap are preferred, though they still incur significant memory overhead.
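A minimal sketch of that idea (the class and method names here are illustrative, not the service's actual API): ConcurrentHashMap gives non‑blocking reads and per‑key atomic updates, so concurrent writers cannot corrupt the table.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical price cache: roomId -> price in cents. ConcurrentHashMap
// avoids the corruption a plain HashMap can suffer under concurrent
// writes (lost entries, infinite loops during resize).
class PriceCache {
    private final ConcurrentHashMap<Integer, Long> prices = new ConcurrentHashMap<>();

    public void updatePrice(int roomId, long priceCents) {
        prices.put(roomId, priceCents); // atomic per key
    }

    // Atomic read-modify-write: keep the lower of the old and new price.
    public void offerLowerPrice(int roomId, long priceCents) {
        prices.merge(roomId, priceCents, Math::min);
    }

    public Long getPrice(int roomId) {
        return prices.get(roomId); // non-blocking read
    }
}
```

`merge` runs the remapping function atomically for that key, so two threads racing to lower the same price cannot lose an update.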
Java Object Memory Model
A Java object consists of an object header (mark word and class pointer), instance data, and alignment padding. On a 64‑bit JVM with compressed oops, a simple object with an int and a byte occupies 24 bytes.
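The 24‑byte figure can be reproduced arithmetically. The helper below only illustrates the layout rules (a 12‑byte header under compressed oops, then the fields, then padding to an 8‑byte boundary); it is not a measurement tool like JOL:

```java
// Back-of-the-envelope object sizing for a 64-bit JVM with compressed oops.
class ObjectLayout {
    // Round size up to the next multiple of the alignment (8 bytes on HotSpot).
    static int align(int size, int alignment) {
        return ((size + alignment - 1) / alignment) * alignment;
    }

    // Size of: class Sample { int a; byte b; }
    public static int sampleSize() {
        int header = 8 + 4; // 8-byte mark word + 4-byte compressed class pointer
        int fields = 4 + 1; // int + byte
        return align(header + fields, 8); // 17 bytes padded up to 24
    }
}
```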
HashMap Memory Overhead
Experiments with HashMap show that the internal structure consumes over 55 % of total memory across various data sizes. The hash bucket array includes an object header, length field, and N references, leading to substantial unused capacity due to power‑of‑two sizing and load‑factor based resizing.
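The per‑entry cost can be estimated with back‑of‑the‑envelope numbers. The sizes below are assumptions for a 64‑bit JVM with compressed oops (not measurements), and the value object itself is excluded:

```java
// Rough per-entry overhead of HashMap<Integer, V> on a 64-bit JVM with
// compressed oops. All sizes are estimates, not JOL measurements.
class HashMapOverhead {
    static int align(int n, int a) { return ((n + a - 1) / a) * a; }

    // HashMap.Node: 12-byte header + int hash + 3 compressed refs
    // (key, value, next) = 28 bytes, padded to 32.
    static final int NODE = align(12 + 4 + 4 + 4 + 4, 8);

    // Boxed Integer key: 12-byte header + 4-byte int, padded to 16.
    static final int BOXED_INT_KEY = align(12 + 4, 8);

    // Bucket array: one 4-byte reference per slot; with the default 0.75
    // load factor the table holds at least entries / 0.75 slots.
    static int tableBytesPerEntry() { return (int) Math.ceil(4 / 0.75); }

    public static int overheadPerEntry() {
        // ~54 bytes of pure structure before the value object itself
        return NODE + BOXED_INT_KEY + tableBytesPerEntry();
    }
}
```

For small values, structure alone easily dominates total memory, consistent with the >55 % figure observed in the experiments.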
Alternative Collections
Several alternative collections were benchmarked:
ConcurrentHashMap – thread‑safe with fine‑grained locking.
SparseArray – Android‑specific, lightweight replacement for HashMap.
Guava Cache – segment‑based locking for high concurrency.
fastutil – primitive‑type collections that avoid boxing overhead.
Results indicate that fastutil and SparseArray can achieve lower memory footprints than native HashMap for integer keys.
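The saving comes from storing keys and values in parallel primitive arrays with open addressing, so there are no per‑entry Node objects and no Integer boxing. A minimal sketch of that idea, in the spirit of fastutil's Int2IntOpenHashMap but not its actual implementation (fixed capacity, no resizing or removal, key 0 reserved as the empty‑slot marker):

```java
// Open-addressing int->int map: two flat primitive arrays, linear probing.
// Sketch only - fixed power-of-two capacity, no resize/remove, and the
// key 0 is reserved to mark empty slots.
class IntIntMap {
    private final int[] keys;
    private final int[] values;
    private final int mask;

    IntIntMap(int capacityPow2) { // capacity must be a power of two
        keys = new int[capacityPow2];
        values = new int[capacityPow2];
        mask = capacityPow2 - 1;
    }

    public void put(int key, int value) {
        int i = key & mask;                       // cheap hash
        while (keys[i] != 0 && keys[i] != key)    // linear probing
            i = (i + 1) & mask;
        keys[i] = key;
        values[i] = value;
    }

    public int get(int key, int defaultValue) {
        int i = key & mask;
        while (keys[i] != 0) {
            if (keys[i] == key) return values[i];
            i = (i + 1) & mask;
        }
        return defaultValue;
    }
}
```

Each entry costs 8 bytes of array space plus load‑factor slack, versus ~50 bytes of Node‑plus‑boxing overhead in a HashMap<Integer, Integer>.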
Encoding Techniques for Further Compression
Various lossless encodings were applied to reduce per‑entry size:
Bitmap (BitSet) – stores boolean or enumerated flags as single bits.
Run‑Length Encoding (RLE) – compresses consecutive identical values.
Dictionary encoding – deduplicates repeated strings by storing a reference.
Delta (difference) encoding – converts sparse keys (e.g., dates) into compact integer offsets.
Example code for an enum used in bitmap encoding:

```java
enum Season { Spring, Summer, Fall, Winter; }
```

Case Study 1: Room Basic Information
Over a billion room records were compressed using bitmap encoding for boolean/enumerated fields and dictionary encoding for highly duplicated entities, reducing the cache size to less than 2 % of its original footprint.
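The bitmap step can be sketched with the Season enum above: four values need only 2 bits each, so a BitSet can hold four seasons per byte instead of one 4‑byte enum reference each. (The helper class and method names below are illustrative.)

```java
import java.util.BitSet;

// Pack enumerated room attributes as 2-bit fields in a BitSet.
// Season mirrors the enum shown earlier; its ordinal (0..3) fits in 2 bits.
class SeasonBitmap {
    enum Season { Spring, Summer, Fall, Winter }

    // One season per 2-bit slot: slot i occupies bits 2i and 2i+1.
    public static BitSet pack(Season[] seasons) {
        BitSet bits = new BitSet(seasons.length * 2);
        for (int i = 0; i < seasons.length; i++) {
            int ord = seasons[i].ordinal();
            if ((ord & 1) != 0) bits.set(i * 2);
            if ((ord & 2) != 0) bits.set(i * 2 + 1);
        }
        return bits;
    }

    public static Season unpack(BitSet bits, int index) {
        int ord = (bits.get(index * 2) ? 1 : 0)
                | (bits.get(index * 2 + 1) ? 2 : 0);
        return Season.values()[ord];
    }
}
```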
Case Study 2: Daily Price Information
Price data per room per day was optimized by:
Dictionary‑encoding distinct price values.
Delta‑encoding dates to integer offsets.
Bitmap‑encoding the price index (e.g., 2 bits for three possible prices).
Applying RLE to trailing repeated prices.
These steps shrank a single price object from several hundred bytes to 31 bytes; across the production cache, the overall reduction was roughly 60 %.
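The steps above can be sketched as a round‑trip codec. Everything here is illustrative, not Ctrip's production format: dates are handled implicitly as array offsets from a base date (the delta step), distinct prices go into a small dictionary, each day stores a 2‑bit dictionary index, and a trailing run of identical prices is cut off and kept only as a run length (the RLE step).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative daily-price codec: dictionary + 2-bit indexes + trailing RLE.
class DailyPriceCodec {

    static final class Encoded {
        final long[] dict;     // distinct prices in cents (at most 4 with 2-bit indexes)
        final byte[] packed;   // 2-bit dictionary indexes, 4 per byte
        final int packedDays;  // days stored explicitly
        final int trailingRun; // extra days repeating the last explicit price
        Encoded(long[] dict, byte[] packed, int packedDays, int trailingRun) {
            this.dict = dict; this.packed = packed;
            this.packedDays = packedDays; this.trailingRun = trailingRun;
        }
    }

    public static Encoded encode(long[] dailyPrices) {
        // Dictionary encoding: map each day's price to a small index.
        List<Long> dict = new ArrayList<>();
        int[] idx = new int[dailyPrices.length];
        for (int d = 0; d < dailyPrices.length; d++) {
            int i = dict.indexOf(dailyPrices[d]);
            if (i < 0) { i = dict.size(); dict.add(dailyPrices[d]); }
            idx[d] = i;
        }
        // RLE: drop the trailing run of identical prices.
        int end = dailyPrices.length;
        while (end > 1 && idx[end - 1] == idx[end - 2]) end--;
        int trailingRun = dailyPrices.length - end;
        // Bitmap encoding: pack 2-bit indexes, 4 per byte.
        byte[] packed = new byte[(end + 3) / 4];
        for (int d = 0; d < end; d++)
            packed[d / 4] |= idx[d] << ((d % 4) * 2);
        long[] dictArr = new long[dict.size()];
        for (int i = 0; i < dictArr.length; i++) dictArr[i] = dict.get(i);
        return new Encoded(dictArr, packed, end, trailingRun);
    }

    public static long[] decode(Encoded e) {
        long[] out = new long[e.packedDays + e.trailingRun];
        for (int d = 0; d < e.packedDays; d++) {
            int i = (e.packed[d / 4] >> ((d % 4) * 2)) & 0b11;
            out[d] = e.dict[i];
        }
        for (int d = e.packedDays; d < out.length; d++)
            out[d] = out[e.packedDays - 1]; // expand the trailing run
        return out;
    }
}
```

A 90‑day window with three distinct prices costs one small dictionary plus at most 23 bytes of indexes, instead of 90 full price objects.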
Conclusion
By carefully selecting thread‑safe map structures, understanding Java object layout, and applying appropriate encoding schemes, large‑scale in‑memory caches can be dramatically reduced in size while preserving high read/write performance, enabling cost‑effective scaling of backend services.
Ctrip Technology