Binary Code Based Hash Embedding for Efficient Deep Recommendation Models
Binary Code based Hash Embedding (BH) dramatically compresses deep recommendation model storage by converting feature IDs to binary codes and partitioning the bits into flexible blocks, yielding deterministic, collision‑free indices. It reduces embedding size by up to 1,000× while retaining about 99% of the original accuracy, making it well suited to storage‑constrained deployments.
Deep learning models deployed in web‑scale applications such as recommendation and advertising systems rely heavily on ID‑type feature embeddings. The conventional full‑embedding approach assigns a distinct vector to each feature value, which leads to prohibitive storage costs (e.g., a 5‑million‑ID vocabulary may require hundreds of gigabytes).
This paper proposes a Binary Code based Hash Embedding (BH) method that dramatically compresses embedding storage while preserving model accuracy. Each feature’s hash ID is first converted to a binary code. A flexible Code Block Strategy then partitions the binary bits into blocks; each block is decimalized to obtain an embedding index. The process is deterministic, lossless, and performed online, allowing new IDs to be handled without pre‑stored indices.
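The index‑generation step can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the 32‑bit width, 8‑bit block size, and function name `bh_indices` are assumptions chosen for readability.

```python
def bh_indices(hash_id: int, n_bits: int = 32, block_size: int = 8):
    """Binarize a hash ID, partition the bits into contiguous blocks
    (the "Succession" strategy), and decimalize each block into an
    embedding-table index. Bit widths here are illustrative."""
    bits = format(hash_id % (1 << n_bits), f"0{n_bits}b")  # binarization
    blocks = [bits[i:i + block_size] for i in range(0, n_bits, block_size)]
    return [int(b, 2) for b in blocks]  # decimalization

# A 32-bit hash ID split into four 8-bit blocks:
print(bh_indices(0xDEADBEEF))  # [222, 173, 190, 239]
```

Because the blocks together reproduce the full binary code, no information is lost at any compression setting; shrinking `block_size` only changes how many (smaller) embedding tables are addressed.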
The BH pipeline consists of three stages: (1) Feature Hash – mapping raw IDs (string, integer, etc.) to integer hash IDs; (2) Embedding Index Generation – Binarization, Code Block Strategy (e.g., Succession or Skip), and Decimalization; (3) Embedding Vector Generation – lookup of block embeddings (shared across blocks) followed by Fusion (sum‑pooling) to produce the final feature vector.
Key advantages of BH include:
Deterministic, parameter‑free index computation suitable for unseen IDs.
Flexibility: the number and size of blocks can be tuned via a hyper‑parameter to meet diverse storage constraints, from cloud servers to mobile devices.
Uniqueness: regardless of compression level, each feature value maps to a unique set of indices, avoiding the collision‑induced performance degradation of traditional hash‑mod methods.
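The uniqueness property can be checked exhaustively at a toy scale, since the block partition is invertible: the index tuple reconstructs the original binary code. The 16‑bit IDs and 4‑bit blocks below are illustrative parameters chosen so the full ID space can be enumerated.

```python
def block_indices(hash_id: int, n_bits: int = 16, block_size: int = 4):
    # Binarize, partition into contiguous 4-bit blocks, decimalize
    # (toy parameters so the whole ID space can be enumerated)
    bits = format(hash_id, f"0{n_bits}b")
    return tuple(int(bits[i:i + block_size], 2) for i in range(0, n_bits, block_size))

# Distinct hash IDs can never share all of their block indices,
# because the tuple of indices encodes the full binary code.
index_tuples = {block_indices(i) for i in range(1 << 16)}
print(len(index_tuples))  # 65536 -- one distinct tuple per ID
```

This is exactly why BH avoids the accuracy loss of hash‑mod tricks: two feature values may share individual block embeddings, but never the entire set, so their fused vectors remain distinguishable.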
Extensive experiments were conducted on three datasets (Alibaba, Amazon, MovieLens) comparing BH against Full Embedding, Hash Embedding, Multi‑Hash Embedding, and the Q‑R trick. In CTR prediction tasks, BH consistently achieved the highest AUC while reducing embedding size by up to 1,000× (maintaining ~99% of the original model's accuracy). Storage‑reduction benchmarks showed BH's model size to be one‑thousandth of the full model at the same accuracy level, far surpassing the other baselines. Both the Succession and Skip block strategies performed comparably and outperformed the best baseline (Q‑R). Convergence analysis revealed that BH converges faster and reaches lower loss than competing methods.
In summary, BH provides a lossless, highly flexible embedding indexing scheme that enables substantial model size reduction without sacrificing predictive performance, making it especially suitable for storage‑sensitive scenarios such as mobile deployment.
Alimama Tech
Official Alimama tech channel, showcasing all of Alimama's technical innovations.