JD Cloud Developers
Apr 27, 2025 · Artificial Intelligence
Overcoming the Hourglass Effect in Residual Quantization for Generative Retrieval
This paper investigates the “hourglass” phenomenon in residual-quantized semantic identifiers for generative search and recommendation. It shows that tokens concentrate in the intermediate codebook layers, which causes path sparsity and long-tail distributions, and it proposes two remedies, heuristic layer removal and adaptive token pruning, that markedly improve model performance.
generative retrieval · hourglass phenomenon · residual quantization
13 min read