Multi‑Interest Vector Recall and PDN Models for Large‑Asset Recommendation in Alibaba Auction
Alibaba Auction improves large‑asset recommendation by deploying the multi‑interest vector recall model MIND and the two‑hop PDN model. Both are adapted to unique, high‑value items through pruned features, day‑level time weighting, and hard‑negative sampling, and their candidates are merged from rule‑based and vector‑similarity streams. The deployment boosts conversion metrics while revealing filter‑bubble concerns.
Alibaba Auction recommends high‑value assets (real estate, vehicles, land, debt) using a two‑stage pipeline where the recall stage determines the upper bound of the system. The recall must return as many potentially interesting items as possible for downstream ranking.
Large‑asset items differ from ordinary e‑commerce goods: each item is unique, has a short exhibition period (about a month), and carries a very high price. These characteristics make learning per‑item representations difficult and require a more personalized recall strategy.
This article shares practical experiences of applying the multi‑interest vector recall model MIND and the deep I2I recall model PDN to the large‑asset scenario. It describes model architectures, specific optimizations for large assets, and the resulting performance gains.
MIND Model: Consists of an Embedding & Pooling layer, a Multi‑Interest Extractor layer based on dynamic‑routing capsule networks, and a Label‑aware Attention layer. The extractor captures multiple user interests (low‑level capsules = historical behaviors, high‑level capsules = interest vectors) and routes information via learned routing coefficients. During training, Label‑aware Attention selects the interest most relevant to the target item; during online recall, all interests are used.
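As a concrete illustration, the label‑aware attention step can be sketched in a few lines of numpy. The exponent `p` is the tunable sharpening power from the MIND paper; the function name and the exact softmax form here are illustrative assumptions, not the production implementation:

```python
import numpy as np

def label_aware_attention(interests, target, p=2.0):
    # interests: (K, d) matrix of K interest capsules; target: (d,) item embedding.
    scores = interests @ target            # relevance of each interest to the label item
    logits = p * scores                    # larger p sharpens toward the best interest
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()               # softmax over the K interests
    return weights @ interests             # (d,) attended user vector
```

With a large `p`, the attention approaches hard selection of the single most relevant interest, which matches the training behavior described above.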
Large‑Asset Adaptation: Only basic category, region, and a few key attributes are used as features; adding finer‑grained features hurts performance. Time‑weighting is shifted from minute‑level buckets to day‑level buckets to reflect the longer decision cycle of asset buyers, improving offline hit‑rate@300 by ~5%.
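A minimal sketch of the day‑level bucketing, assuming an exponential decay whose half‑life (7 days here) is a hypothetical parameter rather than a value from the article:

```python
def day_bucket_weight(event_ts, now_ts, half_life_days=7.0):
    # Bucket the behavior's age at day granularity (rather than minutes),
    # then decay it: a behavior half_life_days old gets half the weight.
    age_days = int((now_ts - event_ts) // 86400)
    return 0.5 ** (age_days / half_life_days)
```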
Negative Sample Construction: Negatives come from three sources: (1) items exposed but not clicked on the same day and (2) items sampled from the same category/region serve as hard negatives, while (3) randomly sampled items serve as easy negatives. Each positive sample is paired with 5 hard and 25 easy negatives; this 1:5 hard‑to‑easy ratio yields the best offline lift (≈6.5%).
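The 5‑hard/25‑easy construction can be sketched as follows; the pool names and uniform sampling are assumptions for illustration, not the production sampler:

```python
import random

def build_negatives(pos_item, exposed_not_clicked, same_cat_region, all_items,
                    n_hard=5, n_easy=25, rng=random):
    # Hard negatives: exposed-but-unclicked items plus same-category/region items.
    hard_pool = [i for i in exposed_not_clicked + same_cat_region if i != pos_item]
    hard = rng.sample(hard_pool, min(n_hard, len(hard_pool)))
    # Easy negatives: uniform random items outside the positive and hard sets.
    easy_pool = [i for i in all_items if i != pos_item and i not in hard]
    easy = rng.sample(easy_pool, min(n_easy, len(easy_pool)))
    return hard + easy
```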
PDN Model: Decomposes recommendation into a two‑hop graph. The first hop (Trigger Net) models user interest in interacted items; the second hop (Similarity Net) models similarity between interacted items and the target item. Additional modules (Direct & Bias Net) remove user and position bias. The final score aggregates scores from all paths.
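A rough sketch of the two‑hop aggregation, using softplus to keep each path's contribution positive; the exact merge function in the PDN paper differs in detail, so treat this as an illustrative assumption:

```python
import numpy as np

def pdn_score(trigger_scores, sim_scores, direct_score=0.0, bias_score=0.0):
    # trigger_scores[j]: user's interest in interacted item j (Trigger Net).
    # sim_scores[j]: similarity of item j to the target item (Similarity Net).
    # Each two-hop path contributes softplus(trigger) * softplus(sim);
    # paths are summed, then direct/bias terms are added.
    softplus = lambda x: np.log1p(np.exp(x))
    path = np.sum(softplus(trigger_scores) * softplus(sim_scores))
    return path + direct_score + bias_score
```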
Online Serving: Only the Similarity Net is used to generate an item‑similarity index. Two candidate generation streams are merged: (a) rule‑based similarity over category, region, and price, and (b) vector similarity from MIND embeddings. For each of a user's 50 most recently interacted items, the top‑N most similar items are returned.
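The merge of the two candidate streams might look like the following sketch, assuming per‑stream scores have already been normalized to a comparable scale (the dedup‑by‑max policy is an assumption, not stated in the article):

```python
def merge_candidates(rule_cands, vector_cands, top_n=100):
    # rule_cands / vector_cands: {item_id: similarity} from each stream.
    # Keep the higher score when both streams propose the same item.
    merged = dict(rule_cands)
    for item, s in vector_cands.items():
        merged[item] = max(merged.get(item, 0.0), s)
    return sorted(merged, key=merged.get, reverse=True)[:top_n]
```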
Results: Deploying MIND and PDN in the first‑guess asset recommendation scenario significantly improves core metrics such as exposure‑to‑subscription UV conversion, exposure‑to‑bid UV conversion, and exposure‑UV value. GMV growth is mainly driven by better exploitation of existing user interests, while new‑category discovery remains a challenge.
Conclusion: Tailoring recall models to the unique traits of large assets yields notable gains, but the system now shows signs of information‑filter bubbles. Future work will focus on discovery‑oriented recall to increase user retention and GMV.
DaTaobao Tech, official account of DaTaobao Technology