Evaluating Large Language Model Item Encoders for Textual Collaborative Filtering in Recommendation Systems
This article investigates whether replacing traditional ID-based item encoders with very large language models such as GPT-3 improves recommendation performance. It reports extensive experiments on three real-world datasets, examines the performance limits of this approach and the generality of the resulting item representations, and compares it against ID-based and prompt-based methods.