
Wang Zhe’s Machine Learning Notes – Answers to Frequently Asked Questions on Recommendation Systems

In this article, Wang Zhe addresses fifteen common questions about recommendation systems, covering topics such as building cross‑domain knowledge, the role of deep reinforcement learning, handling sparse or low‑sample data, offline‑online evaluation, knowledge graphs, graph neural networks, model interpretability, large‑scale ID embedding, and career advice for engineers.

DataFunTalk

1. How do you quickly build enough knowledge to lead both NLP and CV teams when you are unfamiliar with CV? Wang shares his experience transitioning from computational advertising to recommendation systems: read extensively, organize what you learn, leverage the machine‑learning fundamentals the fields share, focus on mainstream CV methods and tools, and lead at a high level rather than diving into deep technical detail.

2. Do you see deep reinforcement learning (RL) having a promising future in recommendation? He is very optimistic: RL can increase online learning frequency and improve real‑time adaptability, though it requires tight integration with system architecture, data pipelines, and models. This makes it a key direction for future recommender systems.

3. How to handle (a) few samples with many features, and (b) extremely sparse features? For (a) he suggests tree‑based or traditional classifiers rather than deep models; for (b) large‑scale one‑hot embeddings can still learn useful representations, but with few samples the problem remains challenging.
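The "large‑scale one‑hot embeddings" approach for (b) can be sketched with the hashing trick: rather than allocating one embedding row per raw feature value, values are hashed into a fixed number of buckets so even a huge, sparse vocabulary fits a bounded parameter budget. All names and sizes below are illustrative assumptions, not anything from the talk.

```python
import numpy as np

# Sketch: hashing-trick embeddings for extremely sparse categorical features.
# Each raw value maps to one of NUM_BUCKETS rows, so rare values share capacity.
EMBED_DIM = 8
NUM_BUCKETS = 1000  # far smaller than the raw vocabulary

rng = np.random.default_rng(42)
embedding_table = rng.normal(0, 0.01, size=(NUM_BUCKETS, EMBED_DIM))

def embed(feature_value: str) -> np.ndarray:
    """Map a raw sparse feature value to a dense vector via hashing."""
    bucket = hash(feature_value) % NUM_BUCKETS
    return embedding_table[bucket]

def embed_sample(active_features: list) -> np.ndarray:
    """Represent a sample as the mean of its active features' vectors."""
    return np.mean([embed(f) for f in active_features], axis=0)

vec = embed_sample(["user_id=12345", "item_tag=sports", "city=beijing"])
```

In a real model the table would be trained end to end; hash collisions trade a small amount of accuracy for a fixed memory footprint.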

4. How to ensure consistency between offline training results and online performance? He recommends a systematic evaluation framework that includes offline replay, interleaving tests, and progressive AB testing, acknowledging that perfect offline‑online alignment is impossible due to data bias.
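The "offline replay" stage of that framework can be sketched as follows: feed logged impressions to the candidate model in time order and score its predictions against the logged outcomes, optionally letting the model update as it goes. The model class and field names here are toy assumptions for illustration.

```python
# Sketch of offline replay evaluation over a time-ordered interaction log.
def replay_evaluate(model, logged_impressions):
    """logged_impressions: iterable of (features, clicked) pairs in time order."""
    correct = total = 0
    for features, clicked in logged_impressions:
        pred = model.predict(features) >= 0.5  # predicted click
        correct += (pred == clicked)
        total += 1
        model.partial_fit(features, clicked)   # optional online update
    return correct / total

class ThresholdModel:
    """Toy stand-in: reads a precomputed score straight from the features."""
    def predict(self, features):
        return features["score"]
    def partial_fit(self, features, clicked):
        pass  # a real model would update its weights here

log = [({"score": 0.9}, 1), ({"score": 0.2}, 0), ({"score": 0.7}, 1)]
acc = replay_evaluate(ThresholdModel(), log)  # → 1.0 on this toy log
```

Replay respects time ordering, which offline random splits do not, but it still cannot remove the logging policy's exposure bias — hence the later interleaving and AB stages.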

5. What is the role of knowledge graphs in recommendation? Knowledge graphs, powered by graph embeddings and GCNs, complement user‑behavior data, aid cold‑start scenarios, and serve as an effective additional feature source.
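The graph-embedding idea can be illustrated with the first stage of a DeepWalk-style pipeline: sample random walks over the knowledge graph, then feed the walks to a skip‑gram model (not shown) to produce node embeddings. The tiny graph and names below are made-up assumptions.

```python
import random

# Sketch: sampling random walks over a toy knowledge graph.
# The walks serve as "sentences" for a downstream skip-gram embedding model.
graph = {
    "movie_A": ["director_X", "genre_scifi"],
    "director_X": ["movie_A", "movie_B"],
    "movie_B": ["director_X", "genre_scifi"],
    "genre_scifi": ["movie_A", "movie_B"],
}

def random_walk(start, length, rng=random.Random(0)):
    """Sample a walk of `length` nodes starting at `start`."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(rng.choice(graph[walk[-1]]))
    return walk

walks = [random_walk(node, 5) for node in graph]
```

Because the graph links items through entities like directors and genres, the resulting embeddings place related items near each other even with no shared user behavior — which is exactly what helps in cold start.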

6. What to do when offline AUC improves but online metrics drop? He advises against relying solely on AUC: consider model over‑fitting, data bias, and architecture mismatches, and investigate root causes rather than stopping at the metric.
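One reason AUC alone can mislead is what it actually measures: the probability that a random positive is ranked above a random negative. It says nothing about calibration or about how the serving system uses the scores. A minimal pairwise implementation makes this concrete:

```python
# Sketch: AUC as the probability a random positive outranks a random negative.
def auc(labels, scores):
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    pairs = wins = 0
    for p in pos:
        for n in neg:
            pairs += 1
            if p > n:
                wins += 1
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / pairs

result = auc([1, 0, 1, 0], [0.9, 0.3, 0.8, 0.4])  # → 1.0: perfect ranking
```

Note that scaling or shifting every score leaves AUC unchanged, so a model can have a higher AUC yet worse-calibrated probabilities — a plausible culprit when downstream logic (e.g., bid or blend calculations) consumes raw scores.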

7. How important is recommendation interpretability and what methods exist? While not an expert, he notes that providing reasons for recommendations can boost CTR, and that model‑level and result‑level interpretability are distinct challenges.

8. How to vectorize billions of dynamic IDs in feed streams? Use fast‑updating embeddings, cold‑start strategies such as averaging similar items, and enrich IDs with contextual features like titles, authors, timestamps, and tags.
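The "averaging similar items" cold-start strategy can be sketched directly: a new ID with no trained embedding borrows the mean vector of items that share a tag, author, or other context feature. The table and names below are illustrative assumptions.

```python
import numpy as np

# Sketch: cold-start initialization for a new item ID by averaging the
# embeddings of similar (e.g., same-tag or same-author) items.
rng = np.random.default_rng(0)
item_embeddings = {f"item_{i}": rng.normal(size=4) for i in range(5)}

def cold_start_embedding(similar_item_ids, dim=4):
    """Initialize a new item's vector as the mean of its neighbors' vectors."""
    vecs = [item_embeddings[i] for i in similar_item_ids if i in item_embeddings]
    if not vecs:
        return np.zeros(dim)  # no known neighbors: neutral starting point
    return np.mean(vecs, axis=0)

item_embeddings["item_new"] = cold_start_embedding(["item_1", "item_3"])
```

The borrowed vector is only a starting point; once the new ID accumulates interactions, normal training overwrites it, which is why fast-updating embedding infrastructure matters for feed streams.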

9. Are Graph Neural Networks (GNNs) more effective than classic neural networks for recommendation? GNNs excel when data naturally forms graphs (e.g., social relations), but their advantage depends on data characteristics rather than being universally superior.
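The core mechanism that gives GNNs their edge on graph-shaped data is message passing: each node's representation mixes its own embedding with an aggregate of its neighbors'. A minimal one-step sketch, with toy sizes and a made-up mixing weight:

```python
import numpy as np

# Sketch: one mean-aggregation message-passing step, the building block of
# GNN-based recommenders (e.g., over a user-item or social graph).
def message_passing_step(embeddings, adjacency, alpha=0.5):
    """embeddings: (n, d) array; adjacency: dict node -> list of neighbor indices."""
    new = embeddings.copy()
    for node, neighbors in adjacency.items():
        if neighbors:
            neighbor_mean = embeddings[neighbors].mean(axis=0)
            new[node] = alpha * embeddings[node] + (1 - alpha) * neighbor_mean
    return new

emb = np.eye(3)                     # 3 nodes, one-hot starting vectors
adj = {0: [1], 1: [0, 2], 2: [1]}   # a simple path graph 0-1-2
emb1 = message_passing_step(emb, adj)
```

After one step, each node already encodes its one-hop neighborhood; stacking steps propagates information further. On data without meaningful graph structure this aggregation adds little, which matches the point that GNN advantage depends on the data.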

10. What proportion of deep vs. non‑deep models should one study? Start with classic models, then gradually incorporate deep models to build a comprehensive knowledge base.

11. How to evaluate a newly improved recommendation model? Follow a layered evaluation pipeline: offline metrics, replay, interleaving, and finally online AB testing.

12. How to balance work with personal knowledge accumulation (e.g., blogging, writing books)? Allocate a fixed daily time slot (e.g., 10 pm–12 am) for writing and reflection.

13. Advice for students to improve engineering skills? Pursue internships, lab projects, or self‑initiated projects (e.g., building a news‑recommendation system) to demonstrate practical ability alongside academic research.

14. Recommended entry‑level recommendation framework for an e‑commerce site? Begin with collaborative filtering and simple vector‑dot‑product ranking, then iterate on engineering challenges as scale grows.
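The "simple vector‑dot‑product ranking" starting point can be sketched in a few lines: represent users and items as vectors and rank candidates by inner product. The random vectors below are stand-ins; in practice they would come from matrix factorization or a two‑tower model.

```python
import numpy as np

# Sketch: dot-product ranking over candidate items for one user.
rng = np.random.default_rng(1)
DIM = 16
item_vectors = {f"item_{i}": rng.normal(size=DIM) for i in range(100)}

def rank_items(user_vector, candidates, top_k=5):
    """Score each candidate by dot product and return the top-k item IDs."""
    scored = [(item_id, float(user_vector @ item_vectors[item_id]))
              for item_id in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _ in scored[:top_k]]

user_vec = rng.normal(size=DIM)
top = rank_items(user_vec, list(item_vectors), top_k=5)
```

At small scale a brute-force scan over all items is fine; the engineering challenges Wang mentions (approximate nearest-neighbor retrieval, feature pipelines) only arrive as the catalog grows.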

15. Why are most deep recommendation advances driven by industry rather than academia? Large companies have the data scale and online testing infrastructure needed for breakthroughs, but academic research still provides foundational ideas and novel concepts.

At the end, Wang thanks readers and promotes his new book “Deep Learning Recommendation Systems”.

Tags: deep learning, recommendation systems, model evaluation, reinforcement learning, knowledge graph, graph neural network, sparse features
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
