Applying NLP and Machine Learning to Classify Tubi User Feedback
This article explains how Tubi leverages natural‑language processing, sentence embeddings (USE and BERT), and LightGBM models to automatically categorize large volumes of Net Promoter Score comments and customer‑support tickets, enabling data‑driven product decisions and workflow automation.
Tubi receives thousands of user comments daily from Net Promoter Score (NPS) surveys and customer‑support tickets. These comments contain valuable insights, but their volume makes manual review impractical. To extract and act on these insights, Tubi first groups feedback into high‑level topics using machine learning and then tracks how the volume of each topic changes over time.
Two primary use cases are described: (1) NPS comments, where users rate their likelihood to recommend Tubi and provide free‑form reasons, and (2) customer‑support tickets submitted via email or Facebook. For each use case, Tubi defines a set of categories (e.g., Content, App, Ads, Feature/Bug, Request, Error, Inquiry, Complaint, Question) and builds a classification model to assign incoming texts to these categories.
The methodology starts with creating text embeddings. Sentences are transformed into high‑dimensional vectors using pre‑trained encoders such as the Universal Sentence Encoder (USE) and BERT. These embeddings capture semantic similarity: sentences with similar meanings map to nearby vectors even when they share few words, so the system can compare comments by meaning rather than by exact wording.
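To make the idea of "nearby vectors" concrete, here is a minimal sketch of comparing embeddings with cosine similarity. The three vectors are hand-made toy values standing in for real encoder output, and the function name `cosine_similarity_matrix` is an illustrative choice, not something taken from Tubi's pipeline.

```python
import numpy as np


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Return the pairwise cosine similarity between row vectors."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T


# Toy 4-dimensional vectors (assumed values, not real encoder output).
# A real sentence encoder would produce one much longer vector per comment.
toy_embeddings = np.array([
    [0.9, 0.1, 0.0, 0.1],  # "Too many ads"
    [0.8, 0.2, 0.1, 0.0],  # "The commercials are excessive"
    [0.0, 0.1, 0.9, 0.3],  # "App crashes on my TV"
])

sims = cosine_similarity_matrix(toy_embeddings)
print(np.round(sims, 3))
```

In this toy setup the two ad-related comments score close to 1 against each other and close to 0 against the crash report, which is the behavior one would hope a sentence encoder provides for paraphrases.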
After embedding, a supervised classifier is trained on a manually labeled dataset. Tubi chose LightGBM for its efficiency and strong performance on their data, though other algorithms (logistic regression, random forest, Naïve Bayes) were also evaluated.
Evaluation focused on ROC‑AUC, accuracy, and F1 score. USE‑based models consistently outperformed BERT for the custom text‑classification task, delivering higher scores across all three metrics.
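As a small worked example of the three metrics, the sketch below scores six hand-made predictions with scikit-learn. The labels, scores, and the 0.5 decision threshold are assumptions chosen for illustration, not figures from Tubi's evaluation.

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Toy ground truth and predicted probabilities for one binary category.
y_true = [0, 0, 1, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]

# Accuracy and F1 need hard labels; 0.5 is an assumed threshold.
y_pred = [int(s >= 0.5) for s in y_score]

auc = roc_auc_score(y_true, y_score)  # ranking quality, threshold-free
acc = accuracy_score(y_true, y_pred)  # share of correct hard predictions
f1 = f1_score(y_true, y_pred)         # harmonic mean of precision and recall

# For this toy data: ROC-AUC = 8/9, accuracy = 5/6, F1 = 0.8.
print(f"ROC-AUC={auc:.3f} accuracy={acc:.3f} F1={f1:.3f}")
```

ROC-AUC is computed from the raw scores, while accuracy and F1 depend on the chosen threshold, which is one reason to report all three together.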
In production, the pipeline encodes new comments, feeds the embeddings to the LightGBM model, and predicts a category for each comment. The predictions are stored and surfaced in a dashboard that visualizes volume and trend changes per category, guiding product‑management decisions.
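The flow of encoding, predicting, and aggregating for a dashboard can be sketched as follows. Everything here is hypothetical: `daily_category_counts`, `toy_encode`, and `toy_predict` are placeholder names, and the keyword-counting encoder and rule-based predictor only stand in for a real sentence encoder and a trained model.

```python
from collections import Counter
from datetime import date
from typing import Callable, Iterable, Sequence, Tuple

Vector = Sequence[float]


def daily_category_counts(
    comments: Iterable[Tuple[date, str]],
    encode: Callable[[str], Vector],
    predict: Callable[[Vector], str],
) -> Counter:
    """Count predicted categories per day, the raw input for a trend chart."""
    counts: Counter = Counter()
    for day, text in comments:
        category = predict(encode(text))
        counts[(day, category)] += 1
    return counts


def toy_encode(text: str) -> Vector:
    # Placeholder "embedding": keyword counts instead of a real encoder.
    lowered = text.lower()
    return [float(lowered.count("ads")), float(lowered.count("crash"))]


def toy_predict(vector: Vector) -> str:
    # Placeholder "model": a fixed rule instead of a trained classifier.
    return "Ads" if vector[0] > vector[1] else "App"


comments = [
    (date(2020, 1, 1), "Too many ads"),
    (date(2020, 1, 1), "ads repeat constantly"),
    (date(2020, 1, 2), "The app crashes on launch"),
]

trend = daily_category_counts(comments, toy_encode, toy_predict)
for (day, category), n in sorted(trend.items()):
    print(day.isoformat(), category, n)
```

Passing the encoder and the model in as callables keeps the aggregation logic independent of which embedding or classifier is plugged in.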
The article concludes with a roadmap to explore more advanced techniques such as multi‑class models, deep neural networks, and other pre‑trained language‑model embeddings (e.g., BERT‑large, XLNet, ELMo), as well as additional applications like sentiment analysis and keyword extraction.
Bitu Technology
Bitu Technology is the registered company of Tubi's China team. We are engineers passionate about leveraging advanced technology to improve lives, and we hope to use this channel to connect and advance together.