NLP-based Text Opinion Extraction and Sentiment Analysis for iQIYI Video Comments
iQIYI's NLP pipeline combines CRF-based word segmentation, bidirectional LSTM/GRU models with attention, and a CNN classifier to extract opinion targets, opinion words, and sentiment polarity from unstructured video comments. It aggregates these results across users to reveal collective attitudes toward actors, plot, and visual effects. Planned work covers implicit opinions and broader sentiment domains.
User-generated text is a crucial component of public opinion data. Natural Language Processing (NLP) techniques help extract useful information from that text and make it possible to understand users' viewpoints, emotions, and needs. This document introduces iQIYI's technical exploration and practice in text opinion mining and sentiment analysis, using comments on TV series as the working example.
Background
As a technology‑driven entertainment company, iQIYI aims to provide rich, high‑quality, intelligent services. Analyzing user opinions expressed after watching videos is essential for understanding user preferences. Comments may cover program content, actors, or product feedback. While opinion data can be multi‑modal (text, image, audio), this work focuses on textual comments and explores NLP‑based opinion mining and sentiment analysis.
Examples are drawn from user comments on the drama "You and My Time in the City" (《你和我的倾城时光》), illustrating the concrete analysis process.
Functionality
iQIYI possesses massive video resources, generating abundant bullet comments, episode remarks, and bubble‑chat comments. Each comment is treated as a basic unit for opinion analysis. Although comments are unstructured and informal, NLP pipelines transform them into structured information, extracting opinion targets, opinion words, and sentiment polarity.
For a single‑sentence comment such as “颖宝的演技一直都有进步!期待你和我的倾城时光”, the system can derive:
Overall sentiment polarity: positive.
Opinion targets: “颖宝的演技” and the drama title.
Opinion words: “有进步”, “期待”.
Sentiment toward each target: positive.
Classification of targets into predefined categories (e.g., actor, overall evaluation).
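The structured result listed above can be pictured as a small record per comment. The sketch below is illustrative only: the class and field names are hypothetical, and the pairing of each opinion word with a target and a category is an assumption read off the example, not a published iQIYI schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Opinion:
    target: str        # opinion target span, e.g. an actor's performance
    opinion_word: str  # evaluative expression linked to the target
    polarity: str      # "positive", "neutral", or "negative"
    category: str      # predefined category (names here are assumed)


@dataclass
class CommentAnalysis:
    text: str
    overall_polarity: str
    opinions: List[Opinion] = field(default_factory=list)


# Hypothetical structured output for the example comment.
example = CommentAnalysis(
    text="颖宝的演技一直都有进步!期待你和我的倾城时光",
    overall_polarity="positive",
    opinions=[
        Opinion("颖宝的演技", "有进步", "positive", "actor"),
        Opinion("你和我的倾城时光", "期待", "positive", "overall evaluation"),
    ],
)

for op in example.opinions:
    print(f"{op.target} | {op.opinion_word} | {op.polarity} | {op.category}")
```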
Beyond single‑sentence analysis, the platform aggregates opinions across the user base to reveal collective attitudes toward specific aspects such as actors, plot, or visual effects. Figures illustrate daily sentiment distribution and overall viewpoint classification.
Algorithm and Process
The workflow relies on lexical analysis, opinion extraction, relation extraction, sentiment analysis, and text classification. Lexical analysis, powered by a CRF‑based word segmentation service, provides the foundation for downstream tasks.
1) Opinion Extraction
Opinion targets (the entities being evaluated) and opinion words (the evaluative expressions) are extracted using sequence labeling. A bidirectional LSTM‑CRF model, trained on manually annotated data, achieves strong performance.
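A sequence-labeling model outputs one tag per token, and the tags are then merged into spans. The following sketch shows only that decoding step. It assumes a BIO tagging scheme with the labels TARGET and OPINION, and a hand-written segmentation and tag sequence; neither the scheme nor the labels are confirmed by the source, and no trained model is involved.

```python
from typing import List, Tuple


def decode_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO tags into (label, text) spans.

    An I- tag that does not continue a span of the same label is dropped.
    """
    spans: List[Tuple[str, str]] = []
    label, parts = None, []
    for token, tag in zip(tokens, tags):
        prefix, _, tag_label = tag.partition("-")
        if prefix == "I" and tag_label == label:
            parts.append(token)  # continue the open span
            continue
        if label is not None:
            spans.append((label, "".join(parts)))  # close the open span
        # Open a new span only on a B- tag; O and stray I- tags open nothing.
        label, parts = (tag_label, [token]) if prefix == "B" else (None, [])
    if label is not None:
        spans.append((label, "".join(parts)))
    return spans


# Hypothetical segmentation and tags for the first clause of the example.
tokens = ["颖宝", "的", "演技", "一直", "都", "有", "进步"]
tags = ["B-TARGET", "I-TARGET", "I-TARGET", "O", "O", "B-OPINION", "I-OPINION"]
spans = decode_spans(tokens, tags)
print(spans)
```

Tokens are joined without spaces because the example is Chinese; a language with whitespace-delimited words would need a different join.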
Relation extraction then determines which opinion word refers to which target. A bidirectional GRU model with an attention-based classification layer handles one-to-one and many-to-many relationships and is robust to noisy annotations.
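One common way to frame relation extraction is to enumerate every (target, opinion word) candidate pair and let a classifier accept or reject each one. The sketch below shows that framing only. The `same_clause` rule is a deliberately simple stand-in for the trained model, and all function names are hypothetical.

```python
from itertools import product
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start, end, text), end exclusive

CLAUSE_BREAKS = set("!!。?,,;;")


def span(text: str, piece: str) -> Span:
    """Locate the first occurrence of `piece` in `text`."""
    start = text.index(piece)
    return (start, start + len(piece), piece)


def same_clause(text: str, a: Span, b: Span) -> bool:
    """Stand-in classifier: related if no clause punctuation lies between."""
    left, right = sorted((a, b), key=lambda s: s[0])
    return not any(ch in CLAUSE_BREAKS for ch in text[left[1]:right[0]])


def link_pairs(
    text: str,
    targets: List[Span],
    opinions: List[Span],
    is_related: Callable[[str, Span, Span], bool],
) -> List[Tuple[str, str]]:
    """Score every candidate pair; many-to-many links fall out naturally."""
    return [
        (t[2], o[2])
        for t, o in product(targets, opinions)
        if is_related(text, t, o)
    ]


text = "颖宝的演技一直都有进步!期待你和我的倾城时光"
targets = [span(text, "颖宝的演技"), span(text, "你和我的倾城时光")]
opinions = [span(text, "有进步"), span(text, "期待")]
pairs = link_pairs(text, targets, opinions, same_clause)
print(pairs)
```

Because each pair is judged independently, one target can link to several opinion words and vice versa. In a real system the `is_related` callable would wrap the learned classifier rather than a punctuation rule.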
2) Sentiment Analysis
Sentiment is categorized into positive, neutral, and negative. Both sentence‑level sentiment and fine‑grained sentiment toward specific targets are predicted using bidirectional LSTM models enhanced with attention or gating mechanisms.
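The attention step can be illustrated in isolation. The sketch below assumes plain dot-product attention over per-token hidden states, with a query vector standing in for the target representation. The source does not specify the attention form, so this is one plausible variant rather than the production model, and the numbers are toy values.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden: List[List[float]], query: List[float]
) -> Tuple[List[float], List[float]]:
    """Weight each hidden state by its dot product with the query."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden]
    weights = softmax(scores)
    dim = len(hidden[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden)) for d in range(dim)
    ]
    return context, weights


# Toy hidden states for two tokens; in practice these would come from
# the bidirectional LSTM.
hidden = [[1.0, 0.0], [0.0, 1.0]]
query = [2.0, 0.0]  # assumed target representation
context, weights = attention_pool(hidden, query)
print(weights, context)
```

The pooled `context` vector would then feed a three-way classifier over positive, neutral, and negative. Using a different query per target is what lets the same sentence receive different polarities for different targets.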
3) Opinion Aggregation
Aggregated viewpoints are obtained by feeding sentence‑level results into a CNN‑based classification model that summarizes opinions across predefined dimensions (e.g., actor, plot, visual effects).
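Once each comment has been assigned a dimension and a polarity, the collective view is a tally. The sketch below assumes the classifier's output is already available as (dimension, polarity) pairs. The input rows are made-up toy values, not iQIYI data, and the function name is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def aggregate(
    results: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, float]]:
    """Return the share of each polarity within each dimension."""
    table: Dict[str, Counter] = defaultdict(Counter)
    for dimension, polarity in results:
        table[dimension][polarity] += 1
    return {
        dim: {pol: n / sum(counts.values()) for pol, n in counts.items()}
        for dim, counts in table.items()
    }


# Toy classifier outputs, one row per comment.
rows = [
    ("actor", "positive"),
    ("actor", "positive"),
    ("actor", "negative"),
    ("plot", "neutral"),
]
shares = aggregate(rows)
print(shares)
```

Adding a date to each row and grouping on it as well would yield a daily sentiment distribution per dimension.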
Conclusion and Future Work
The case study demonstrates how deep‑learning‑driven NLP pipelines can extract and aggregate user opinions and emotions from large‑scale video comments. While the current system handles explicit expressions effectively, future efforts will focus on capturing implicit opinions, handling diverse linguistic styles, and extending the framework to product and artist sentiment analysis.
iQIYI Technical Product Team