Bilibili Personal Attack Content Governance: Background, Goals, Methods, and Effectiveness
Bilibili combats personal-attack and trolling comments by combining sector-specific keyword knowledge bases, user-group analysis, variant-aware word matching (including pinyin and homophone detection), and multiple NLP and graph models. Between June and December 2023, the proportion of personal-attack reports in the entertainment, film, and gaming sectors fell by about 32%, and trolling reports fell by roughly 25%.
Governance Background and Objectives
Bilibili, as a comprehensive video community, faces challenges from personal attacks, insults, and defamatory comments that can damage the community atmosphere. The goal is to reduce exposure of such negative content, promote positive interaction, and create a friendly environment.
Current Situation of Personal Attack Content
Analysis of June 2022 reports shows that the main sources of negative comments are "trolling" and personal attacks. The set of short abusive terms is limited, but their variants (homophones, initial-letter abbreviations, inserted special characters or emojis) make detection difficult.
Sector‑Specific Issues
Eight sectors (life, entertainment, film, knowledge, technology, sports, games, music) were examined. Entertainment, film, and game sectors have the highest volume and concentration of personal‑attack reports, making them priority targets.
Specialized Governance Process
The process focuses on personal attacks (while also considering trolling). It includes:
Word‑matching identification using techniques such as pinyin recognition, numeric homophone detection, character similarity, and transformation mapping.
Model‑based recognition employing various NLP models (FastText, DPCNN, TextRCNN, Attention, BERT, tiny_bert) and graph‑based methods (GCN/GAT) to capture nuanced abusive language.
Character-similarity detection, for example, uses the "phonetic-shape code" concept: each character is converted into a machine-readable numeric string, so that similar-sounding or similarly shaped characters can be recognized as variants of one another.
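The word-matching techniques above can be sketched roughly as follows. This is a minimal illustration, not Bilibili's implementation: the variant map, the tiny character table, the weights, and the function names (normalize, char_similarity, fuzzy_contains) are all hypothetical, and the (pinyin, structure) tuple is a simplified stand-in for a real phonetic-shape code, which the source describes as a numeric string.

```python
import re

# Illustrative variant map (hypothetical entries, not Bilibili's lexicon):
# pinyin, initial-letter, and numeric-homophone spellings -> canonical term.
VARIANT_MAP = {
    "bendan": "笨蛋",  # full pinyin
    "bd": "笨蛋",      # initial letters
    "748": "去死吧",   # digits read aloud for their sound
}

# Simplified stand-in for a phonetic-shape code: (toneless pinyin, structure).
SOUND_SHAPE = {
    "笨": ("ben", "top-bottom"),
    "苯": ("ben", "top-bottom"),
    "本": ("ben", "single"),
    "蛋": ("dan", "top-bottom"),
    "旦": ("dan", "top-bottom"),
}


def normalize(text: str) -> str:
    """Lowercase, drop symbols/emoji/spaces, then map known variants."""
    cleaned = re.sub(r"\W+", "", text.lower())
    # Replace longer variants first so short ones cannot split them.
    for variant in sorted(VARIANT_MAP, key=len, reverse=True):
        cleaned = cleaned.replace(variant, VARIANT_MAP[variant])
    return cleaned


def char_similarity(a: str, b: str, sound_weight: float = 0.6) -> float:
    """Score two characters in [0, 1] by shared sound and shared structure."""
    if a == b:
        return 1.0
    code_a, code_b = SOUND_SHAPE.get(a), SOUND_SHAPE.get(b)
    if code_a is None or code_b is None:
        return 0.0
    sound = 1.0 if code_a[0] == code_b[0] else 0.0
    shape = 1.0 if code_a[1] == code_b[1] else 0.0
    return sound_weight * sound + (1.0 - sound_weight) * shape


def fuzzy_contains(text: str, keyword: str, threshold: float = 0.9) -> bool:
    """Slide a window over text and compare it to keyword character by character."""
    n = len(keyword)
    for i in range(len(text) - n + 1):
        window = text[i:i + n]
        score = sum(char_similarity(x, y) for x, y in zip(window, keyword)) / n
        if score >= threshold:
            return True
    return False
```

For instance, normalize("你是 b*d 吗") collapses the inserted symbol and the initial-letter spelling before matching, and fuzzy_contains("你真是苯旦", "笨蛋") catches a same-sound, same-structure substitution. Plain substring replacement like this will also hit innocent strings that happen to contain a variant (a product name containing "bd", say), which is one reason the source pairs word matching with model-based recognition.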
Key Strategies
The strategy is built on three dimensions:
Keyword dimension: A knowledge base linking entities to abusive keywords is maintained per sector, allowing precise matching of comments to risky entities.
User-group dimension: Users are categorized by the target of their attacks (e.g., non-real entities like celebrities or games vs. platform users and UPs, i.e. uploaders). High-risk user groups are identified through interaction patterns and relationships.
Content & UP dimension: High‑risk videos or UPs are flagged, and multi‑model fusion (linear regression, logistic regression, tree models) is used to improve recall of abusive content.
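The keyword and fusion ideas above might look something like the following sketch. Everything named here is an assumption made for illustration: the knowledge-base entries, the entity names (GameX, SingerY), the signal names, and the hand-set weights. The source mentions fusing linear regression, logistic regression, and tree models; this sketch shows only a logistic-style weighted combination of per-dimension scores, with weights that would in practice be learned rather than fixed.

```python
import math

# Hypothetical per-sector knowledge base: entity -> abusive keywords tied to it.
KNOWLEDGE_BASE = {
    "game": {"GameX": {"垃圾游戏", "脑残玩家"}},
    "entertainment": {"SingerY": {"假唱"}},
}

# Hypothetical fusion parameters (hand-set here, learned in a real system).
FUSION_WEIGHTS = {"keyword": 2.0, "text_model": 3.0, "user_risk": 1.5}
FUSION_BIAS = -3.0


def match_entities(sector: str, comment: str) -> list[tuple[str, str]]:
    """Return (entity, keyword) pairs where the entity and its keyword co-occur."""
    hits = []
    lowered = comment.lower()
    for entity, keywords in KNOWLEDGE_BASE.get(sector, {}).items():
        if entity.lower() not in lowered:
            continue  # keyword alone is not enough without the risky entity
        for keyword in sorted(keywords):
            if keyword in comment:
                hits.append((entity, keyword))
    return hits


def fuse(signals: dict[str, float]) -> float:
    """Combine per-dimension scores in [0, 1] into one probability-like score."""
    z = FUSION_BIAS + sum(
        weight * signals.get(name, 0.0) for name, weight in FUSION_WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def score_comment(sector: str, comment: str, text_model: float, user_risk: float) -> float:
    """Score one comment from the keyword hit, a model score, and a user-risk score."""
    keyword_signal = 1.0 if match_entities(sector, comment) else 0.0
    return fuse(
        {"keyword": keyword_signal, "text_model": text_model, "user_risk": user_risk}
    )
```

Requiring the entity and the keyword to appear together is what makes the match sector-specific: the same phrase in a comment that does not mention a tracked entity contributes no keyword signal and has to be caught by the other dimensions.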
Governance Outcomes
After implementation, the proportion of personal‑attack reports in the entertainment, film, and game sectors dropped by 31.97% (from June to December 2023), and trolling reports decreased by 24.77%.
Summary and Outlook
While the reduction trend is evident, further improvements are needed, such as optimizing content ranking, automating negative‑keyword extraction, refining reporting workflows, and accelerating short‑cycle model training and deployment.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.