Design and Implementation of Bilibili's New Customer Service System
This article details Bilibili's transition from a purchased customer‑service platform to a self‑developed system. It covers the background, the architectural design, and the core modules (intelligent QA, seat dispatch, the agent workbench, and permission management), along with the use of Faiss for vector search and early experiments with large language models. Throughout, it highlights the technical challenges and solutions across backend development and AI integration.
Background
Bilibili's previous outsourced customer‑service system suffered from low stability, poor scalability, high bug frequency, and inability to integrate with internal products, prompting the decision to build a new in‑house solution.
Overall Architecture and Core Business Flow
The new system consists of a C‑side entry, intelligent QA, seat dispatch, workbench, knowledge base, IM capabilities, ticketing, and permission management, all tightly integrated with internal business services.
Intelligent QA Design
The QA module provides 24/7 automated responses, reducing wait times and handling frequent user queries. Two answer strategies are used: direct answer for high‑confidence matches and a "possible questions" list for medium confidence.
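The two strategies can be pictured as a pair of score thresholds. The sketch below is illustrative only: the threshold values, the `Reply` type, the `choose_reply` function, and the fallback branch for low-confidence matches are assumptions, not details published in the article.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical thresholds; the article does not publish the real values.
HIGH_CONFIDENCE = 0.90
MEDIUM_CONFIDENCE = 0.70


@dataclass
class Reply:
    kind: str          # "direct", "suggestions", or "fallback"
    items: List[str]   # one answer, several candidate questions, or empty


def choose_reply(matches: List[Tuple[str, str, float]],
                 max_suggestions: int = 3) -> Reply:
    """Pick a reply strategy from (question, answer, score) matches."""
    ranked = sorted(matches, key=lambda m: m[2], reverse=True)
    # High confidence: answer the best match directly.
    if ranked and ranked[0][2] >= HIGH_CONFIDENCE:
        return Reply("direct", [ranked[0][1]])
    # Medium confidence: offer a "possible questions" list.
    candidates = [q for q, _, score in ranked if score >= MEDIUM_CONFIDENCE]
    if candidates:
        return Reply("suggestions", candidates[:max_suggestions])
    # Assumed behaviour: nothing matched well enough to show.
    return Reply("fallback", [])
```

In this sketch a caller would render a `direct` reply as the answer text and a `suggestions` reply as clickable questions; what happens on `fallback` (for example, handing over to a human agent) is left open.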
Retrieval‑Based vs. Generation‑Based Models
After evaluating retrieval‑based and generation‑based approaches, the team chose a retrieval‑based solution for higher accuracy in the e‑commerce support scenario.
Vector Search with Faiss
Faiss is employed to perform efficient similarity search on dense vectors generated by a BERT‑based embedding service. The workflow includes data preparation, text vectorization, similarity calculation, matching, and answer selection.
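To make the five workflow steps concrete, here is a minimal, dependency-free sketch. It deliberately swaps in stand-ins: `embed` is a toy word-count vectorizer in place of the BERT‑based embedding service, and `search` is an exhaustive inner-product scan in place of a Faiss index. The knowledge-base entries and all function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def embed(text: str, vocab: List[str]) -> List[float]:
    """Toy stand-in for the embedding service: L2-normalised word counts."""
    words = text.lower().split()
    vec = [float(words.count(term)) for term in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def search(query_vec: List[float], index: Dict[str, List[float]],
           k: int = 3) -> List[Tuple[str, float]]:
    """Exhaustive inner-product search; on unit vectors this is cosine similarity."""
    scored = [(q, sum(a * b for a, b in zip(query_vec, vec)))
              for q, vec in index.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# 1. Data preparation: hypothetical knowledge-base entries (question -> answer).
knowledge_base = {
    "how do i reset my password": "Open Account Settings and choose Reset Password.",
    "how do i cancel my membership": "Go to Membership and select Cancel Renewal.",
}
vocab = sorted({w for q in knowledge_base for w in q.split()})

# 2. Text vectorization.
index = {q: embed(q, vocab) for q in knowledge_base}

# 3-4. Similarity calculation and matching.
hits = search(embed("reset my password", vocab), index, k=1)

# 5. Answer selection.
best_question, best_score = hits[0]
answer = knowledge_base[best_question]
```

An exhaustive scan like this compares the query against every stored vector, which is the cost an approximate index is meant to avoid on large knowledge bases.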
Key Faiss steps:
    if len(x) < 1000000:
        ivfK = findIVFK(len(x))
    else:
        ivfK = 65536
    factory_str = f'IVF{ivfK}_HNSW32,Flat'
Index selection adapts to dataset size and memory constraints. The helper function determines an appropriate IVF parameter:
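    import math

    def findIVFK(N: int) -> int:
        """Pick a power-of-two IVF cluster count for N vectors."""
        sqrtN = math.sqrt(N)
        print(sqrtN, 4 * sqrtN, 16 * sqrtN)  # debug: sqrt(N) and the target bounds
        i = 2
        while True:
            i *= 2
            # Accept i when it lies in [4*sqrt(N), 16*sqrt(N)] and [N // 256, N // 30].
            if 4 * sqrtN <= i <= 16 * sqrtN and N // 256 <= i <= N // 30:
                return i
            # No power of two up to 8192 satisfied both ranges: fall back to 512.
            if i > 4096:
                return 512
Seat Dispatch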
The dispatch logic balances load based on each agent's saturation limit, using strategies such as equal distribution, familiar‑customer priority, and last‑service priority. Redis Zset commands (ZADD, ZRANK, ZREM, ZRANGE, ZPOPMIN, ZCOUNT) support queuing and automatic call‑in.
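The following is a rough in-memory sketch of saturation-aware dispatch with a waiting queue. It is not the production design: a Python heap ordered by arrival stands in for the Redis sorted set, the `Dispatcher` class and its method names are hypothetical, and only two of the strategies (last‑service priority, then equal distribution) are modelled.

```python
import heapq
import itertools
from typing import Dict, List, Optional, Tuple


class Dispatcher:
    """In-memory sketch; a heap of (arrival_order, user) stands in for a Redis Zset."""

    def __init__(self, saturation: Dict[str, int]) -> None:
        self.saturation = saturation                 # agent -> max concurrent chats
        self.load: Dict[str, int] = {a: 0 for a in saturation}
        self.last_agent: Dict[str, str] = {}         # user -> agent who served them last
        self.queue: List[Tuple[int, str]] = []       # waiting users, oldest first
        self._order = itertools.count()

    def _has_capacity(self, agent: str) -> bool:
        return self.load[agent] < self.saturation[agent]

    def _pick_agent(self, user: str) -> Optional[str]:
        # Last-service priority: reuse the previous agent when they have room.
        previous = self.last_agent.get(user)
        if previous is not None and self._has_capacity(previous):
            return previous
        # Otherwise equal distribution: least-loaded agent with spare capacity.
        free = [a for a in self.saturation if self._has_capacity(a)]
        return min(free, key=lambda a: (self.load[a], a)) if free else None

    def request(self, user: str) -> Optional[str]:
        """Assign an agent, or queue the user (analogous to ZADD) and return None."""
        agent = self._pick_agent(user)
        if agent is None:
            heapq.heappush(self.queue, (next(self._order), user))
            return None
        self.load[agent] += 1
        self.last_agent[user] = agent
        return agent

    def finish(self, agent: str) -> Optional[str]:
        """Free one slot, then call in the longest-waiting user (analogous to ZPOPMIN)."""
        self.load[agent] -= 1
        if not self.queue:
            return None
        _, user = heapq.heappop(self.queue)
        self.load[agent] += 1
        self.last_agent[user] = agent
        return user
```

In a Redis-backed version the arrival counter would typically become the Zset score (for example, an enqueue timestamp), so that ZRANK can report a user's position in the queue and ZREM can drop users who leave before being served.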
Workbench Technical Challenges
To avoid UI lag when handling many concurrent conversations, the system uses WebWorkers for cache updates, pre‑rendering, and synchronized fetching. Virtual lists and virtual scrolling keep DOM size manageable, while modular component design ensures extensibility for diverse message types.
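The arithmetic behind a virtual list is independent of the front-end framework, so it is sketched here in Python for consistency with the code above. The function name, the `overscan` parameter, and the assumption of a fixed row height are illustrative; real chat messages vary in height and would need measured offsets instead.

```python
from typing import Tuple


def visible_range(scroll_top: int, viewport_height: int, row_height: int,
                  total_rows: int, overscan: int = 3) -> Tuple[int, int]:
    """Return the half-open [start, end) range of rows worth rendering."""
    first = scroll_top // row_height
    # Ceiling division: the last row that is at least partly visible.
    last = -(-(scroll_top + viewport_height) // row_height)
    # Render a few extra rows on each side to hide scroll flicker.
    start = max(0, first - overscan)
    end = min(total_rows, last + overscan)
    return start, end
```

With a 600-pixel viewport and 40-pixel rows, only about 21 rows are mounted at a time regardless of whether the conversation holds a hundred messages or ten thousand, which is what keeps the DOM size bounded.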
Permission Management (RBAC)
A role‑based access control model links users, roles, and permissions, allowing flexible assignment of skill groups and ensuring secure operation of the service.
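A minimal sketch of the user, role, and permission links might look like the following. The role names, permission strings, and user IDs are hypothetical and chosen only to illustrate the lookup; the article does not describe the actual schema.

```python
from typing import Dict, Set

# Hypothetical roles, permissions, and users for illustration only.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "agent": {"conversation:reply", "ticket:create"},
    "supervisor": {"conversation:reply", "ticket:create",
                   "ticket:assign", "skill_group:edit"},
}
USER_ROLES: Dict[str, Set[str]] = {
    "user_1001": {"agent"},
    "user_1002": {"agent", "supervisor"},
}


def permissions_of(user: str) -> Set[str]:
    """Union of the permissions granted by every role the user holds."""
    granted: Set[str] = set()
    for role in USER_ROLES.get(user, set()):
        granted |= ROLE_PERMISSIONS.get(role, set())
    return granted


def is_allowed(user: str, permission: str) -> bool:
    """Check a single permission; unknown users get nothing."""
    return permission in permissions_of(user)
```

Because permissions attach to roles rather than to individual users, changing what a supervisor may do is a single edit to the role, and it takes effect for every user holding that role.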
Future Outlook – Large Language Models
The team experimented with LLMs fine‑tuned on real chat logs and knowledge‑base data, achieving more natural responses and better intent recognition, while noting challenges such as forced answers and occasional irrelevance.