OPPO Knowledge Graph: Algorithms, Applications, and Future Directions

This article presents OPPO's large‑scale knowledge graph, detailing the algorithmic challenges and solutions for entity classification, alignment, information extraction, and query parsing, and explains how these techniques power the XiaoBu assistant's knowledge‑based QA, search, and recommendation services while outlining future research directions.

DataFunTalk

OPPO's knowledge graph, built by the XiaoBu assistant team, is a self‑developed, large‑scale general graph containing hundreds of millions of entities and billions of triples, supporting millions of daily QA requests across devices such as phones, wearables, and IoT hardware.

The system architecture consists of three layers: a low‑level data processing platform with NebulaGraph as the graph database, a middle layer handling data acquisition, graph construction, and management, and a top layer offering applications like intelligent QA, search, content understanding, risk control, and health services.

Core algorithms include entity classification (rule‑based + pre‑trained language model pipeline), a two‑stage entity alignment approach (Dedupe grouping followed by a BERT‑based semantic matcher), and information extraction for both common attributes (using a CASREL pointer‑network model) and long‑tail attributes (using MRC models).
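The two-stage alignment idea (cheap grouping first, then a pairwise matcher inside each group) can be sketched as follows. This is a minimal illustration under stated assumptions, not OPPO's pipeline: the records and field names are hypothetical, the blocking key is a simple normalized name rather than Dedupe's learned grouping, and a token-overlap score stands in for the BERT-based semantic matcher.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records; the fields are illustrative, not OPPO's schema.
ENTITIES = [
    {"id": "a1", "name": "Liu Dehua", "desc": "Hong Kong actor and singer born 1961"},
    {"id": "b7", "name": "liu dehua", "desc": "Hong Kong singer and actor, born 1961"},
    {"id": "c3", "name": "Liu Dehua", "desc": "professor of mechanical engineering"},
    {"id": "d9", "name": "Zhang Xueyou", "desc": "Hong Kong singer"},
]


def blocking_key(entity):
    """Stage 1 key: lowercase the name and drop whitespace (assumed heuristic)."""
    return "".join(entity["name"].lower().split())


def group_candidates(entities):
    """Stage 1: put entities with the same key in one block; skip singleton blocks."""
    blocks = defaultdict(list)
    for entity in entities:
        blocks[blocking_key(entity)].append(entity)
    return [block for block in blocks.values() if len(block) > 1]


def match_score(a, b):
    """Stage 2 stand-in: Jaccard overlap of description tokens.

    A real system would score the pair with a trained semantic matcher instead.
    """
    tokens_a = set(re.findall(r"\w+", a["desc"].lower()))
    tokens_b = set(re.findall(r"\w+", b["desc"].lower()))
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def align(entities, threshold=0.6):
    """Return id pairs judged to refer to the same real-world entity."""
    pairs = []
    for block in group_candidates(entities):
        for a, b in combinations(block, 2):
            if match_score(a, b) >= threshold:
                pairs.append((a["id"], b["id"]))
    return pairs


print(align(ENTITIES))
```

Grouping first keeps the expensive matcher off the full cross product of entities: only pairs that share a block are ever scored, which is what makes alignment tractable at the scale of hundreds of millions of entities.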

In the XiaoBu assistant, knowledge‑based QA is divided into structured KBQA (handling chain, multi‑variable, entity‑relation, and comparative queries) and unstructured QA (leveraging large‑scale web data and MRC models). Query processing involves domain identification, intent classification, entity recognition, template‑based parsing, and entity linking. Entity linking itself runs in three steps: a BiLSTM‑CRF recognizer finds mentions, candidate retrieval proposes matching graph entities, and a disambiguation model selects the final one.
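As a rough illustration of template‑based parsing for single‑hop and chain queries, the sketch below matches a question against ordered templates, links the mention through an alias table, and walks the resulting relation path over a tiny in‑memory graph. Everything here is assumed for illustration: the templates, the `KG` and `ALIASES` tables, and the function names are hypothetical, and the dictionary lookup stands in for the recognizer, candidate retrieval, and disambiguation steps.

```python
import re

# Toy graph: (head entity, relation) -> tail entity. Illustrative data only.
KG = {
    ("Inception", "director"): "Christopher Nolan",
    ("Christopher Nolan", "spouse"): "Emma Thomas",
}

# Toy alias table standing in for the entity-linking stage.
ALIASES = {
    "inception": "Inception",
    "christopher nolan": "Christopher Nolan",
}

# Templates are tried in order, so the more specific chain pattern comes first.
# Each builder returns (mention, relation path from the mention outward).
TEMPLATES = [
    (re.compile(r"^who is the (\w+) of the (\w+) of (.+?)\??$", re.I),
     lambda m: (m.group(3), [m.group(2), m.group(1)])),
    (re.compile(r"^who is the (\w+) of (.+?)\??$", re.I),
     lambda m: (m.group(2), [m.group(1)])),
]


def parse(query):
    """Return (linked entity, relation path), or None if no template applies."""
    for pattern, build in TEMPLATES:
        match = pattern.match(query.strip())
        if match is None:
            continue
        mention, path = build(match)
        entity = ALIASES.get(mention.strip().lower())
        if entity is not None:
            return entity, [relation.lower() for relation in path]
    return None


def answer(query):
    """Follow the parsed relation path hop by hop through the toy graph."""
    parsed = parse(query)
    if parsed is None:
        return None
    node, path = parsed
    for relation in path:
        node = KG.get((node, relation))
        if node is None:
            return None
    return node


print(answer("Who is the spouse of the director of Inception?"))
```

A chain query is simply a relation path of length two or more, so the same traversal loop serves both the single‑hop and the chained template.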

To address challenges such as voice‑input aliasing, recognition errors, and long‑tail queries, the team employs alias mapping, phonetic features, template pruning, and dual‑tower vector retrieval with BERT embeddings.
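The dual‑tower retrieval pattern for long‑tail queries can be sketched as below: candidate questions are encoded once into an index, each incoming query is encoded separately, and the nearest candidates are returned by cosine similarity. This is only a sketch under loud assumptions: character trigram counts stand in for BERT embeddings, both towers happen to share the same toy encoder, and `VectorIndex` and the sample questions are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Sparse bag of character n-grams; a stand-in for a learned embedding."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def normalize(vec):
    """Scale a sparse vector to unit length so a dot product equals cosine."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else {}


def encode_query(text):
    """Query tower (toy). A real system would run a trained query encoder."""
    return normalize(char_ngrams(text))


def encode_candidate(text):
    """Candidate tower (toy). Run offline when the index is built."""
    return normalize(char_ngrams(text))


def dot(a, b):
    """Dot product of two sparse vectors, iterating over the smaller one."""
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b.get(key, 0.0) for key, weight in a.items())


class VectorIndex:
    """Brute-force index; production systems would use approximate search."""

    def __init__(self, questions):
        self.items = [(q, encode_candidate(q)) for q in questions]

    def search(self, query, k=1):
        query_vec = encode_query(query)
        scored = [(dot(query_vec, vec), text) for text, vec in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


index = VectorIndex([
    "what is the height of mount everest",
    "who wrote hamlet",
    "what is the capital of france",
])
print(index.search("how tall is mount everest", k=1))
```

Because candidate vectors are computed ahead of time, the per‑query cost is one encoder pass plus a nearest‑neighbour lookup, which is why this pattern suits queries that no template covers.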

Future plans include integrating commonsense reasoning, multimodal graphs, personalized recommendation, large‑scale pre‑trained models, and low‑resource information extraction to further enhance the knowledge graph and its applications.

Tags: AI, knowledge graph, OPPO, Information Extraction, semantic parsing, entity alignment, entity classification
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
