
Baidu Event Graph: Technology, Construction, and Applications

This article presents Baidu's large‑scale event graph technology: its core concepts, the construction pipeline (event detection, representation, extraction, and causal relation mining), real‑world applications such as hotspot discovery, event tracing, resource linking, and causal reasoning, and future research directions.

DataFunTalk

Overview

Baidu has built a million‑scale event graph that updates within minutes and powers search, information feeds, and downstream industry services. The article introduces the motivation, the core concepts, and the overall architecture of the system.

Event Graph Concepts

An event graph models events, their attributes, and inter‑event relations (temporal, causal, hierarchical). Unlike a traditional entity‑centric knowledge graph, an event graph captures dynamic changes, and its events can be linked to the entities of an entity graph.
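As a rough illustration of this data model, the sketch below represents events, their attributes, and typed relations in plain Python. The class names, field names, and example events are all hypothetical and chosen for illustration; the article does not describe Baidu's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class RelationType(Enum):
    # The three inter-event relation kinds named in the text
    TEMPORAL = "temporal"
    CAUSAL = "causal"
    HIERARCHICAL = "hierarchical"


@dataclass
class Event:
    event_id: str
    event_type: str
    trigger: str
    # role name -> argument text (role names here are illustrative)
    arguments: dict[str, str] = field(default_factory=dict)
    # IDs of linked nodes in a separate entity graph (hypothetical IDs)
    entity_links: list[str] = field(default_factory=list)


@dataclass
class EventGraph:
    events: dict[str, Event] = field(default_factory=dict)
    # Each relation is stored as (source_id, relation_type, target_id)
    relations: list[tuple[str, RelationType, str]] = field(default_factory=list)

    def add_event(self, event: Event) -> None:
        self.events[event.event_id] = event

    def relate(self, src: str, rel: RelationType, dst: str) -> None:
        if src not in self.events or dst not in self.events:
            raise KeyError("both events must be added before relating them")
        self.relations.append((src, rel, dst))

    def neighbors(self, src: str, rel: RelationType) -> list[str]:
        # Targets reachable from src over one edge of the given type
        return [d for s, r, d in self.relations if s == src and r == rel]


# Made-up example events, not taken from the real graph
graph = EventGraph()
graph.add_event(Event("e1", "rate_cut", "cut",
                      {"agent": "central bank"}, ["entity:central_bank"]))
graph.add_event(Event("e2", "market_rally", "rallied",
                      {"subject": "stock index"}))
graph.relate("e1", RelationType.CAUSAL, "e2")
print(graph.neighbors("e1", RelationType.CAUSAL))
```

Keeping relations as typed edges between event IDs, separate from the entity links stored on each event, mirrors the distinction the text draws between inter‑event relations and links into an entity graph.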

Technology Overview

The system consists of four layers: data, construction, cognition, and application. This article focuses on the construction layer, which covers event detection, representation, extraction, and relation mining.

Event Construction Pipeline

Event detection: fast, accurate, and comprehensive identification of events from news streams using a multi‑task learning framework.

Event representation: automatic discovery of event types and roles from unstructured text, followed by open‑type mining and human verification.

Event extraction: structured extraction of event elements (trigger, type, arguments) via multi‑attribute joint models, reading‑comprehension‑based custom argument extraction, and semantic‑role extraction.

Relation mining: extraction of causal, temporal, co‑reference, and hierarchical relations, with special emphasis on causal reasoning.
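If the extraction step is framed as sequence labeling, the structured output (trigger and arguments) has to be recovered from per‑token tags. The sketch below decodes BIO tags into labeled spans. The BIO scheme, the label names, and the example sentence are assumptions made for illustration; the article does not specify the tagging format used in production.

```python
def decode_bio(tokens: list[str], tags: list[str]) -> list[tuple[str, str]]:
    """Collect (label, text) spans from BIO tags such as B-trigger / I-trigger / O."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    spans: list[tuple[str, str]] = []
    label, buf = None, []
    for tok, tag in zip(tokens, tags):
        prefix, _, name = tag.partition("-")
        if prefix == "B" or (prefix == "I" and name != label):
            # Start a new span; a stray I- tag is treated as a span start
            if label is not None:
                spans.append((label, " ".join(buf)))
            label, buf = name, [tok]
        elif prefix == "I":
            # Continue the current span
            buf.append(tok)
        else:
            # "O" closes any open span
            if label is not None:
                spans.append((label, " ".join(buf)))
            label, buf = None, []
    if label is not None:
        spans.append((label, " ".join(buf)))
    return spans


# Made-up example; tokens are joined with spaces, which suits English text
tokens = ["Company", "A", "acquired", "Company", "B"]
tags = ["B-acquirer", "I-acquirer", "B-trigger", "B-target", "I-target"]
elements = decode_bio(tokens, tags)
print(elements)
```

For Chinese text, which is typically tagged per character, the spans would be joined without spaces.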

Key Techniques

Multi‑task learning to share knowledge among detection models, achieving minute‑level latency and >90% accuracy.

Use of the ERNIE pre‑trained language model combined with Bi‑LSTM+CRF for joint attribute extraction.

Reading‑comprehension‑style QA for zero‑shot custom argument extraction.

Semantic‑role abstraction to decouple event types from role vocabularies, enabling efficient multi‑pointer extraction.
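To make the multi‑pointer idea concrete, the sketch below assumes a model that emits, for each shared semantic role, one start probability and one end probability per token. Spans are then decoded by pairing each confident start with the nearest confident end. The decoding rule, the threshold, the role names, and the probabilities are all illustrative assumptions; the article does not describe the exact decoding used.

```python
def decode_pointers(tokens, start_probs, end_probs, threshold=0.5, max_len=10):
    """For one role: pair each start above threshold with the nearest end at or after it."""
    spans = []
    for i, p_start in enumerate(start_probs):
        if p_start < threshold:
            continue
        # Look ahead at most max_len tokens for the closest confident end
        for j in range(i, min(i + max_len, len(tokens))):
            if end_probs[j] >= threshold:
                spans.append(" ".join(tokens[i:j + 1]))
                break
    return spans


def extract_arguments(tokens, role_scores, threshold=0.5):
    """role_scores maps role -> (start_probs, end_probs); returns role -> spans."""
    return {
        role: decode_pointers(tokens, starts, ends, threshold)
        for role, (starts, ends) in role_scores.items()
    }


# Hand-written scores standing in for model output
tokens = ["The", "central", "bank", "cut", "rates", "on", "Monday"]
role_scores = {
    "agent": ([0.1, 0.9, 0.2, 0.0, 0.0, 0.0, 0.0],
              [0.0, 0.1, 0.8, 0.0, 0.0, 0.0, 0.0]),
    "time":  ([0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.7],
              [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.9]),
    "place": ([0.0] * 7, [0.0] * 7),
}
arguments = extract_arguments(tokens, role_scores)
print(arguments)
```

Because the pointer pairs are keyed by semantic role rather than by (event type, role) combination, the number of output heads stays fixed as new event types are added, which is one plausible reading of the efficiency gain mentioned above.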

Applications

Multi‑dimensional hotspot discovery across industries and regions with minute‑level freshness.

Event timeline/trace generation for users to understand causes and consequences.

Resource association: linking news, images, videos, and other media to events to aid content creation.

Causal reasoning: building a causal graph (≈2 million nodes/edges) for real‑time inference, currently explored in finance.
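Inference over a causal graph can be pictured as a bounded graph traversal. The sketch below stores cause‑to‑effect edges with a confidence score and expands breadth‑first from a query event, multiplying confidences along each path. The scoring rule, the thresholds, and the finance‑flavored example edges are assumptions made for illustration and are not taken from Baidu's graph.

```python
from collections import defaultdict, deque


class CausalGraph:
    def __init__(self):
        # cause -> list of (effect, confidence)
        self._effects = defaultdict(list)

    def add_edge(self, cause, effect, confidence):
        if not 0.0 < confidence <= 1.0:
            raise ValueError("confidence must be in (0, 1]")
        self._effects[cause].append((effect, confidence))

    def infer(self, cause, min_conf=0.3, max_hops=3):
        """Return (effect, confidence) pairs reachable from cause, best first.

        Path confidence is the product of edge confidences (an assumption).
        """
        best = {}
        queue = deque([(cause, 1.0, 0)])
        while queue:
            node, conf, hops = queue.popleft()
            if hops == max_hops:
                continue
            for effect, edge_conf in self._effects.get(node, []):
                path_conf = conf * edge_conf
                # Skip weak paths, paths no better than a known one, and loops to the query
                if (path_conf < min_conf
                        or path_conf <= best.get(effect, 0.0)
                        or effect == cause):
                    continue
                best[effect] = path_conf
                queue.append((effect, path_conf, hops + 1))
        return sorted(best.items(), key=lambda kv: -kv[1])


# Made-up edges for illustration only
cg = CausalGraph()
cg.add_edge("oil price rises", "airline costs rise", 0.8)
cg.add_edge("airline costs rise", "airline stocks fall", 0.6)
cg.add_edge("oil price rises", "energy stocks rise", 0.7)
effects = cg.infer("oil price rises")
for effect, conf in effects:
    print(f"{effect}: {conf:.2f}")
```

The hop limit and the confidence floor keep the traversal small, which matters when answers are needed in real time on a graph with millions of nodes or edges.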

Dataset and Evaluation

Baidu released the DuEE dataset, the largest Chinese event extraction benchmark, and provides an open leaderboard for community evaluation.
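Event extraction systems are usually compared with precision, recall, and F1. The sketch below computes exact‑match micro‑averaged scores over (event type, role, argument) triples. This is a common simplification and an assumption here; the official DuEE scoring rules are not described in this article and may differ. The example triples are made up.

```python
def micro_f1(gold, pred):
    """gold and pred are parallel lists (one item per sentence) of sets of
    (event_type, role, argument) triples. Returns (precision, recall, f1)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same sentences")
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


gold = [{("acquisition", "acquirer", "Company A"),
         ("acquisition", "target", "Company B")}]
pred = [{("acquisition", "acquirer", "Company A"),
         ("acquisition", "target", "Company C")}]
precision, recall, f1 = micro_f1(gold, pred)
print(precision, recall, f1)
```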

Summary and Outlook

The article concludes with a recap of the event graph’s value, technical achievements, and future directions: enhancing construction capabilities, expanding application scenarios, and deepening industry‑specific explorations.

Tags: AI, knowledge graph, Baidu, Information Extraction, event graph, causal reasoning