Tag

cross‑modal pre‑training


DataFunSummit
Feb 28, 2023 · Artificial Intelligence

Baidu Document Intelligence Technology Overview and Applications

This article surveys Baidu's document intelligence technologies: the ERNIE‑Layout multimodal large model, the prompt‑based DocPrompt extraction system, layout and table understanding techniques, and PaddleNLP open‑source integration. It covers their architectures, challenges, solutions, performance benchmarks, and real‑world application cases across multiple industries.

DocPrompt · ERNIE-Layout · Large Language Models
19 min read