Tag

chart extraction

1 view collected around this technical thread.

Baidu Geek Talk
Jul 26, 2021 · Artificial Intelligence

Document Rendering and Structured Extraction Techniques in Baidu Wenku

Baidu Wenku converts every document type to PDF and parses the PDF into a proprietary format. On PC it renders pages with absolute-position layout; for mobile devices it transforms that data into flow-type structural data by re-typesetting the layout, extracting OOXML structures, and detecting charts. The result is adaptive layouts, accurate formula rendering, and interactive chart extraction.

Mobile Optimization · OOXML parsing · PDF conversion
0 likes · 12 min read