
PaddleOCR v2.2 Release: PP-Structure for Document Layout Analysis and Table Recognition

PaddleOCR v2.2 launches PP‑Structure, a Python‑installable toolkit that combines PP‑YOLO v2 layout analysis (classifying text, title, table, image, list) with RARE‑based table recognition to extract structured content and export editable Excel files, while supporting custom training and simple command‑line use.

Baidu Geek Talk

PaddleOCR v2.2 has been released. It introduces PP-Structure, a toolkit for document layout analysis and table recognition. Given a document image, PP-Structure divides it into regions of five types: text, title, table, image, and list (developed in collaboration with Layout-Parser). Text, title, image, and list regions are extracted as text fields using PP-OCR, while table regions go through structured analysis and are output as Excel files. PP-Structure ships as a Python whl package with a command-line tool, and it supports custom training for both the layout analysis and table structure tasks.
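
The split between the two branches can be pictured with a small routing sketch. This is only an illustration of the flow described above, not PP-Structure's actual code: `route_regions`, `run_ocr`, and `run_table` are hypothetical names, and the region dictionaries use an assumed `type`/`bbox` layout.

```python
from typing import Callable, Dict, List

# Region categories listed in the release notes.
OCR_TYPES = {"text", "title", "image", "list"}
TABLE_TYPE = "table"


def route_regions(
    regions: List[Dict],
    run_ocr: Callable[[Dict], str],
    run_table: Callable[[Dict], str],
) -> List[Dict]:
    """Send each layout region to the OCR branch or the table branch by type."""
    results = []
    for region in regions:
        kind = region["type"].lower()
        if kind == TABLE_TYPE:
            output = run_table(region)   # structured analysis, Excel output
        elif kind in OCR_TYPES:
            output = run_ocr(region)     # plain text field
        else:
            raise ValueError(f"unknown region type: {region['type']}")
        results.append({"type": kind, "bbox": region["bbox"], "output": output})
    return results


# Stand-in callables (hypothetical) so the sketch runs without any model.
demo_regions = [
    {"type": "Title", "bbox": [0, 0, 100, 20]},
    {"type": "Table", "bbox": [0, 30, 100, 90]},
]
demo = route_regions(
    demo_regions,
    run_ocr=lambda region: "recognized text",
    run_table=lambda region: "table.xlsx",
)
```

In a real pipeline the two callables would wrap the OCR and table recognition models; here they only return placeholder strings.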

PP-Structure has two core components: layout analysis and table recognition. Layout analysis uses PP-YOLO v2, an efficient detection algorithm from PaddleDetection, and reaches an mAP of 93.6 on TableBank and 96.2 on PubLayNet, with a processing time of 66.6 ms on an NVIDIA Tesla P40. Table recognition uses RARE, an attention-based image description model, to convert table images into editable Excel files.

Table recognition runs through six modules: (1) text detection, (2) text recognition, (3) table structure prediction, (4) cell coordinate aggregation, (5) cell text aggregation, and (6) Excel export. The first two locate and read the text, the next three recover the table structure and assign the recognized text to cells, and the last writes the Excel file.
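
To make the cell text aggregation step (5) more concrete, here is a minimal sketch that assigns each recognized text box to the cell it overlaps most. Matching by overlap ratio is an assumption made for illustration; the release notes do not say how PP-Structure performs this matching, and `overlap_ratio` and `aggregate_cell_text` are hypothetical helpers.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def overlap_ratio(text_box: Box, cell_box: Box) -> float:
    """Fraction of the text box's area that lies inside the cell box."""
    x0 = max(text_box[0], cell_box[0])
    y0 = max(text_box[1], cell_box[1])
    x1 = min(text_box[2], cell_box[2])
    y1 = min(text_box[3], cell_box[3])
    inter = max(0.0, x1 - x0) * max(0.0, y1 - y0)
    area = (text_box[2] - text_box[0]) * (text_box[3] - text_box[1])
    return inter / area if area > 0 else 0.0


def aggregate_cell_text(
    cells: List[Tuple[int, int, Box]],
    texts: List[Tuple[Box, str]],
    n_rows: int,
    n_cols: int,
) -> List[List[str]]:
    """Place each text string in the (row, col) cell it overlaps the most."""
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for text_box, string in texts:
        best = max(cells, key=lambda cell: overlap_ratio(text_box, cell[2]))
        if overlap_ratio(text_box, best[2]) > 0:
            row, col, _ = best
            # Several text lines may land in one cell; join them in input order.
            grid[row][col] = (grid[row][col] + " " + string).strip()
    return grid


# A made-up 2x2 table: cell boxes as (row, col, box), text boxes from OCR.
cells = [
    (0, 0, (0, 0, 50, 20)), (0, 1, (50, 0, 100, 20)),
    (1, 0, (0, 20, 50, 40)), (1, 1, (50, 20, 100, 40)),
]
texts = [
    ((5, 2, 40, 18), "Name"), ((55, 2, 90, 18), "Qty"),
    ((5, 22, 40, 38), "Pen"), ((55, 22, 70, 38), "3"),
]
grid = aggregate_cell_text(cells, texts, n_rows=2, n_cols=2)
```

The resulting `grid` is a list of rows, which is the shape an export step would need before writing a spreadsheet.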

PP-Structure is easy to use: after installing the Python whl package, users can try it out with a few lines of code. Detailed documentation is available at https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README_ch.md. For more information and support, users can join the official discussion group by scanning the QR code and replying with 'OCR'.
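
As a rough idea of what that code looks like, the sketch below wraps the `PPStructure` class from the `paddleocr` package. It assumes `paddleocr` and `opencv-python` are installed and that each result item is a dictionary with a `type` field, which is how the linked README appears to present results; check the README for the exact names. `analyze_document` and `count_region_types` are hypothetical helpers, not part of the package.

```python
from collections import Counter
from typing import Dict, List


def analyze_document(img_path: str) -> List[Dict]:
    """Run PP-Structure on one image (requires paddleocr and opencv-python)."""
    # Imported lazily so the helper below can be used without the packages.
    import cv2
    from paddleocr import PPStructure

    engine = PPStructure(show_log=True)
    img = cv2.imread(img_path)
    if img is None:
        raise FileNotFoundError(img_path)
    return engine(img)


def count_region_types(result: List[Dict]) -> Counter:
    """Tally regions by their 'type' field (field name is an assumption)."""
    return Counter(str(item.get("type", "unknown")).lower() for item in result)


# Hand-written stand-in for a real result, so the tally can be tried offline.
fake_result = [{"type": "Text"}, {"type": "Table"}, {"type": "Text"}]
summary = count_region_types(fake_result)
```

With the packages installed, `count_region_types(analyze_document("page.png"))` would summarize a real page; `page.png` is a placeholder path.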
