Tag

layout masking

1 view collected around this technical thread.

AntTech
Jul 31, 2023 · Artificial Intelligence

LayoutMask: Enhancing Text-Layout Interaction in Multi-modal Pre-training for Document Understanding

LayoutMask is a multi-modal pre-training model that replaces global 1D positions with local 1D positions and adds Whole Word Masking, Layout-Aware Masking, and Masked Position Modeling, achieving state-of-the-art results on a range of visually-rich document understanding tasks.

AI · Document Understanding · Multimodal Pretraining
15 min read