AntTech
Jul 31, 2023 · Artificial Intelligence
LayoutMask: Enhancing Text-Layout Interaction in Multi-modal Pre-training for Document Understanding
LayoutMask is a multi‑modal pre‑training model that replaces global 1D position with local (in‑segment) 1D position and adds Whole Word Masking, Layout‑Aware Masking, and Masked Position Modeling, achieving state‑of‑the‑art results on a range of visually‑rich document understanding tasks.
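To make the difference between global and local 1D position concrete, here is a minimal sketch. It assumes the tokens are already grouped into layout segments (for example, by an OCR engine) and that positions start at 1. The function name `position_ids` and the sample segments are hypothetical and for illustration only; this is not the authors' implementation.

```python
from typing import List, Tuple


def position_ids(segments: List[List[str]]) -> Tuple[List[int], List[int]]:
    """Return (global_ids, local_ids) for tokens grouped into segments.

    Global 1D position counts tokens across the whole document.
    Local 1D position restarts at 1 inside every segment (assumed start index).
    """
    global_ids: List[int] = []
    local_ids: List[int] = []
    next_global = 1
    for segment in segments:
        for offset, _token in enumerate(segment, start=1):
            global_ids.append(next_global)  # running index over the document
            local_ids.append(offset)        # index within the current segment
            next_global += 1
    return global_ids, local_ids


# Hypothetical document with two layout segments.
segments = [["Invoice", "No", "."], ["Total", "Due"]]
global_ids, local_ids = position_ids(segments)
print(global_ids)
print(local_ids)
```

Because local positions carry no cross-segment reading order, a model using them would presumably have to infer that order from the 2D layout, which fits the text-layout interaction the title refers to.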
AI · Document Understanding · Multimodal Pretraining
15 min read