Tagged articles
5 articles
HyperAI Super Neural
Nov 11, 2025 · Artificial Intelligence

How DeepSeek-OCR Achieves SOTA Using Ultra‑Low Visual Token Counts

DeepSeek-OCR uses a visual‑compression approach that pairs DeepEncoder with the DeepSeek3B‑MoE‑A570M decoder to represent document text with far fewer visual tokens. It reaches up to 97% OCR accuracy and surpasses GOT‑OCR2.0 and MinerU2.0 on OmniDocBench. The article also includes a one‑click deployment tutorial.

DeepEncoder · LLM · OCR
6 min read
AI2ML AI to Machine Learning
Oct 23, 2025 · Artificial Intelligence

Why Visually‑Rich Document Understanding Looks Like High‑End Docs: A Static Multimodal Overview

The article surveys the evolution of Visually‑Rich Document Understanding (VRDU), highlighting pioneering Chinese OCR research, the LayoutLM family, recent multimodal model breakthroughs, open‑source toolkits, and practical recommendations for handling diverse document types and tasks.

LayoutLM · Multimodal OCR · Visually-Rich Document Understanding
11 min read