Tag

document processing

1 view collected around this technical thread.

Sohu Tech Products
Jan 8, 2025 · Artificial Intelligence

Multimodal RAG: Implementation Paths and Development Prospects

The talk outlines three implementation routes for Multimodal RAG: OCR‑based object recognition, transformer encoder‑decoder encoding, and Visual Language Model (VLM) processing. It explains the ColPali late‑interaction method for multi‑dimensional vector matching, addresses scaling tensors with binarization and reranking, and recommends a hybrid long‑term strategy in which VLMs excel on abstract imagery while traditional OCR remains valuable.

ColPali · Multimodal RAG · OCR
0 likes · 10 min read
Lobster Programming
Nov 1, 2024 · Backend Development

How to Parse PDFs and Extract Metadata with Apache Tika and Spring Boot

This guide explains Apache Tika's document parsing capabilities, shows how to download and run the Tika app, demonstrates extracting text and metadata from a PDF, and provides step‑by‑step instructions for integrating Tika into a Spring Boot project with full code examples.

Apache Tika · Java · PDF parsing
0 likes · 7 min read
Python Programming Learning Circle
May 13, 2024 · Fundamentals

Using python-docx to Create and Manipulate Word Documents in Python

This article introduces the python-docx library, explains how to install it, and provides a complete code example that creates a Word document with headings, styled paragraphs, images, and tables, then saves the file, helping Python developers automate document processing.

document processing · python-docx · tutorial
0 likes · 5 min read
Python Programming Learning Circle
Feb 18, 2024 · Backend Development

Introduction, Installation, and Usage of PyMuPDF (Python Bindings for MuPDF)

This article provides a comprehensive overview of PyMuPDF, covering its purpose as Python bindings for the lightweight MuPDF viewer, detailed installation instructions, essential dependencies, naming conventions, and extensive usage examples for opening documents, accessing pages, extracting text and images, manipulating PDFs, and saving changes.

MuPDF · PDF · PyMuPDF
0 likes · 12 min read
Python Programming Learning Circle
Nov 30, 2023 · Fundamentals

Introduction and Usage Guide for PyMuPDF (Python Bindings for MuPDF)

This article provides a comprehensive overview of PyMuPDF, covering its relationship to MuPDF, core features, installation methods, import conventions, and detailed usage examples for opening documents, handling pages, extracting text and images, and performing PDF-specific operations such as merging, splitting, and saving.

MuPDF · PDF · PyMuPDF
0 likes · 12 min read
DataFunTalk
Nov 10, 2022 · Artificial Intelligence

A Comprehensive Overview of OCR Technology Development and Engineering Practices

This article reviews the 40‑year evolution of Optical Character Recognition, discusses its integration with Intelligent Document Processing, outlines recent research hotspots such as scene text recognition and domain‑specific symbol detection, and shares practical engineering experiences and future directions from Datagrand.

Computer Vision · Intelligent Document Processing · OCR
0 likes · 24 min read
Sohu Tech Products
Sep 28, 2022 · Fundamentals

PyMuPDF (Python bindings for MuPDF) – Introduction, Features, Installation and Usage Guide

This article provides a comprehensive overview of PyMuPDF, the Python binding for the lightweight MuPDF library, covering its purpose, supported document formats, key features such as rendering, text extraction and PDF manipulation, installation methods, and detailed code examples for common operations.

MuPDF · PDF · PyMuPDF
0 likes · 13 min read
Laiye Technology Team
Jul 16, 2022 · Artificial Intelligence

Seal (Stamp) Recognition in Intelligent Document Processing: Challenges, Methods, and Experiments

This article explains how intelligent document processing uses deep‑learning‑based seal detection and OCR techniques—enhanced YOLOv5, multi‑label loss, combined NMS, and end‑to‑end models such as Mask‑TextSpotter, ABCNet, PGNet, and TrOCR—to overcome diverse stamp styles, background interference, and image quality issues, presenting experimental results that surpass commercial OCR vendors.

AI · OCR · deep learning
0 likes · 13 min read
Python Programming Learning Circle
May 9, 2022 · Fundamentals

Introduction and Usage Guide for PyMuPDF (Python Bindings for MuPDF)

This article provides a comprehensive overview of PyMuPDF, the Python binding for the lightweight MuPDF library, covering its installation, core features such as page rendering, text and image extraction, PDF manipulation, and detailed code examples for common document‑processing tasks.

MuPDF · PDF · PyMuPDF
0 likes · 12 min read
Python Programming Learning Circle
Jan 7, 2022 · Fundamentals

Using python-docx: Document Structure and Basic Operations

This article introduces the python‑docx library, explains its document model—including Document, Paragraph, Run, and Table objects—and provides practical Python code examples for creating, modifying, and styling Word documents, inserting headings, page breaks, tables, and images.

Code Example · document processing · python
0 likes · 6 min read
Architect
Jun 22, 2020 · Fundamentals

Fundamentals of Search Engine Architecture: Document Processing, Query Processing, Indexing, and Matching

This article explains the core components and processing steps of a search engine—document processor, query processor, indexing, and matching—detailing how documents are normalized, tokenized, filtered, weighted, and stored in an inverted index to support effective information retrieval.

Indexing · Information Retrieval · document processing
0 likes · 20 min read
Python Programming Learning Circle
Oct 25, 2019 · Backend Development

Automate Word with Python: Master win32com for Document Manipulation

This tutorial explains how to use Python's win32com library to control Microsoft Word, covering installation, creating and displaying documents, working with Selection, Range, Font, ParagraphFormat, PageSetup and Styles objects, and providing a complete example that formats a document to meet national standards.

com · document processing · python-automation
0 likes · 14 min read