
Integrating Tess4J OCR into a Spring Boot Backend Service

This tutorial walks through setting up a Spring Boot backend, adding the Tess4J OCR library, creating a service and REST controller to recognize text from both local files and remote image URLs, and provides testing steps and deployment tips.

Top Architect

This article explains how to integrate the Tess4J OCR engine into a Spring Boot application to recognize text from local and remote images, providing a step‑by‑step guide for developers.

Extracting text from images is increasingly used for data entry and automation. Tess4J is a Java wrapper for the Tesseract OCR engine, which makes that capability available to JVM applications such as a Spring Boot service.

Before starting, make sure you have JDK 1.8 or higher, Maven, and a recent Spring Boot version. Tess4J 4.x or newer is added as a Maven dependency in the next step, so it does not need a separate installation.

Add the following Maven dependency to your pom.xml to include Tess4J:

<dependencies>
    <dependency>
        <groupId>net.sourceforge.tess4j</groupId>
        <artifactId>tess4j</artifactId>
        <version>4.5.4</version>
    </dependency>
    <!-- other dependencies -->
</dependencies>

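Download the required language data files (tessdata) from the official repository (e.g., GitCode) or the provided Baidu Cloud link. The service below sets the language to chi_sim, so the tessdata directory must contain chi_sim.traineddata.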
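A missing or misplaced language file is a common reason for OCR to fail at runtime, so it can help to check the tessdata directory before starting the service. The following is a minimal sketch, not part of Tess4J: the class name TessdataCheck and the method missingLanguages are hypothetical, and the directory path is the same placeholder used later in this article.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Tess4J): checks the tessdata directory before OCR runs.
class TessdataCheck {

    // Returns the language codes whose <code>.traineddata file is missing from tessdataDir.
    static List<String> missingLanguages(Path tessdataDir, String... languages) {
        List<String> missing = new ArrayList<>();
        for (String lang : languages) {
            if (!Files.isRegularFile(tessdataDir.resolve(lang + ".traineddata"))) {
                missing.add(lang);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder path; pass the real tessdata directory as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "your/tessdata/path");
        System.out.println("Missing language files: " + missingLanguages(dir, "chi_sim"));
    }
}
```

An empty list means every requested language file is present in the directory.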

Create an OcrService class that uses Tess4J to perform OCR on a given File or a remote image URL:

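import java.io.File;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;
import org.springframework.stereotype.Service;

@Service
public class OcrService {
    public String recognizeText(File imageFile) throws TesseractException {
        Tesseract tesseract = new Tesseract();
        // Directory that contains the *.traineddata files
        tesseract.setDatapath("your/tessdata/path");
        // Simplified Chinese; requires chi_sim.traineddata in the data path
        tesseract.setLanguage("chi_sim");
        return tesseract.doOCR(imageFile);
    }

    public String recognizeTextFromUrl(String imageUrl) throws Exception {
        // Use a unique temp file so concurrent requests do not overwrite each other
        Path tempFile = Files.createTempFile("ocr-", ".jpg");
        // try-with-resources closes the download stream even if copying fails
        try (InputStream in = new URL(imageUrl).openStream()) {
            Files.copy(in, tempFile, StandardCopyOption.REPLACE_EXISTING);
            return recognizeText(tempFile.toFile());
        } finally {
            Files.deleteIfExists(tempFile);
        }
    }
}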
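Because recognizeTextFromUrl opens whatever URL the caller supplies, a value such as a file: URL would make the server read its own disk. One possible guard, sketched below, is to accept only http and https URLs with a host before downloading. The class ImageUrlValidator and its method isAllowed are hypothetical names introduced for illustration; this check is an assumption about what a deployment might want, not something the original service does.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: decides whether a user-supplied image URL may be downloaded.
class ImageUrlValidator {

    // Accepts only syntactically valid http/https URLs that include a host.
    static boolean isAllowed(String imageUrl) {
        if (imageUrl == null) {
            return false;
        }
        try {
            URI uri = new URI(imageUrl.trim());
            String scheme = uri.getScheme();
            boolean http = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            return http && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Malformed input (for example, containing spaces) is rejected
            return false;
        }
    }
}
```

A service method could call ImageUrlValidator.isAllowed(imageUrl) first and throw an IllegalArgumentException when it returns false. This only filters by scheme and host syntax; it does not stop requests to internal network addresses.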

Expose two REST endpoints via an OcrController for uploading images and recognizing text from URLs:

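import java.io.File;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api/ocr")
public class OcrController {
    private final OcrService ocrService;

    public OcrController(OcrService ocrService) {
        this.ocrService = ocrService;
    }

    // POST /api/ocr/upload with a multipart form field named "file"
    @PostMapping("/upload")
    public ResponseEntity<String> uploadImage(@RequestParam("file") MultipartFile file) {
        try {
            // Keep only the last path segment of the client-supplied name,
            // so the name cannot carry directory components
            String originalName = file.getOriginalFilename();
            String safeName = new File(originalName == null ? "upload.jpg" : originalName).getName();
            File convFile = new File(System.getProperty("java.io.tmpdir"), safeName);
            file.transferTo(convFile);
            String result = ocrService.recognizeText(convFile);
            return ResponseEntity.ok(result);
        } catch (Exception e) {
            e.printStackTrace();
            return ResponseEntity.badRequest().body("OCR failed: " + e.getMessage());
        }
    }

    // GET /api/ocr/recognize-url?imageUrl=<URL of the image>
    @GetMapping("/recognize-url")
    public ResponseEntity<String> recognizeFromUrl(@RequestParam("imageUrl") String imageUrl) {
        try {
            String result = ocrService.recognizeTextFromUrl(imageUrl);
            return ResponseEntity.ok(result);
        } catch (Exception e) {
            e.printStackTrace();
            return ResponseEntity.badRequest().body("OCR from URL failed: " + e.getMessage());
        }
    }
}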

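Test the service in two ways: send a POST request to /api/ocr/upload with the image in a multipart field named file, and send a GET request to /api/ocr/recognize-url with the image address in the imageUrl query parameter. Screenshots in the original article illustrate the expected results.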
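The URL endpoint can also be exercised from plain Java. The sketch below is one way to do it with the JDK's HttpURLConnection; the class name OcrClientSketch is hypothetical, and the base address http://localhost:8080 assumes the application runs locally on Spring Boot's default port. The image URL must be percent-encoded because it travels inside a query parameter.

```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Hypothetical test client for the /api/ocr/recognize-url endpoint.
class OcrClientSketch {

    // Builds the request URL, encoding the image URL so it is safe as a query value.
    static String recognizeUrlEndpoint(String baseUrl, String imageUrl)
            throws UnsupportedEncodingException {
        return baseUrl + "/api/ocr/recognize-url?imageUrl=" + URLEncoder.encode(imageUrl, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: OcrClientSketch <image-url>");
            return;
        }
        // Assumption: the service listens on localhost:8080
        String endpoint = recognizeUrlEndpoint("http://localhost:8080", args[0]);
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestMethod("GET");
        int status = conn.getResponseCode();
        // Error responses (4xx/5xx) are delivered on the error stream
        InputStream body = status < 400 ? conn.getInputStream() : conn.getErrorStream();
        if (body == null) {
            System.out.println(status + ": <no response body>");
            return;
        }
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(body, StandardCharsets.UTF_8))) {
            System.out.println(status + ": " + reader.lines().collect(Collectors.joining("\n")));
        }
    }
}
```

A 200 status with the recognized text indicates success; the controller returns 400 with an error message when recognition fails.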

By following these steps you now have a functional Spring Boot service capable of OCR for both local and remote images, which can be further customized for multi‑language support or other specific requirements.
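For multi-language support, Tesseract accepts several language codes joined with a plus sign, for example chi_sim+eng, and each code needs its own traineddata file in the data path. The helper below is a small sketch of building that string; OcrLanguages and combine are hypothetical names, not part of Tess4J.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: builds the language string passed to Tesseract.setLanguage.
class OcrLanguages {

    // Joins language codes with "+", trimming whitespace and dropping blanks and duplicates.
    static String combine(String... codes) {
        Set<String> unique = new LinkedHashSet<>();
        for (String code : codes) {
            if (code != null && !code.trim().isEmpty()) {
                unique.add(code.trim());
            }
        }
        if (unique.isEmpty()) {
            throw new IllegalArgumentException("at least one language code is required");
        }
        return String.join("+", unique);
    }
}
```

In OcrService, the call could then read tesseract.setLanguage(OcrLanguages.combine("chi_sim", "eng")) to recognize mixed Chinese and English text.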
