
Integrating Tess4J OCR into a Spring Boot Application

This guide walks through setting up a Spring Boot project, adding Tess4J dependencies, configuring language data, implementing an OCR service class, exposing REST endpoints for local and remote image recognition, and testing the OCR functionality end‑to‑end.

Code Ape Tech Column

In this tutorial we explore how to integrate Tess4J, a Java wrapper for the Tesseract OCR engine, into a Spring Boot application to recognize text from both local and remote images.

Background : As image‑based text extraction becomes increasingly important for data entry and automation, Tess4J provides a powerful interface for OCR in Java applications. Integrating it with Spring Boot enables a clean, service‑oriented solution.

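Part 1 – Environment Setup : Ensure you have JDK 1.8+, Maven, a Spring Boot version that supports your JDK (Spring Boot 3.x requires Java 17 or later, so JDK 1.8 limits you to Spring Boot 2.x), and Tess4J 4.x or newer.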

Part 2 – Add Dependency (pom.xml):

<dependencies>
    <dependency>
        <groupId>net.sourceforge.tess4j</groupId>
        <artifactId>tess4j</artifactId>
        <version>4.5.4</version>
    </dependency>
    <!-- other dependencies -->
</dependencies>

Choose a Tess4J release that is compatible with the JDK and Spring Boot versions you are using.

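Part 3 – Add Tessdata Language Pack : Download the required language files (e.g., chi_sim.traineddata) from the tessdata repository (https://gitcode.com/tesseract-ocr/tessdata/tree/main) or the Baidu Cloud link provided in the original article, and place them in a tessdata directory that the application can read. The OCR service in Part 4 points Tess4J at that directory with setDatapath.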
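A missing language file typically only surfaces when the first OCR call fails, so it can help to check the tessdata directory at startup. The following is a minimal sketch under stated assumptions: TessdataCheck and missingLanguages are hypothetical names, not part of Tess4J, and the language string uses Tesseract's convention of joining several codes with "+" (for example "chi_sim+eng").

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tess4J
final class TessdataCheck {

    private TessdataCheck() {}

    /**
     * Returns the language codes from a Tesseract language string
     * (for example "chi_sim+eng") that have no matching
     * .traineddata file in tessdataDir.
     */
    static List<String> missingLanguages(Path tessdataDir, String languages) {
        List<String> missing = new ArrayList<>();
        for (String code : languages.split("\\+")) {
            String trimmed = code.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Each language is expected as <code>.traineddata in the directory
            if (!Files.isRegularFile(tessdataDir.resolve(trimmed + ".traineddata"))) {
                missing.add(trimmed);
            }
        }
        return missing;
    }
}
```

If the returned list is not empty, the application could log the missing codes or refuse to start rather than fail on the first request.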

Part 4 – Create OCR Service Class :

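import java.io.File;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;
import org.springframework.stereotype.Service;

@Service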
public class OcrService {

    public String recognizeText(File imageFile) throws TesseractException {
        Tesseract tesseract = new Tesseract();
        // Set the path to tessdata (optional for standard English).
        // The path value is missing from the source; supply your own tessdata directory here.
        tesseract.setDatapath("");
        tesseract.setLanguage("chi_sim");
        return tesseract.doOCR(imageFile);
    }

    public String recognizeTextFromUrl(String imageUrl) throws Exception {
        URL url = new URL(imageUrl);
        // try-with-resources closes the stream even if the copy fails
        try (InputStream in = url.openStream()) {
            Files.copy(in, Paths.get("downloaded.jpg"), StandardCopyOption.REPLACE_EXISTING);
        }
        File imageFile = new File("downloaded.jpg");
        return recognizeText(imageFile);
    }
}

The recognizeText(File) method handles OCR for a local file, while recognizeTextFromUrl(String) downloads a remote image before processing.
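Because the download always writes to the same downloaded.jpg, two concurrent requests could overwrite each other's image. One possible alternative, sketched here with hypothetical names (RemoteImageFetcher, downloadToTempFile), is to write each download to its own temp file. The sketch keeps the remote file's extension, which may matter if the image format is inferred from the file name (an assumption worth checking for your Tess4J version), and falls back to ".jpg" as the original code does.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original service
final class RemoteImageFetcher {

    private RemoteImageFetcher() {}

    /** Downloads the resource at imageUrl into a new temp file and returns its path. */
    static Path downloadToTempFile(String imageUrl) throws IOException {
        URI uri = URI.create(imageUrl);
        String path = uri.getPath() == null ? "" : uri.getPath();
        int dot = path.lastIndexOf('.');
        int slash = path.lastIndexOf('/');
        // Keep the remote extension; fall back to ".jpg" as the original code assumed
        String suffix = dot > slash ? path.substring(dot) : ".jpg";
        Path target = Files.createTempFile("ocr-download-", suffix);
        try (InputStream in = uri.toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // Do not leave an empty temp file behind when the download fails
            Files.deleteIfExists(target);
            throw e;
        }
        return target;
    }
}
```

The caller would pass target.toFile() to recognizeText and delete the file afterwards, ideally in a finally block.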

Part 5 – Build REST Controller :

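import java.io.File;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController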
@RequestMapping("/api/ocr")
public class OcrController {

    private final OcrService ocrService;

    // Constructor injection
    public OcrController(OcrService ocrService) {
        this.ocrService = ocrService;
    }

    @PostMapping("/upload")
    public ResponseEntity<String> uploadImage(@RequestParam("file") MultipartFile file) {
        try {
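            // Use only the base name: the client-supplied filename may contain path segments
            String original = file.getOriginalFilename();
            String safeName = new File(original == null ? "upload" : original).getName();
            File convFile = new File(System.getProperty("java.io.tmpdir"), safeName);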
            file.transferTo(convFile);
            String result = ocrService.recognizeText(convFile);
            return ResponseEntity.ok(result);
        } catch (Exception e) {
            e.printStackTrace();
            return ResponseEntity.badRequest().body("Recognition error: " + e.getMessage());
        }
    }

    @GetMapping("/recognize-url")
    public ResponseEntity<String> recognizeFromUrl(@RequestParam("imageUrl") String imageUrl) {
        try {
            String result = ocrService.recognizeTextFromUrl(imageUrl);
            return ResponseEntity.ok(result);
        } catch (Exception e) {
            e.printStackTrace();
            return ResponseEntity.badRequest().body("URL recognition error: " + e.getMessage());
        }
    }
}

The controller exposes two endpoints: /api/ocr/upload for local file uploads and /api/ocr/recognize-url for processing images from a URL.
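The upload endpoint receives a filename chosen by the client, so it is worth validating before the file is written to disk. The sketch below is one possible approach, not the article's own code: UploadNames, its methods, and the extension allow-list are all hypothetical and should be adjusted to the formats your Tesseract build can read.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for validating client-supplied upload names
final class UploadNames {

    // Assumed allow-list; adjust to the image formats you actually support
    private static final Set<String> ALLOWED =
            Set.of("png", "jpg", "jpeg", "tif", "tiff", "bmp", "gif");

    private UploadNames() {}

    /** Strips any directory part (forward or backward slashes) from a client-supplied name. */
    static String baseName(String originalFilename) {
        if (originalFilename == null) {
            return "";
        }
        int cut = Math.max(originalFilename.lastIndexOf('/'), originalFilename.lastIndexOf('\\'));
        return originalFilename.substring(cut + 1);
    }

    /** True when the base name ends in an extension from the allow-list. */
    static boolean hasAllowedExtension(String originalFilename) {
        String name = baseName(originalFilename);
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return false;
        }
        return ALLOWED.contains(name.substring(dot + 1).toLowerCase(Locale.ROOT));
    }
}
```

A controller could call hasAllowedExtension first and return a 400 response when it is false, then use baseName when building the temp file path. Using both checks together also rejects names such as "..", which have no valid extension.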

Part 6 – Testing : Use tools like Postman or curl to POST a local image to /api/ocr/upload and GET /api/ocr/recognize-url?imageUrl=YOUR_IMAGE_URL for remote testing. Screenshots in the original article illustrate successful local and remote OCR results.
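When calling the URL endpoint, the imageUrl value should be percent-encoded, otherwise a "?" or "&" inside the image URL would be read as part of the outer query string. As a rough sketch, the request can also be issued from Java with the JDK's built-in HttpClient (this assumes Java 11 or later, which is newer than the JDK 1.8 minimum above). OcrUrlClient is a hypothetical name, and http://localhost:8080 in the usage note is an assumption based on Spring Boot's default port.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical test client for the /api/ocr/recognize-url endpoint
final class OcrUrlClient {

    private OcrUrlClient() {}

    /** Builds the GET URI, percent-encoding the image URL so its own '?' and '&' survive. */
    static URI recognizeUri(String baseUrl, String imageUrl) {
        String encoded = URLEncoder.encode(imageUrl, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/ocr/recognize-url?imageUrl=" + encoded);
    }

    /** Calls the endpoint and returns the response body (recognized text or the error message). */
    static String recognize(String baseUrl, String imageUrl) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(recognizeUri(baseUrl, imageUrl)).GET().build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

For example, OcrUrlClient.recognize("http://localhost:8080", "YOUR_IMAGE_URL") would return the text that the service recognized, or the error message from the controller.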

Conclusion : Following these steps gives you a functional Spring Boot service capable of OCR on both local and remote images. Adjust the language pack and configuration as needed for multilingual scenarios, and consider further optimizations for production use.

Written by

Code Ape Tech Column

Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
