Implementing OCR in Java with SpringBoot and Tess4j
This article demonstrates how to build a lightweight OCR service in Java using SpringBoot and the Tess4j library, covering dependency setup, Tesseract engine initialization, RESTful API creation, training data options, and deployment considerations.
1. Introduction
Have you ever needed to copy useful text from an image but had to type it manually? With Tess4j, Java can perform OCR (Optical Character Recognition) in a few lines of code. The main setup work is providing the Tesseract native libraries and a training data file.
2. Feature Demonstration
First, see the final effect before implementation.
3. Implementation
1. Description
We use SpringBoot together with Tess4j, a Java wrapper for Tesseract, to implement OCR. Tess4j enables easy OCR integration for scanned documents, image text extraction, and screenshot reading, and can be exposed as a lightweight RESTful service.
2. Code Implementation
2.1 Add Dependency
<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <!-- Add a <version> element here unless a parent POM or BOM already manages tess4j -->
</dependency>
2.2 Initialize Tesseract Engine
Project deployment:
Calling new ClassPathResource("tess_data").getFile().getAbsolutePath() may fail after packaging, because a resource inside a jar is not a file on disk. Consider copying the resource to a real directory first, with a utility like TensorflowUtil, before loading it.
On Linux, ensure net.sourceforge.tess4j.TessAPI can be initialized and all native dependencies (the Tesseract and Leptonica shared libraries) are present.
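The copy step described above can be sketched with the JDK alone. This is a minimal sketch, not the TensorflowUtil helper from the source: the class name TessdataExtractor, its extract method, and the explicit list of file names are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: copies named classpath resources into a temp directory. */
final class TessdataExtractor {

    private TessdataExtractor() {}

    /**
     * Copies each named file under resourceDir on the classpath into a fresh
     * temp directory and returns that directory. File names are passed
     * explicitly because a directory packed inside a jar cannot be listed
     * through java.io.File.
     */
    static Path extract(ClassLoader loader, String resourceDir, String... fileNames) {
        try {
            Path target = Files.createTempDirectory("tessdata");
            for (String name : fileNames) {
                String resource = resourceDir + "/" + name;
                try (InputStream in = loader.getResourceAsStream(resource)) {
                    if (in == null) {
                        throw new IllegalArgumentException("Resource not found: " + resource);
                    }
                    // Stream the bytes out of the jar onto the real file system
                    Files.copy(in, target.resolve(name), StandardCopyOption.REPLACE_EXISTING);
                }
            }
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Under these assumptions, the loader could obtain its folder with TessdataExtractor.extract(getClass().getClassLoader(), "tess_data", "chi_sim.traineddata").toString() and pass the result to setDatapath. The temp directory is not cleaned up here, so a long-running service may prefer a fixed, configurable location.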
Training data:
tessdata_best: high accuracy, slower.
tessdata: balanced speed and accuracy.
tessdata_fast: fast, lower accuracy.
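The choice of data set also constrains the engine mode. As far as the tesseract-ocr project documents it, tessdata_best and tessdata_fast contain only LSTM models, while tessdata also ships the legacy models that a combined legacy and LSTM mode needs. The enum below is a hypothetical way to record that in code; TrainingData and its methods are not part of Tess4j.

```java
/** Hypothetical enum describing the three official training data sets. */
enum TrainingData {
    BEST("tessdata_best", false),   // high accuracy, slower, LSTM only
    STANDARD("tessdata", true),     // balanced, legacy and LSTM models
    FAST("tessdata_fast", false);   // fast, lower accuracy, LSTM only

    private final String repository;
    private final boolean legacyEngine;

    TrainingData(String repository, boolean legacyEngine) {
        this.repository = repository;
        this.legacyEngine = legacyEngine;
    }

    /** Name of the tesseract-ocr repository that hosts this data set. */
    String repository() {
        return repository;
    }

    /** True if the set also ships legacy (non-LSTM) models. */
    boolean supportsLegacyEngine() {
        return legacyEngine;
    }
}
```

A loader could check supportsLegacyEngine() before requesting a combined engine mode, and fall back to an LSTM-only mode otherwise.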
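import lombok.Getter;
import lombok.extern.slf4j.Slf4j;
import net.sourceforge.tess4j.ITessAPI;
import net.sourceforge.tess4j.Tesseract;
import org.springframework.core.io.ClassPathResource;
import org.springframework.stereotype.Component;

/**
 * TesseractOcr model loader
 * @author : YiFei
 */
@Slf4j
@Getter
@Component
public class TesseractOcrModelService {
    private final Tesseract tesseract = new Tesseract();

    public TesseractOcrModelService() {
        try {
            // Get training data folder (may fail in jar, use TensorflowUtil instead)
            String folderPath = new ClassPathResource("tess_data").getFile().getAbsolutePath();
            tesseract.setDatapath(folderPath);
            // Engine mode: legacy Tesseract and LSTM combined
            tesseract.setOcrEngineMode(ITessAPI.TessOcrEngineMode.OEM_TESSERACT_LSTM_COMBINED);
            // Page segmentation mode 6: assume a single uniform block of text
            tesseract.setPageSegMode(6);
            // Simplified Chinese; requires chi_sim.traineddata in the data folder
            tesseract.setLanguage("chi_sim");
        } catch (Exception e) {
            throw new RuntimeException("Failed to initialize Tesseract", e);
        }
    }
}
2.3 Write RESTful Interface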
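import javax.imageio.ImageIO;
import lombok.RequiredArgsConstructor;
import net.sourceforge.tess4j.Tesseract;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

/**
 * OCR Controller
 * @author : YiFei
 */
@RestController
@RequestMapping("ocr")
@RequiredArgsConstructor
public class OcrController {
    private final TesseractOcrModelService tesseractOcrModelService;

    // Result is presumably the project's own response wrapper; it is not defined in this article
    @PostMapping("/detection")
    public Result ocrDetection(MultipartFile file) {
        try {
            // Recommended image preprocessing: binarization, denoising, rotation correction
            // A shared Tesseract instance may not be safe under concurrent requests; consider serializing access
            Tesseract tesseract = tesseractOcrModelService.getTesseract();
            return Result.success(tesseract.doOCR(ImageIO.read(file.getInputStream())));
        } catch (Exception e) {
            // Keep the original exception as the cause so the real failure stays visible
            throw new RuntimeException("Failed to read or recognize the uploaded image", e);
        }
    }
}
4. Source Code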
https://gitee.com/fateyifei/yf
5. Conclusion
Tess4j works well for ID numbers, phone numbers, and English words, but its free training data yields poorer Chinese recognition. For higher quality you can:
Specialized training: train custom datasets for specific text types.
Third‑party APIs: use services like Google Cloud Vision, Microsoft Azure OCR, or Amazon Textract for better accuracy.
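Before investing in custom training or a paid API, the preprocessing recommended in the controller comment is worth trying. The sketch below shows binarization only, using a fixed global threshold. Binarizer, binarize, and the threshold value are assumptions for illustration, and unevenly lit images may need an adaptive threshold instead.

```java
import java.awt.image.BufferedImage;

/** Hypothetical preprocessing helper: converts an image to pure black and white. */
final class Binarizer {

    private Binarizer() {}

    /**
     * Returns a 1-bit copy of src in which every pixel whose luminance is
     * at or above threshold (0 to 255) becomes white and every other pixel
     * becomes black.
     */
    static BufferedImage binarize(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of perceived luminance
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, luma >= threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }
}
```

In the controller, the result of ImageIO.read(...) could be passed through Binarizer.binarize(image, 128) before calling doOCR. Whether this helps depends on the input, so compare recognition results with and without it.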
Additional use cases include document digitization, automatic data entry, license‑plate recognition, and handwriting recognition.