
Building a Mobile Paper‑Reading App with OpenCV OCR and Text‑to‑Speech

A middle‑aged Android developer recounts breaking his child's "Niu Ting Ting" device, then details how he recreated its functionality by integrating OpenCV‑based paper detection, OCR, and TTS into a mobile app, complete with code snippets and performance results.

Sohu Tech Products

A middle‑aged programmer shares a personal story about accidentally damaging his son’s "Niu Ting Ting" reading device and the costly replacement, which motivates him to build a software version for Android.

He leverages his previous work on Sohu News' AI framework, reusing an OCR module and a text‑to‑speech (TTS) component, and adds a preprocessing pipeline to detect and extract paper regions from photos before OCR.

The preprocessing steps include:

Automatic quadrilateral detection (findContours, approxPolyDP, contourArea, isContourConvex).

Perspective transformation to obtain a flat paper image.

Contrast enhancement and sharpening before OCR.
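
The article does not show how its contrast and sharpening step is implemented. As a rough illustration of what such a step can do, the sketch below stretches the gray range to 0..255 and applies a common 3x3 sharpening kernel to a plain int[][] grayscale image. The class GrayEnhanceSketch and its method names are hypothetical, and the code deliberately avoids OpenCV so that it stands alone; it is not the author's implementation.

```java
// Illustrative only: plain-Java stand-ins for a contrast and sharpening pass.
// Class and method names are hypothetical and do not come from the article.
// Both methods assume a non-empty, rectangular image with values in 0..255.
final class GrayEnhanceSketch {

    // Linearly stretch pixel values so the darkest maps to 0 and the brightest to 255.
    static int[][] stretchContrast(int[][] gray) {
        int h = gray.length, w = gray[0].length;
        int min = 255, max = 0;
        for (int[] row : gray) {
            for (int v : row) {
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
        }
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // A flat image has no range to stretch, so it is copied unchanged.
                out[y][x] = (max == min) ? gray[y][x] : (gray[y][x] - min) * 255 / (max - min);
            }
        }
        return out;
    }

    // Apply the kernel [0 -1 0; -1 5 -1; 0 -1 0]; border pixels are copied unchanged.
    static int[][] sharpen(int[][] gray) {
        int h = gray.length, w = gray[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                boolean border = y == 0 || x == 0 || y == h - 1 || x == w - 1;
                if (border) {
                    out[y][x] = gray[y][x];
                    continue;
                }
                int v = 5 * gray[y][x]
                        - gray[y - 1][x] - gray[y + 1][x]
                        - gray[y][x - 1] - gray[y][x + 1];
                out[y][x] = Math.max(0, Math.min(255, v)); // clamp to the valid range
            }
        }
        return out;
    }
}
```

A caller would typically run stretchContrast first and sharpen second. When OpenCV is available, Core.normalize with NORM_MINMAX and Imgproc.filter2D cover similar ground on Mat objects.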

The core implementation is shown below:

public static Bitmap getPaperBitmapWithDefaultRect(Context context, Uri srcUri, RectF defaultRect) { 
    // 1. Resize source bitmap for performance
    Bitmap recognizeBitmap = ImageUtils.getBitmapWithoutOrientation(context, srcUri, PAPER_RECOGNIZE_WIDTH);
    Mat recognizeMat = new Mat(recognizeBitmap.getHeight(), recognizeBitmap.getWidth(), CvType.CV_8UC3);
    try { Utils.bitmapToMat(recognizeBitmap, recognizeMat); } catch (Exception e) { return null; }
    if (recognizeMat.empty()) return null;
    // 2. Find the largest square (paper edge)
    MatOfPoint recognizeCorners = find_largest_square(find_squares(recognizeMat));
    // 3. Map paper edge back to original image and rotate if needed
    // ... (omitted for brevity) ...
    // Note: srcMat and srcCorners, used below, are not defined in the code shown here.
    // 4. Compute perspective transform
    MatOfPoint2f quad_pts = new MatOfPoint2f();
    int padding = PAPER_PADDING;
    Mat quad = Mat.zeros(PAPER_SIZE_HEIGHT, PAPER_SIZE_WIDTH, CvType.CV_8UC3);
    Size size = quad.size();
    quad_pts.push_back(new MatOfPoint2f(new Point(-padding, -padding)));
    quad_pts.push_back(new MatOfPoint2f(new Point(size.width + padding, -padding)));
    quad_pts.push_back(new MatOfPoint2f(new Point(size.width + padding, size.height + padding)));
    quad_pts.push_back(new MatOfPoint2f(new Point(-padding, size.height + padding)));
    srcCorners.convertTo(srcCorners, CvType.CV_32F);
    Mat transmtx = Imgproc.getPerspectiveTransform(srcCorners, quad_pts);
    // 5. Warp perspective
    Imgproc.warpPerspective(srcMat, quad, transmtx, quad.size());
    // 6. Enhance contrast and convert to bitmap
    quad = getGrayContrastMat(quad);
    Bitmap dstBitmap = Bitmap.createBitmap(quad.width(), quad.height(), Bitmap.Config.ARGB_8888);
    Utils.matToBitmap(quad, dstBitmap);
    return dstBitmap;
}
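
The method above relies on find_largest_square, which the article does not list, and on the source corners being in the same order as the destination points (top-left, top-right, bottom-right, bottom-left). The sketch below shows one plausible way to do both on plain {x, y} arrays: pick the candidate with the largest shoelace area, then order its corners with a sum/difference heuristic. QuadSketch and its methods are hypothetical names, and nothing here is taken from the author's code.

```java
import java.util.List;

// Illustrative only: hypothetical helpers, not the article's find_largest_square.
// Points are {x, y} pairs in image coordinates (y grows downward).
final class QuadSketch {

    // Shoelace formula: absolute area of a polygon whose vertices are listed in order.
    static double area(double[][] pts) {
        double sum = 0;
        for (int i = 0; i < pts.length; i++) {
            double[] a = pts[i];
            double[] b = pts[(i + 1) % pts.length];
            sum += a[0] * b[1] - b[0] * a[1];
        }
        return Math.abs(sum) / 2.0;
    }

    // Returns the candidate with the largest area, or null if the list is empty.
    static double[][] largest(List<double[][]> quads) {
        double[][] best = null;
        double bestArea = -1;
        for (double[][] q : quads) {
            double a = area(q);
            if (a > bestArea) {
                bestArea = a;
                best = q;
            }
        }
        return best;
    }

    // Orders four corners as top-left, top-right, bottom-right, bottom-left.
    // Heuristic: TL has the smallest x + y and BR the largest;
    // TR has the largest x - y and BL the smallest.
    static double[][] orderCorners(double[][] pts) {
        double[] tl = pts[0], tr = pts[0], br = pts[0], bl = pts[0];
        for (double[] p : pts) {
            if (p[0] + p[1] < tl[0] + tl[1]) tl = p;
            if (p[0] + p[1] > br[0] + br[1]) br = p;
            if (p[0] - p[1] > tr[0] - tr[1]) tr = p;
            if (p[0] - p[1] < bl[0] - bl[1]) bl = p;
        }
        return new double[][] {tl, tr, br, bl};
    }
}
```

The ordering heuristic works for pages photographed roughly upright; a sheet rotated close to 45 degrees can defeat it, so a production version would need a more careful rule.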

The helper method that finds squares in an image is also provided:

public static List<MatOfPoint> find_squares(Mat image) {
    List<MatOfPoint> contours = new ArrayList<>();
    List<MatOfPoint> squares = new ArrayList<>();
    Mat blurred = new Mat(image.height(), image.width(), CvType.CV_8UC3);
    Imgproc.GaussianBlur(image, blurred, new Size(11, 11), 0);
    ArrayList<Mat> grayList = new ArrayList<>();
    Core.split(blurred, grayList);
    Mat gray0 = new Mat(blurred.size(), CvType.CV_8U);
    Mat gray = new Mat(image.height(), image.width(), CvType.CV_8U);
    for (int a = 0; a < grayList.size(); a++) {
        gray0 = grayList.get(a);
        int threshold_level = 2;
        for (int level = 0; level < threshold_level; level++) {
            Imgproc.Canny(gray0, gray, 10 * (level + 1), 10 * (level + 1));
            Imgproc.dilate(gray, gray, new Mat(), new Point(-1, -1), 1);
            Mat hierarchy = new Mat();
            // Start each pass with an empty list so earlier contours are not reprocessed
            contours.clear();
            Imgproc.findContours(gray, contours, hierarchy, Imgproc.RETR_LIST, Imgproc.CHAIN_APPROX_SIMPLE);
            MatOfPoint2f approx = new MatOfPoint2f();
            for (MatOfPoint contoursPoint : contours) {
                MatOfPoint2f contourPoint2f = new MatOfPoint2f();
                contoursPoint.convertTo(contourPoint2f, CvType.CV_32F);
                Imgproc.approxPolyDP(contourPoint2f, approx, Imgproc.arcLength(contourPoint2f, true) * 0.02, true);
                if (approx.total() == 4) {
                    Point[] approxArray = approx.toArray();
                    MatOfPoint approxPoint = new MatOfPoint(approxArray);
                    if (Math.abs(Imgproc.contourArea(approx)) > 1000 && Imgproc.isContourConvex(approxPoint)) {
                        double maxCosine = 0;
                        for (int j = 2; j < 5; j++) {
                            double cosine = Math.abs(angle(approxArray[j % 4], approxArray[j - 2], approxArray[j - 1]));
                            maxCosine = Math.max(maxCosine, cosine);
                        }
                        if (maxCosine < 0.3) squares.add(approxPoint);
                    }
                }
            }
        }
    }
    return squares;
}
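
find_squares calls an angle helper that is not listed in the article. A common formulation, used here as an assumption, returns the cosine of the angle at a vertex between the rays to its two neighbours; the sketch below reproduces that calculation and the same j = 2..4 loop on plain {x, y} arrays. AngleSketch is a hypothetical name, and the author's helper may differ.

```java
// Illustrative only: a hypothetical version of the angle helper used by find_squares.
final class AngleSketch {

    // Cosine of the angle at vertex p0 between the rays p0->p1 and p0->p2.
    // The small constant keeps the division safe when two points coincide.
    static double angle(double[] p1, double[] p2, double[] p0) {
        double dx1 = p1[0] - p0[0], dy1 = p1[1] - p0[1];
        double dx2 = p2[0] - p0[0], dy2 = p2[1] - p0[1];
        double dot = dx1 * dx2 + dy1 * dy2;
        double norm = Math.sqrt((dx1 * dx1 + dy1 * dy1) * (dx2 * dx2 + dy2 * dy2) + 1e-10);
        return dot / norm;
    }

    // Largest |cosine| over the corners at indices 1, 2 and 3, mirroring the loop in find_squares.
    static double maxCosine(double[][] quad) {
        double max = 0;
        for (int j = 2; j < 5; j++) {
            double cosine = Math.abs(angle(quad[j % 4], quad[j - 2], quad[j - 1]));
            max = Math.max(max, cosine);
        }
        return max;
    }
}
```

Under this formulation, the maxCosine < 0.3 test in find_squares accepts quadrilaterals whose checked corners lie between roughly 72.5 and 107.5 degrees, which tolerates moderate perspective skew while rejecting strongly slanted shapes.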

After integrating these steps, the author reports 100% recognition accuracy on printed text and near‑perfect results on handwritten samples, and demonstrates a live camera‑frame version that automatically captures, processes, and reads detected pages.

Despite the successful prototype, the author ultimately purchases a new "Niu Ting Ting" for his son, concluding the narrative with a light‑hearted call for readers to submit articles to the Sohu tech channel.

Mobile Development, Android, image processing, OCR, opencv, text-to-speech
Written by

Sohu Tech Products

A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.
