Implementing an Automatic Math Expression Grading System with Python and Convolutional Neural Networks
This tutorial walks through building a self‑trained OCR pipeline that generates synthetic digit images, trains a CNN model, segments handwritten math expressions, predicts each character, evaluates the arithmetic result, and overlays checkmarks, crosses or answers onto the original image.
The article describes how to create an automatic grading tool for handwritten arithmetic problems using Python. It starts by explaining the two essential tasks: recognizing digits and segmenting them from an image.
Data Generation
Instead of using the MNIST dataset, the author generates custom images by drawing characters with various fonts, sizes and rotation angles. A small script creates 24×24 pixel PNGs for each of the 15 symbols (0‑9, =, +, -, ×, ÷) across 13 fonts and 20 rotation angles, yielding 260 images per class and roughly 3,900 images in total.
<code>from __future__ import print_function
from PIL import Image, ImageFont, ImageDraw
import os, shutil, time

# label dictionary: class index -> character
label_dict = {0:'0', 1:'1', 2:'2', 3:'3', 4:'4', 5:'5', 6:'6', 7:'7', 8:'8', 9:'9', 10:'=', 11:'+', 12:'-', 13:'×', 14:'÷'}

# generate images for every font and rotation angle
# (makeImage, defined in the full article, renders one sample per class)
for font_name in os.listdir('./fonts'):
    font_path = os.path.join('./fonts', font_name)
    for angle in range(-10, 10):
        makeImage(label_dict, font_path, rotate=angle)
</code>
Model Construction
A simple CNN is built with TensorFlow/Keras: input rescaling, two Conv2D‑MaxPooling blocks, flattening, a dense layer of 128 units, and a final dense layer with 15 outputs. The model is compiled with Adam optimizer and sparse categorical cross‑entropy loss.
<code>import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

def create_model():
    model = Sequential([
        layers.experimental.preprocessing.Rescaling(1./255, input_shape=(24, 24, 1)),
        layers.Conv2D(24, 3, activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, 3, activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dense(15)  # logits for the 15 classes
    ])
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    return model
</code>
Training
The generated dataset is loaded with image_dataset_from_directory, cached, shuffled, and prefetched. The model is trained for 10 epochs, reaching near‑100% accuracy. Weights are saved to checkpoint/char_checkpoint.
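A plausible loading pipeline (the article does not show this step; the directory name and batch size are assumptions), shown here with a tiny synthetic stand-in dataset so the sketch runs end-to-end:

```python
import os
import numpy as np
import tensorflow as tf
from PIL import Image

# build a tiny stand-in dataset: two classes, four images each
for cls in ('0', '1'):
    os.makedirs(f'./dataset/{cls}', exist_ok=True)
    for i in range(4):
        pixel = 0 if cls == '0' else 255
        Image.fromarray(np.full((24, 24), pixel, np.uint8)).save(f'./dataset/{cls}/{i}.png')

# load, then cache/shuffle/prefetch as the article describes
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    './dataset', color_mode='grayscale', image_size=(24, 24), batch_size=32)
class_names = train_ds.class_names  # capture before transforming: the
                                    # cached/prefetched dataset loses this attribute
train_ds = train_ds.cache().shuffle(1000).prefetch(tf.data.AUTOTUNE)
```

With the real generated dataset, class_names would hold the 15 symbol folders and train_ds would feed model.fit directly.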
<code>model = create_model()
model.fit(train_ds, epochs=10)
model.save_weights('checkpoint/char_checkpoint')
</code>
Prediction
Two sample images (e.g., a 6 and an 8) are read with OpenCV, converted to grayscale, resized to 24×24, and fed to the trained model. Because the network outputs logits, np.argmax can be applied to them directly (softmax would not change the ranking) to yield the predicted character.
<code>model = create_model()
model.load_weights('checkpoint/char_checkpoint')

imgs = np.array([img1, img2]).reshape(-1, 24, 24, 1)  # grayscale batch with channel axis
predicts = model.predict(imgs)
results = [class_name[np.argmax(p)] for p in predicts]
print(results)
</code>
Image Segmentation
To handle full‑page worksheets, the author uses projection profiles. Vertical (Y‑axis) projection identifies rows, while horizontal (X‑axis) projection after dilation isolates individual characters. Functions img_y_shadow, img2rows, img_x_shadow, and block2chars perform these calculations, returning bounding boxes for each character.
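A projection profile becomes bounding coordinates by scanning for runs of non-zero counts. A minimal sketch of that step (shadow_to_spans is a hypothetical helper, not one of the article's functions):

```python
def shadow_to_spans(shadow, min_run=2):
    """Convert a projection profile (pixel counts per row or column)
    into (start, end) index spans where the counts are non-zero.
    Runs shorter than min_run are treated as noise and dropped."""
    spans, start = [], None
    for i, count in enumerate(shadow):
        if count > 0 and start is None:
            start = i                      # a run of content begins
        elif count == 0 and start is not None:
            if i - start >= min_run:
                spans.append((start, i))   # a run of content ends
            start = None
    if start is not None and len(shadow) - start >= min_run:
        spans.append((start, len(shadow)))  # run extends to the edge
    return spans

# Example: a profile with two text rows
print(shadow_to_spans([0, 0, 3, 5, 4, 0, 0, 2, 6, 1, 0]))  # → [(2, 5), (7, 10)]
```

Applied to a Y-axis profile the spans are row boundaries; applied to an X-axis profile within a row, they are character boundaries.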
<code># Example: compute the Y-axis projection (count of white pixels per row)
def img_y_shadow(img_b):
    h, w = img_b.shape
    a = [0 for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if img_b[i, j] == 255:
                a[i] += 1
    return a
</code>
Recognition of Segmented Characters
Each cropped character image is resized to 24×24, stacked into a batch, and passed through the CNN. The predictions are collected per block, forming the original arithmetic expression.
Evaluation and Feedback
The recognized string is evaluated with Python's eval (after replacing × and ÷ with * and /). The result is compared to the provided answer; a checkmark (green), cross (red) or placeholder (gray) is drawn onto the original image using Pillow.
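The grading logic described above can be sketched like this (grade_expression is a hypothetical name; the tolerance and return labels are assumptions):

```python
def grade_expression(expr):
    """Grade a recognized string like '6+2=8'.
    Returns 'right', 'wrong', or 'none' (blank or unreadable answer)."""
    expr = expr.replace('×', '*').replace('÷', '/')
    left, _, answer = expr.partition('=')
    if answer == '':
        return 'none'                      # the pupil left the answer blank
    try:
        # eval is acceptable here: the classifier only emits digits and operators
        return 'right' if abs(eval(left) - float(answer)) < 1e-9 else 'wrong'
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 'none'

print(grade_expression('6+2=8'))    # → right
print(grade_expression('6×2=11'))   # → wrong
print(grade_expression('6+2='))     # → none
```

The three return values map directly onto the green check, red cross, and gray placeholder drawn in the next step.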
<code># Draw text on an OpenCV image via Pillow (needed for non-ASCII glyphs like √ and ✗)
import cv2
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def cv2ImgAddText(img, text, left, top, textColor=(255, 0, 0), textSize=20):
    if isinstance(img, np.ndarray):
        img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype('fonts/fangzheng_shusong.ttf', textSize)
    draw.text((left, top), text, textColor, font=font)
    return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)
</code>
The final output image shows each expression annotated with a green check for correct answers, a red cross for wrong ones, and a gray placeholder where the answer is missing.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.