Artificial Intelligence · 7 min read

Build a Handwritten Digit Classifier with TensorFlow Softmax Regression (MNIST Tutorial)

This tutorial walks through using TensorFlow to implement a Softmax Regression model that classifies MNIST handwritten digit images, covering dataset basics, model formulation, loss definition, training steps, and a complete Python code example.

360 Zhihui Cloud Developer

Introduction to MNIST

Learning a programming language usually starts with a "Hello World" example; for TensorFlow, the equivalent is the MNIST dataset, a classic computer-vision benchmark of 70,000 grayscale 28×28 images of handwritten digits (0-9), split into 55,000 training, 10,000 test, and 5,000 validation samples.

Four example images (labels 5, 0, 4, 1) are shown below.

Each pixel is converted to a grayscale value in the range [0,1]; the image of digit 1 is displayed as a 28×28 matrix.
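As a quick illustration of that normalization (a sketch assuming the raw pixels are stored as 0-255 integer intensities, which is how MNIST ships):

```python
# Scale raw 0-255 pixel intensities into grayscale values in [0, 1].
raw_pixels = [0, 51, 102, 255]  # hypothetical sample of raw intensities
normalized = [p / 255.0 for p in raw_pixels]
print(normalized)  # [0.0, 0.2, 0.4, 1.0]
```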

The training labels are one‑hot vectors of length 10, e.g., the digit 0 corresponds to [1,0,0,0,0,0,0,0,0,0].
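A minimal sketch of this encoding in plain Python (the helper name `one_hot` is illustrative, not a TensorFlow function):

```python
def one_hot(digit, num_classes=10):
    """Return a length-10 one-hot vector with a 1 at the digit's index."""
    vec = [0] * num_classes
    vec[digit] = 1
    return vec

print(one_hot(0))  # [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(one_hot(3))  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```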

Softmax Regression Model

Softmax regression predicts a probability distribution over the ten digit classes for a given image. The model computes a weighted sum of the pixel values, adds a per-class bias, and passes the resulting scores through the softmax function to obtain class probabilities.

The mathematical formulation of the evidence for class i is:

evidence_i = Σ_j W_{i,j} · x_j + b_i

where x_j is the j-th pixel value, W_{i,j} is the weight connecting pixel j to class i, and b_i is the bias for class i. Applying the softmax function converts these evidence scores into probabilities:

y = softmax(evidence), where softmax(evidence)_i = exp(evidence_i) / Σ_j exp(evidence_j)

The class with the highest probability is taken as the predicted digit.
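The softmax step above can be sketched in NumPy to show how evidence scores become probabilities (the score values here are hypothetical):

```python
import numpy as np

def softmax(scores):
    # Subtracting the max before exponentiating improves numerical stability
    # and leaves the result mathematically unchanged.
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

scores = np.array([2.0, 1.0, 0.1])  # hypothetical per-class evidence
probs = softmax(scores)
print(probs.sum())             # 1.0 -- a valid probability distribution
print(int(np.argmax(probs)))   # 0  (the class with the highest evidence)
```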

Implementation with TensorFlow

The model can be defined in a few lines of TensorFlow code (this uses the TensorFlow 1.x API; under TensorFlow 2, the same calls are available via tf.compat.v1 after disabling eager execution):

import tensorflow as tf

# Input placeholder: each 28×28 image is flattened into a 784-dimensional
# vector; None allows batches of any size.
x = tf.placeholder(tf.float32, [None, 784])
# Weights and biases for the ten classes, initialized to zeros.
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# The model: weighted sum plus bias, passed through softmax.
y = tf.nn.softmax(tf.matmul(x, W) + b)

Training the Model

We use cross‑entropy as the loss function:

H_{y′}(y) = −Σ_i y′_i · log(y_i)

where y is the predicted probability distribution and y′ is the true (one‑hot) distribution. The TensorFlow code for the loss:

# Placeholder for the one-hot ground-truth labels.
y_ = tf.placeholder(tf.float32, [None, 10])
# Cross-entropy summed over the batch. (In practice,
# tf.nn.softmax_cross_entropy_with_logits is numerically more stable than
# taking the log of the softmax output directly.)
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
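The cross-entropy formula can be checked by hand in NumPy (the label and predicted probabilities below are hypothetical):

```python
import numpy as np

y_true = np.array([0.0, 1.0, 0.0])  # one-hot label for class 1
y_pred = np.array([0.1, 0.7, 0.2])  # hypothetical predicted probabilities
cross_entropy = -np.sum(y_true * np.log(y_pred))
print(round(cross_entropy, 4))  # 0.3567  (= -log(0.7))
```

Only the probability assigned to the true class contributes: the one-hot vector zeroes out every other term.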

Optimization is performed with stochastic gradient descent:

# Each training step applies one gradient-descent update with learning rate 0.01.
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
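What minimize() does under the hood can be sketched on a one-dimensional toy loss (purely illustrative; TensorFlow derives the gradient automatically via backpropagation):

```python
# Hand-rolled gradient descent on the toy loss L(w) = (w - 3)^2,
# whose minimizer is w = 3.
w = 0.0
learning_rate = 0.01
for _ in range(1000):
    grad = 2 * (w - 3)        # dL/dw
    w -= learning_rate * grad  # the update rule minimize() applies
print(round(w, 3))  # 3.0 -- converges to the minimizer
```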

Finally, we create a session, initialize variables, and run the training loop:

from tensorflow.examples.tutorials.mnist import input_data

# Download/load MNIST with one-hot labels; this defines the `mnist`
# object used in the training loop below.
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

init = tf.global_variables_initializer()  # initialize_all_variables() is deprecated
sess = tf.Session()
sess.run(init)
for i in range(1000):
    # Each iteration trains on a random mini-batch of 100 images.
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
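The whole training loop can be mirrored in plain NumPy on a small synthetic stand-in for MNIST; the closed-form gradient (y − y′)/n of softmax-plus-cross-entropy replaces TensorFlow's automatic differentiation (dimensions and hyperparameters below are illustrative, not the tutorial's):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 300, 20, 3                    # samples, features, classes (toy sizes)
true_W = rng.normal(size=(d, k))        # hidden linear rule generating labels
X = rng.normal(size=(n, d))
labels = np.argmax(X @ true_W, axis=1)
Y = np.eye(k)[labels]                   # one-hot targets

W = np.zeros((d, k))                    # parameters start at zero, as in the tutorial
b = np.zeros(k)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # stabilized row-wise softmax
    return e / e.sum(axis=1, keepdims=True)

for _ in range(500):
    probs = softmax(X @ W + b)          # forward pass
    grad_logits = (probs - Y) / n       # gradient of mean cross-entropy w.r.t. logits
    W -= 0.5 * (X.T @ grad_logits)      # gradient-descent updates
    b -= 0.5 * grad_logits.sum(axis=0)

accuracy = (np.argmax(X @ W + b, axis=1) == labels).mean()
print(f"training accuracy: {accuracy:.2f}")  # well above the 33% chance level
```

Because the labels are generated by a linear rule, this toy problem is linearly separable and the loop fits it quickly; real MNIST behaves similarly for softmax regression, reaching roughly 92% test accuracy.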

Conclusion

This article demonstrated how to build a Softmax Regression classifier for the MNIST dataset using TensorFlow, providing a simple entry point for further exploration of more advanced TensorFlow models.

Machine Learning · Python · TensorFlow · MNIST · Softmax Regression
Written by

360 Zhihui Cloud Developer

360 Zhihui Cloud is an enterprise open service platform that aims to "aggregate data value and empower an intelligent future," leveraging 360's extensive product and technology resources to deliver platform services to customers.
