Artificial Intelligence · 5 min read

Master CNN Basics: Build, Train, and Evaluate a Convolutional Neural Network

This article introduces the fundamentals of convolutional neural networks (CNNs), explains key layers such as convolution, pooling, and fully connected layers, and walks through a step-by-step Python implementation in Keras: loading data, then constructing, compiling, training, and evaluating a CNN on the digits dataset.


Convolutional Neural Network Basics

Basic CNN: A CNN is similar to a multilayer perceptron (MLP) in that both are feed-forward neural networks, but a CNN adds distinct layer types: convolutional layers, pooling layers, and fully connected layers.

Create Model

Convolutional Layer

Image data is generally processed with 2-D convolutions. The key hyperparameters are:

Filter size (kernel_size) defines the width and height of the receptive field.

Number of filters (filters) determines the depth of the output feature map, and therefore of the next layer.

Stride (strides) is the distance the filter moves at each step.

Padding (padding) adds a border of zeros so the feature map does not shrink too quickly.
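How these hyperparameters interact can be seen in the standard output-size formula for a convolution; a minimal sketch (the helper name is hypothetical, not a Keras function):

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution: floor((W - K + 2P) / S) + 1."""
    return (input_size - kernel_size + 2 * padding) // stride + 1

# an 8x8 digits image with a 3x3 kernel, stride 1, no padding ('valid') -> 6x6
print(conv_output_size(8, 3))            # 6
# padding of (K - 1) // 2 with stride 1 keeps the size unchanged ('same')
print(conv_output_size(8, 3, padding=1)) # 8
```

This is why the model below produces 6x6 feature maps from the 8x8 digit images.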

Activation Layer

Same as the activation layers in an MLP; ReLU is the usual choice after a convolution, as in the code below.

Pooling Layer

Downsamples the feature maps, reducing the number of parameters and the amount of computation in subsequent layers.
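The most common variant is max pooling, which keeps only the largest value in each window. A minimal NumPy sketch of 2x2 max pooling with stride 2 (a plain illustration, not the Keras implementation):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on an (H, W) feature map (H, W even)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 5],
                 [0, 1, 3, 2],
                 [2, 6, 1, 4]])
# each 2x2 block is reduced to its maximum:
print(max_pool_2x2(fmap))
# [[4 5]
#  [6 4]]
```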

Fully Connected Layer

The flattened output of the convolution and pooling layers feeds into a fully connected layer, which produces the final classification.

Model Compilation and Training

Load data

<code>import numpy as np
import matplotlib.pyplot as plt

from sklearn import datasets
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical    # keras.utils.np_utils in older Keras versions

data = datasets.load_digits()

plt.imshow(data.images[0])    # show the first digit in the dataset
plt.show()
print('label: ', data.target[0])    # label = 0
</code>

Data splitting

<code>X_data = data.images
y_data = data.target
# add a single channel dimension: (n_samples, 8, 8) -> (n_samples, 8, 8, 1)
X_data = X_data.reshape((X_data.shape[0], X_data.shape[1], X_data.shape[2], 1))
# one-hot encoding of y_data
y_data = to_categorical(y_data)
# partition data into train/test sets
X_train, X_test, y_train, y_test = train_test_split(X_data, y_data, test_size = 0.3, random_state = 777)
</code>
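to_categorical simply turns integer class labels into one-hot vectors; the same transform in plain NumPy, for intuition:

```python
import numpy as np

labels = np.array([0, 2, 1])
# row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(labels.max() + 1)[labels]
print(one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```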

Compile model

<code>from keras.models import Sequential
from keras import optimizers
from keras.layers import Dense, Activation, Flatten, Conv2D, MaxPooling2D

model = Sequential()
# convolution layer: 10 filters of size 3x3, stride 1, no padding
model.add(Conv2D(input_shape = (X_data.shape[1], X_data.shape[2], X_data.shape[3]), filters = 10, kernel_size = (3,3), strides = (1,1), padding = 'valid'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Flatten())
# dense layer with 50 neurons
model.add(Dense(50, activation = 'relu'))
# final layer with 10 neurons, one per digit class
model.add(Dense(10, activation = 'softmax'))
adam = optimizers.Adam(learning_rate = 0.001)    # the argument was named 'lr' in older Keras versions
model.compile(loss = 'categorical_crossentropy', optimizer = adam, metrics = ['accuracy'])
history = model.fit(X_train, y_train, batch_size = 50, validation_split = 0.2, epochs = 100, verbose = 0)
</code>
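As a sanity check on the architecture above, the trainable parameter counts can be worked out by hand for the 8x8x1 digits input; these totals should match what model.summary() reports:

```python
# Conv2D: 10 filters of 3x3x1 weights plus one bias each
conv_params = (3 * 3 * 1 + 1) * 10      # 100
# 'valid' conv on 8x8 -> 6x6; 2x2 pooling -> 3x3; flatten -> 3*3*10 units
flat_units = 3 * 3 * 10                 # 90
dense1_params = flat_units * 50 + 50    # 4550
dense2_params = 50 * 10 + 10            # 510
total = conv_params + dense1_params + dense2_params
print(total)  # 5160
```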

Training results

<code>plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.legend(['training', 'validation'], loc = 'upper left')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.show()
</code>

Model evaluation

<code>results = model.evaluate(X_test, y_test)
print('Test accuracy: ', results[1])
</code>
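The accuracy that model.evaluate reports amounts to comparing the argmax of each softmax output row with the true label. A minimal sketch with made-up probabilities (the values below are illustrative, not model output):

```python
import numpy as np

# toy softmax outputs for 3 test samples over 10 digit classes
probs = np.array([[0.05] * 9 + [0.55],
                  [0.9] + [0.1 / 9] * 9,
                  [0.02] * 5 + [0.9] + [0.02] * 4])
preds = probs.argmax(axis=1)        # predicted class per sample
true = np.array([9, 0, 4])          # ground-truth labels
accuracy = (preds == true).mean()   # fraction of correct predictions
print(preds, accuracy)
```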
Tags: CNN, machine learning, Python, deep learning, Keras
Written by Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
