One of the most interesting tasks in deep learning is recognizing objects in natural settings. The ability to process visual information with machine learning algorithms is useful across many computer vision applications.
The SVHN dataset contains over 600,000 labeled digits cropped from street-level photos and is one of the most popular image recognition datasets. Google has used it in neural networks that improve map quality by automatically transcribing address numbers from patches of pixels; a transcribed number combined with a known street address helps pinpoint the location of the building it represents.
The objective is to predict the digit shown in each image using artificial (fully connected feed-forward) neural networks and convolutional neural networks. We will work through several models of each type and select the one that gives the best performance.
Here, we will use a subset of the original data to save some computation time. The dataset is provided as a .h5 file, and the basic preprocessing steps have already been applied to it.
Let us start by mounting Google Drive. The cell below is commented out; uncomment and run it if you are working in Google Colab.
# from google.colab import drive
# drive.mount('/content/drive/')
import numpy as np
import pandas as pd
# plotting library imports
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import tensorflow as tf
from tensorflow.keras import regularizers
# Keras Sequential Model
from tensorflow.keras.models import Sequential
# Importing different layers and optimizers
from tensorflow.keras.layers import Activation, BatchNormalization, Conv2D, Dense, Dropout, ELU, Flatten, LeakyReLU, MaxPooling2D, ReLU
from tensorflow.keras.losses import categorical_crossentropy
# tf.keras.optimizers.legacy.Adam is also available; the models below use the TF1-style
# tf.compat.v1.train.AdamOptimizer instead
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical
# The below code can be used to ignore the warnings that may occur due to deprecations
import warnings
warnings.filterwarnings("ignore")
Let us check the version of TensorFlow.
print(tf.__version__)
2.13.0-rc1
import h5py
# Open the file as read only in Data directory
h5f = h5py.File('Data/SVHN_single_grey1.h5', 'r')
# Create the train and test datasets
X_train = h5f['X_train'][:]
y_train = h5f['y_train'][:]
X_test = h5f['X_test'][:]
y_test = h5f['y_test'][:]
# Close file
h5f.close()
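An equivalent way to load the data, as a stylistic alternative not used in the original notebook, is a context manager so the file is closed automatically even if an error occurs:
# Equivalent load using a with-statement (the file path matches the one used above)
with h5py.File('Data/SVHN_single_grey1.h5', 'r') as h5f:
    X_train = h5f['X_train'][:]
    y_train = h5f['y_train'][:]
    X_test = h5f['X_test'][:]
    y_test = h5f['y_test'][:]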
Check the number of images in the training and the testing dataset.
len(X_train), len(X_test)
(42000, 18000)
Observation: The training set contains 42,000 images and the test set contains 18,000 images.
# Visualizing the first 10 images in the dataset with target labels
plt.figure(figsize=(10, 1))
for i in range(10):
    # subplot parameters: subplot(nrows, ncols, index, **kwargs)
    plt.subplot(1, 10, i + 1)
    # cmap='gray' renders the single-channel pixel intensities (0-255) as grayscale
    plt.imshow(X_train[i], cmap="gray")
    plt.axis('off')
plt.show()
print('Target label for each of the above images: %s' % (y_train[0:10]))
Target label for each of the above images: [2 6 7 4 4 0 3 0 7 3]
print('X_train 1st Image Shape:', X_train[0].shape, '\n')
print('First Image Array: \n', X_train[0])
X_train 1st Image Shape: (32, 32)

First Image Array:
 [[ 33.0704  30.2601  26.852  ...  71.4471  58.2204  42.9939]
 [ 25.2283  25.5533  29.9765 ... 113.0209 103.3639  84.2949]
 [ 26.2775  22.6137  40.4763 ... 113.3028 121.775  115.4228]
 ...
 [ 28.5502  36.212   45.0801 ...  24.1359  25.0927  26.0603]
 [ 38.4352  26.4733  23.2717 ...  28.1094  29.4683  30.0661]
 [ 50.2984  26.0773  24.0389 ...  49.6682  50.853   53.0377]]
# Reshaping the 2D image datasets to 1D
X_train = X_train.reshape(X_train.shape[0], 1024)
X_test = X_test.reshape(X_test.shape[0], 1024)
# Normalization/Scaling - dividing each input by the range of image pixel values
X_train_normalized = X_train.astype('float32')/255.0
X_test_normalized = X_test.astype('float32')/255.0
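As a quick sanity check, not part of the original notebook, we can confirm that the scaled pixel values now lie in the [0, 1] range:
# Verify the scaled pixel value range
print(X_train_normalized.min(), X_train_normalized.max())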
Print the shapes of the training and test data.
# Train and Test Reshaping
print('Training set:', X_train.shape, y_train.shape)
print('Test set:', X_test.shape, y_test.shape)
Training set: (42000, 1024) (42000,)
Test set: (18000, 1024) (18000,)
The output layer for this classification problem will have 10 neurons, one per digit class, so each target label is transformed into a binary (one-hot) vector whose length equals the number of categories. This lets the labels be treated numerically without implying any ordering among the digits, and it matches the softmax output the network will produce.
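As a quick illustration (not part of the original notebook), to_categorical turns a label such as 6 into a length-10 vector with a 1 in position 6:
# Mini-example: one-hot encoding a few digit labels
from tensorflow.keras.utils import to_categorical
print(to_categorical([2, 6, 0], num_classes=10))
# [[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
#  [0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
#  [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]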
# one-hot encoded representation of target labels
y_train_encoded = to_categorical(y_train)
y_test_encoded = to_categorical(y_test)
y_test_encoded
array([[0., 1., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 1., 0., 0.],
[0., 0., 1., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 1., 0., 0.],
[0., 0., 0., ..., 0., 0., 1.],
[0., 0., 1., ..., 0., 0., 0.]], dtype=float32)
Observation: Each target label is now a one-hot encoded vector of length 10 rather than a single digit.
np.random.seed(23)
import random
random.seed(23)
tf.random.set_seed(23)
# 1st Sequential ANN Model Function
def nn_model_1():
    ann_model = Sequential()
    # First hidden layer with 64 neurons
    ann_model.add(Dense(64, activation='relu', input_shape=(1024,)))
    # Second hidden layer with 32 neurons
    ann_model.add(Dense(32, activation='relu'))
    # Output layer with 10 nodes
    ann_model.add(Dense(10, activation='softmax'))
    optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.001)
    # Compile the model
    ann_model.compile(optimizer,
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])
    return ann_model
# Call model build function
ann_model_1 = nn_model_1()
# Model summary
ann_model_1.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 64) 65600
dense_1 (Dense) (None, 32) 2080
dense_2 (Dense) (None, 10) 330
=================================================================
Total params: 68010 (265.66 KB)
Trainable params: 68010 (265.66 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
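As a quick sanity check (my own arithmetic, not part of the original notebook), the Param # column can be reproduced by hand: each Dense layer has (inputs × units) weights plus one bias per unit.
# Verifying the parameter counts shown in the summary above
print(1024 * 64 + 64)  # dense   -> 65600
print(64 * 32 + 32)    # dense_1 -> 2080
print(32 * 10 + 10)    # dense_2 -> 330
print(1024 * 64 + 64 + 64 * 32 + 32 + 32 * 10 + 10)  # total -> 68010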
history_1 = ann_model_1.fit(
X_train_normalized, y_train_encoded,
epochs=20,
validation_split=0.2,
shuffle=True,
batch_size=128,
verbose=1
)
Epoch  1/20 - loss: 2.2948 - accuracy: 0.1184 - val_loss: 2.2626 - val_accuracy: 0.1532
Epoch  2/20 - loss: 2.1622 - accuracy: 0.1978 - val_loss: 2.0119 - val_accuracy: 0.2925
Epoch  3/20 - loss: 1.8222 - accuracy: 0.3777 - val_loss: 1.6517 - val_accuracy: 0.4569
Epoch  4/20 - loss: 1.5627 - accuracy: 0.4831 - val_loss: 1.5045 - val_accuracy: 0.5019
Epoch  5/20 - loss: 1.4450 - accuracy: 0.5259 - val_loss: 1.4241 - val_accuracy: 0.5349
Epoch  6/20 - loss: 1.3627 - accuracy: 0.5605 - val_loss: 1.3172 - val_accuracy: 0.5802
Epoch  7/20 - loss: 1.3006 - accuracy: 0.5843 - val_loss: 1.2751 - val_accuracy: 0.5995
Epoch  8/20 - loss: 1.2576 - accuracy: 0.6033 - val_loss: 1.2540 - val_accuracy: 0.6044
Epoch  9/20 - loss: 1.2163 - accuracy: 0.6202 - val_loss: 1.2116 - val_accuracy: 0.6210
Epoch 10/20 - loss: 1.1828 - accuracy: 0.6309 - val_loss: 1.1658 - val_accuracy: 0.6400
Epoch 11/20 - loss: 1.1519 - accuracy: 0.6430 - val_loss: 1.1448 - val_accuracy: 0.6486
Epoch 12/20 - loss: 1.1321 - accuracy: 0.6489 - val_loss: 1.1422 - val_accuracy: 0.6385
Epoch 13/20 - loss: 1.1084 - accuracy: 0.6591 - val_loss: 1.1306 - val_accuracy: 0.6461
Epoch 14/20 - loss: 1.0916 - accuracy: 0.6657 - val_loss: 1.0914 - val_accuracy: 0.6655
Epoch 15/20 - loss: 1.0710 - accuracy: 0.6742 - val_loss: 1.1593 - val_accuracy: 0.6276
Epoch 16/20 - loss: 1.0600 - accuracy: 0.6785 - val_loss: 1.0607 - val_accuracy: 0.6756
Epoch 17/20 - loss: 1.0446 - accuracy: 0.6818 - val_loss: 1.0454 - val_accuracy: 0.6786
Epoch 18/20 - loss: 1.0346 - accuracy: 0.6862 - val_loss: 1.0320 - val_accuracy: 0.6848
Epoch 19/20 - loss: 1.0140 - accuracy: 0.6936 - val_loss: 1.0292 - val_accuracy: 0.6839
Epoch 20/20 - loss: 1.0102 - accuracy: 0.6943 - val_loss: 1.0255 - val_accuracy: 0.6817
plt.plot(history_1.history['accuracy'])
plt.plot(history_1.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
Observations: Training and validation accuracy rise steadily together and end around 0.69 and 0.68 respectively after 20 epochs, so the model is not overfitting yet, but overall accuracy is still modest. A loss-curve plot (sketched below) gives a complementary view.
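For completeness, the corresponding loss curves can be plotted from the same history object (a small addition, not part of the original notebook):
# Training and validation loss curves for ANN model 1
plt.plot(history_1.history['loss'])
plt.plot(history_1.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper right')
plt.show()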
Let's build one more model with higher complexity and see if we can improve the performance of the model.
First, we need to clear the previous model's history from the Keras backend. Also, let's fix the seed again after clearing the backend.
# Clearing the backend
from tensorflow.keras import backend
backend.clear_session()
np.random.seed(23)
import random
random.seed(23)
tf.random.set_seed(23)
# 2nd Sequential ANN Model Function
def nn_model_2():
    ann_model = Sequential()
    # First hidden layer with 256 neurons
    ann_model.add(Dense(256, activation='relu', input_shape=(1024,)))
    # Second hidden layer with 128 neurons
    ann_model.add(Dense(128, activation='relu'))
    # Dropout layer with rate = 0.2
    ann_model.add(Dropout(0.2))
    # Third hidden layer with 64 neurons
    ann_model.add(Dense(64, activation='relu'))
    # Fourth hidden layer with 64 neurons
    ann_model.add(Dense(64, activation='relu'))
    # Fifth hidden layer with 32 neurons
    ann_model.add(Dense(32, activation='relu'))
    # Output layer with 10 nodes
    ann_model.add(Dense(10, activation='softmax'))
    optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.0005)
    # Compile the model
    ann_model.compile(optimizer,
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])
    return ann_model
# Call the second model build function
ann_model_2 = nn_model_2()
# Model summary
ann_model_2.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 256) 262400
dense_1 (Dense) (None, 128) 32896
dropout (Dropout) (None, 128) 0
dense_2 (Dense) (None, 64) 8256
dense_3 (Dense) (None, 64) 4160
dense_4 (Dense) (None, 32) 2080
dense_5 (Dense) (None, 10) 330
=================================================================
Total params: 310122 (1.18 MB)
Trainable params: 310122 (1.18 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
history_2 = ann_model_2.fit(
X_train_normalized, y_train_encoded,
epochs=30,
validation_split=0.2,
batch_size=128,
verbose=1
)
Epoch  1/30 - loss: 2.3048 - accuracy: 0.1017 - val_loss: 2.3026 - val_accuracy: 0.0995
Epoch  2/30 - loss: 2.3015 - accuracy: 0.1067 - val_loss: 2.2755 - val_accuracy: 0.1271
Epoch  3/30 - loss: 2.1018 - accuracy: 0.1936 - val_loss: 1.9201 - val_accuracy: 0.2598
Epoch  4/30 - loss: 1.9036 - accuracy: 0.2707 - val_loss: 1.7728 - val_accuracy: 0.3487
Epoch  5/30 - loss: 1.7035 - accuracy: 0.3910 - val_loss: 1.5928 - val_accuracy: 0.4440
Epoch  6/30 - loss: 1.5033 - accuracy: 0.4826 - val_loss: 1.3605 - val_accuracy: 0.5532
Epoch  7/30 - loss: 1.3923 - accuracy: 0.5307 - val_loss: 1.3305 - val_accuracy: 0.5623
Epoch  8/30 - loss: 1.3288 - accuracy: 0.5584 - val_loss: 1.2655 - val_accuracy: 0.5782
Epoch  9/30 - loss: 1.2729 - accuracy: 0.5785 - val_loss: 1.1966 - val_accuracy: 0.6119
Epoch 10/30 - loss: 1.2294 - accuracy: 0.5935 - val_loss: 1.1612 - val_accuracy: 0.6173
Epoch 11/30 - loss: 1.1704 - accuracy: 0.6185 - val_loss: 1.1019 - val_accuracy: 0.6457
Epoch 12/30 - loss: 1.1394 - accuracy: 0.6297 - val_loss: 1.0631 - val_accuracy: 0.6648
Epoch 13/30 - loss: 1.0940 - accuracy: 0.6468 - val_loss: 1.0306 - val_accuracy: 0.6694
Epoch 14/30 - loss: 1.0655 - accuracy: 0.6579 - val_loss: 0.9967 - val_accuracy: 0.6801
Epoch 15/30 - loss: 1.0372 - accuracy: 0.6676 - val_loss: 0.9917 - val_accuracy: 0.6862
Epoch 16/30 - loss: 1.0077 - accuracy: 0.6797 - val_loss: 0.9539 - val_accuracy: 0.6961
Epoch 17/30 - loss: 0.9943 - accuracy: 0.6842 - val_loss: 0.9287 - val_accuracy: 0.7044
Epoch 18/30 - loss: 0.9682 - accuracy: 0.6932 - val_loss: 0.9122 - val_accuracy: 0.7108
Epoch 19/30 - loss: 0.9464 - accuracy: 0.6999 - val_loss: 0.9276 - val_accuracy: 0.7045
Epoch 20/30 - loss: 0.9334 - accuracy: 0.7054 - val_loss: 0.8678 - val_accuracy: 0.7286
Epoch 21/30 - loss: 0.9096 - accuracy: 0.7121 - val_loss: 0.8801 - val_accuracy: 0.7250
Epoch 22/30 - loss: 0.8969 - accuracy: 0.7169 - val_loss: 0.8454 - val_accuracy: 0.7370
Epoch 23/30 - loss: 0.8766 - accuracy: 0.7228 - val_loss: 0.9111 - val_accuracy: 0.7094
Epoch 24/30 - loss: 0.8827 - accuracy: 0.7205 - val_loss: 0.8701 - val_accuracy: 0.7208
Epoch 25/30 - loss: 0.8545 - accuracy: 0.7318 - val_loss: 0.8828 - val_accuracy: 0.7155
Epoch 26/30 - loss: 0.8510 - accuracy: 0.7328 - val_loss: 0.8179 - val_accuracy: 0.7408
Epoch 27/30 - loss: 0.8342 - accuracy: 0.7358 - val_loss: 0.8012 - val_accuracy: 0.7527
Epoch 28/30 - loss: 0.8288 - accuracy: 0.7385 - val_loss: 0.8188 - val_accuracy: 0.7443
Epoch 29/30 - loss: 0.8190 - accuracy: 0.7401 - val_loss: 0.8004 - val_accuracy: 0.7469
Epoch 30/30 - loss: 0.8176 - accuracy: 0.7401 - val_loss: 0.7857 - val_accuracy: 0.7529
ANN Model 2 Evaluation
plt.plot(history_2.history['accuracy'])
plt.plot(history_2.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
Observations: Training and validation accuracy again track each other closely and are still improving at epoch 30, reaching about 0.74 on the training split and 0.75 on the validation split, a clear improvement over the first ANN.
accuracy = ann_model_2.evaluate(X_test_normalized, y_test_encoded, verbose=2)
563/563 - 0s - loss: 0.7904 - accuracy: 0.7569 - 242ms/epoch - 430us/step
# Predictions as probabilities for each category
y_pred = ann_model_2.predict(X_test_normalized)
563/563 [==============================] - 0s 411us/step
Note: Earlier, we noticed that each entry of the target variable is a one-hot encoded vector but to print the classification report and confusion matrix, we must convert each entry of y_test to a single label.
# Converting each entry to single label from one-hot encoded vector
# Obtaining the categorical values from y_test_encoded and y_pred
y_pred_arg=np.argmax(y_pred,axis=1)
y_test_arg=np.argmax(y_test_encoded,axis=1)
from sklearn.metrics import classification_report
# Classification report for true and predicted values
print(classification_report(y_test_arg, y_pred_arg))
# Plotting the confusion matrix using tf.math.confusion_matrix() from TensorFlow
confusion_matrix = tf.math.confusion_matrix(y_test_arg, y_pred_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
confusion_matrix,
annot=True,
linewidths=.4,
fmt="d",
square=True,
ax=ax
)
precision recall f1-score support
0 0.73 0.80 0.77 1814
1 0.69 0.83 0.76 1828
2 0.85 0.73 0.79 1803
3 0.72 0.72 0.72 1719
4 0.79 0.82 0.80 1812
5 0.72 0.72 0.72 1768
6 0.78 0.75 0.76 1832
7 0.75 0.81 0.78 1808
8 0.78 0.68 0.73 1812
9 0.78 0.70 0.74 1804
accuracy 0.76 18000
macro avg 0.76 0.76 0.76 18000
weighted avg 0.76 0.76 0.76 18000
Final Observations:
Classification Report: ANN model 2 reaches an overall test accuracy of about 0.76, with per-class F1-scores between roughly 0.72 and 0.80, so performance is fairly balanced across the ten digits.
Confusion Matrix: the heaviest counts sit on the diagonal, consistent with the ~76% accuracy; a quick per-class recall check is sketched below.
Next, let's see whether convolutional neural networks, which can exploit the 2D structure of the images, do better. The data is reloaded below so it can be reshaped into the 4D format that CNNs expect.
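A quick supplementary check, not part of the original notebook: per-class recall can be read straight off the confusion matrix as the diagonal counts divided by the row totals, reusing y_test_arg and y_pred_arg from above.
# Per-class recall from the confusion matrix (diagonal / row sums)
cm = tf.math.confusion_matrix(y_test_arg, y_pred_arg).numpy()
per_class_recall = np.diag(cm) / cm.sum(axis=1)
for digit, recall in enumerate(per_class_recall):
    print(f"Digit {digit}: recall = {recall:.2f}")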
import h5py
# Open the file as read only
h5f = h5py.File('SVHN_single_grey1.h5', 'r')
X_train = h5f['X_train'][:]
y_train = h5f['y_train'][:]
X_test = h5f['X_test'][:]
y_test = h5f['y_test'][:]
h5f.close()
Check the number of images in the training and the testing dataset.
len(X_train), len(X_test)
(42000, 18000)
Observation: As before, there are 42,000 images in the training set and 18,000 in the test set.
# Printing the shape and pixel array for the first image in the training dataset
print('Training Array Shape:', X_train[0].shape)
print("First image pixel array:\n", X_train[0])
Training Array Shape: (32, 32)

First image pixel array:
 [[ 33.0704  30.2601  26.852  ...  71.4471  58.2204  42.9939]
 [ 25.2283  25.5533  29.9765 ... 113.0209 103.3639  84.2949]
 [ 26.2775  22.6137  40.4763 ... 113.3028 121.775  115.4228]
 ...
 [ 28.5502  36.212   45.0801 ...  24.1359  25.0927  26.0603]
 [ 38.4352  26.4733  23.2717 ...  28.1094  29.4683  30.0661]
 [ 50.2984  26.0773  24.0389 ...  49.6682  50.853   53.0377]]
Reshape the datasets so they can be passed to CNNs. Remember that a CNN expects a 4D input array of shape (samples, height, width, channels).
# Reshaping the 2D image datasets to 4D to pass them to CNNs
X_train = X_train.reshape(X_train.shape[0], 32, 32, 1)
X_test = X_test.reshape(X_test.shape[0], 32, 32, 1)
Normalize inputs from 0-255 to 0-1
# Normalize inputs from 0-255 to 0-1
X_train_normalized = X_train.astype('float32') / 255.0
X_test_normalized = X_test.astype('float32') / 255.0
Print the new shapes of the training and test sets.
print('Training set shape:', X_train_normalized.shape, y_train.shape)
print('Test set shape:', X_test_normalized.shape, y_test.shape)
Training set shape: (42000, 32, 32, 1) (42000,)
Test set shape: (18000, 32, 32, 1) (18000,)
# one-hot encoded representation of target labels
y_train_encoded = to_categorical(y_train)
y_test_encoded = to_categorical(y_test)
y_test_encoded
array([[0., 1., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 1., 0., 0.],
[0., 0., 1., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 1., 0., 0.],
[0., 0., 0., ..., 0., 0., 1.],
[0., 0., 1., ..., 0., 0., 0.]], dtype=float32)
Observation: Each target label is again a one-hot encoded vector rather than a single-digit label.
Now that the data preprocessing is done, let's build a CNN model. First, fix the seeds for the random number generators.
np.random.seed(23)
import random
random.seed(23)
tf.random.set_seed(23)
# First CNN Model Definition
def cnn_model_1():
    # Initializing a sequential model
    model = Sequential()
    # First convolutional layer with 16 filters and kernel size 3x3; padding 'same' keeps the output size equal to the input size
    model.add(Conv2D(16, (3, 3), padding="same", input_shape=(32, 32, 1), kernel_regularizer=regularizers.l2(0.01)))
    # Adding a LeakyReLU layer to the model
    model.add(LeakyReLU(0.1))
    # Second convolutional layer with 32 filters and kernel size 3x3, 'same' padding
    model.add(Conv2D(32, (3, 3), padding="same", kernel_regularizer=regularizers.l2(0.01)))
    # Adding another LeakyReLU layer to the model
    model.add(LeakyReLU(0.1))
    # Adding max pooling to reduce the size of the conv layer output
    model.add(MaxPooling2D(pool_size=(2, 2)))
    # Flattening the output of the convolutional layer after max pooling to make it ready for creating dense connections
    model.add(Flatten())
    # Adding a fully connected dense layer with 32 neurons
    model.add(Dense(32))
    # Adding another LeakyReLU layer to the model
    model.add(LeakyReLU(0.1))
    # Adding the output layer with 10 neurons and softmax activation since this is a multi-class classification problem
    model.add(Dense(10, activation='softmax'))
    # Using the Adam optimizer
    opt = tf.compat.v1.train.AdamOptimizer(learning_rate=0.001)
    # Compile model
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
cnn_model_1 = cnn_model_1()
cnn_model_1.summary()
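The summary output is not reproduced above; as a rough check (my own arithmetic, based on the architecture defined in cnn_model_1), the parameter counts work out as follows:
# Back-of-the-envelope parameter counts for cnn_model_1
print(3 * 3 * 1 * 16 + 16)     # conv2d: 160
print(3 * 3 * 16 * 32 + 32)    # conv2d_1: 4640
print(16 * 16 * 32 * 32 + 32)  # dense after flattening 16x16x32: 262176
print(32 * 10 + 10)            # output dense: 330
# Total: 160 + 4640 + 262176 + 330 = 267306 parameters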
cnn_history_1 = cnn_model_1.fit(
X_train_normalized, y_train_encoded,
validation_split=0.2,
batch_size=32,
verbose=1,
epochs=20
)
Epoch  1/20 - loss: 1.4272 - accuracy: 0.5555 - val_loss: 0.8335 - val_accuracy: 0.7926
Epoch  2/20 - loss: 0.7299 - accuracy: 0.8215 - val_loss: 0.7186 - val_accuracy: 0.8273
Epoch  3/20 - loss: 0.6256 - accuracy: 0.8476 - val_loss: 0.6266 - val_accuracy: 0.8527
Epoch  4/20 - loss: 0.5699 - accuracy: 0.8624 - val_loss: 0.5986 - val_accuracy: 0.8570
Epoch  5/20 - loss: 0.5300 - accuracy: 0.8715 - val_loss: 0.6018 - val_accuracy: 0.8563
Epoch  6/20 - loss: 0.5013 - accuracy: 0.8774 - val_loss: 0.5728 - val_accuracy: 0.8631
Epoch  7/20 - loss: 0.4761 - accuracy: 0.8867 - val_loss: 0.5656 - val_accuracy: 0.8639
Epoch  8/20 - loss: 0.4548 - accuracy: 0.8893 - val_loss: 0.5489 - val_accuracy: 0.8676
Epoch  9/20 - loss: 0.4410 - accuracy: 0.8935 - val_loss: 0.5332 - val_accuracy: 0.8749
Epoch 10/20 - loss: 0.4201 - accuracy: 0.9001 - val_loss: 0.5317 - val_accuracy: 0.8704
Epoch 11/20 - loss: 0.4118 - accuracy: 0.9012 - val_loss: 0.5470 - val_accuracy: 0.8696
Epoch 12/20 - loss: 0.3944 - accuracy: 0.9056 - val_loss: 0.5495 - val_accuracy: 0.8694
Epoch 13/20 - loss: 0.3876 - accuracy: 0.9066 - val_loss: 0.5607 - val_accuracy: 0.8693
Epoch 14/20 - loss: 0.3780 - accuracy: 0.9086 - val_loss: 0.5637 - val_accuracy: 0.8606
Epoch 15/20 - loss: 0.3612 - accuracy: 0.9140 - val_loss: 0.5414 - val_accuracy: 0.8723
Epoch 16/20 - loss: 0.3560 - accuracy: 0.9162 - val_loss: 0.5411 - val_accuracy: 0.8736
Epoch 17/20 - loss: 0.3414 - accuracy: 0.9182 - val_loss: 0.5534 - val_accuracy: 0.8720
Epoch 18/20 - loss: 0.3378 - accuracy: 0.9204 - val_loss: 0.5676 - val_accuracy: 0.8610
Epoch 19/20 - loss: 0.3291 - accuracy: 0.9224 - val_loss: 0.5334 - val_accuracy: 0.8764
Epoch 20/20 - loss: 0.3185 - accuracy: 0.9269 - val_loss: 0.5421 - val_accuracy: 0.8760
plt.plot(cnn_history_1.history['accuracy'])
plt.plot(cnn_history_1.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
Model Observations: The model is overfitting; training accuracy keeps climbing with each epoch (reaching about 0.93), while validation accuracy plateaus around 0.87 within the first several epochs. Further improvements may be possible with some tuning, for example early stopping (sketched below) or stronger regularization.
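One possible tuning step, not part of the original notebook, is an EarlyStopping callback so training halts once validation accuracy stops improving and the best weights are restored; the patience value here is just an illustrative choice.
# Early stopping callback (illustrative settings)
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(
    monitor='val_accuracy',    # watch validation accuracy
    patience=3,                # stop after 3 epochs without improvement
    restore_best_weights=True  # roll back to the best epoch's weights
)
# It would then be passed to fit, e.g.:
# cnn_model_1.fit(X_train_normalized, y_train_encoded, validation_split=0.2,
#                 batch_size=32, epochs=20, callbacks=[early_stop])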
accuracy = cnn_model_1.evaluate(X_test_normalized, y_test_encoded, verbose=2)
563/563 - 1s - loss: 0.5523 - accuracy: 0.8707 - 964ms/epoch - 2ms/step
Let's build another model and see if we can get a better model with generalized performance.
First, we need to clear the previous model's history from the Keras backend. Also, let's fix the seed again after clearing the backend.
# Clearing the backend
from tensorflow.keras import backend
backend.clear_session()
# Fixing seed for second CNN model
np.random.seed(23)
import random
random.seed(23)
tf.random.set_seed(23)
# Second CNN Model Definition
def cnn_model_2():
    # Initializing a sequential model
    model = Sequential()
    # First convolutional layer with 16 filters and kernel size 3x3, 'same' padding, input_shape = (32, 32, 1)
    model.add(Conv2D(16, (3, 3), padding="same", input_shape=(32, 32, 1)))
    # Adding a LeakyReLU layer to the model
    model.add(LeakyReLU(0.1))
    # Second convolutional layer with 32 filters and kernel size 3x3, 'same' padding
    model.add(Conv2D(32, (3, 3), padding="same"))
    model.add(LeakyReLU(0.1))
    # Adding max pooling to reduce the size of the convolutional layer output, followed by batch normalization
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(BatchNormalization())
    # Third convolutional layer with 32 filters and kernel size 3x3, 'same' padding
    model.add(Conv2D(32, (3, 3), padding="same"))
    model.add(LeakyReLU(0.1))
    # Fourth convolutional layer with 64 filters and kernel size 3x3, 'same' padding
    model.add(Conv2D(64, (3, 3), padding="same"))
    model.add(LeakyReLU(0.1))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(BatchNormalization())
    # Flattening the output of the convolutional layer after max pooling to make it ready for creating dense connections
    model.add(Flatten())
    # Fully connected dense layer with 32 neurons
    model.add(Dense(32))
    # Note: ReLU's first positional argument is max_value, so ReLU(0.1) caps every
    # activation at 0.1; it is kept as-is here to match the training results below
    model.add(ReLU(0.1))
    # Adding dropout layer
    model.add(Dropout(0.5))
    # Adding the output layer with 10 neurons and softmax activation since this is a multi-class classification problem
    model.add(Dense(10, activation='softmax'))
    # Using the Adam optimizer
    opt = tf.compat.v1.train.AdamOptimizer(learning_rate=0.001)
    # opt = SGD(learning_rate=0.01, momentum=0.9)
    # Compile model
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
cnn_model_2 = cnn_model_2()
cnn_model_2.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                                Output Shape         Param #
=================================================================
 conv2d (Conv2D)                             (None, 32, 32, 16)   160
 leaky_re_lu (LeakyReLU)                     (None, 32, 32, 16)   0
 conv2d_1 (Conv2D)                           (None, 32, 32, 32)   4640
 leaky_re_lu_1 (LeakyReLU)                   (None, 32, 32, 32)   0
 max_pooling2d (MaxPooling2D)                (None, 16, 16, 32)   0
 batch_normalization (BatchNormalization)    (None, 16, 16, 32)   128
 conv2d_2 (Conv2D)                           (None, 16, 16, 32)   9248
 leaky_re_lu_2 (LeakyReLU)                   (None, 16, 16, 32)   0
 conv2d_3 (Conv2D)                           (None, 16, 16, 64)   18496
 leaky_re_lu_3 (LeakyReLU)                   (None, 16, 16, 64)   0
 max_pooling2d_1 (MaxPooling2D)              (None, 8, 8, 64)     0
 batch_normalization_1 (BatchNormalization)  (None, 8, 8, 64)     256
 flatten (Flatten)                           (None, 4096)         0
 dense (Dense)                               (None, 32)           131104
 re_lu (ReLU)                                (None, 32)           0
 dropout (Dropout)                           (None, 32)           0
 dense_1 (Dense)                             (None, 10)           330
=================================================================
Total params: 164362 (642.04 KB)
Trainable params: 164170 (641.29 KB)
Non-trainable params: 192 (768.00 Byte)
_________________________________________________________________
cnn_history_2 = cnn_model_2.fit(
X_train_normalized, y_train_encoded,
validation_split=0.2,
batch_size=128,
epochs=30,
verbose=1
)
Epoch  1/30 - loss: 2.3062 - accuracy: 0.1013 - val_loss: 2.3039 - val_accuracy: 0.0999
Epoch  2/30 - loss: 2.3067 - accuracy: 0.0992 - val_loss: 2.3037 - val_accuracy: 0.0968
Epoch  3/30 - loss: 2.3056 - accuracy: 0.0992 - val_loss: 2.3039 - val_accuracy: 0.0994
Epoch  4/30 - loss: 2.3047 - accuracy: 0.1013 - val_loss: 2.3028 - val_accuracy: 0.1042
Epoch  5/30 - loss: 2.3041 - accuracy: 0.1039 - val_loss: 2.3028 - val_accuracy: 0.1056
Epoch  6/30 - loss: 2.3039 - accuracy: 0.1006 - val_loss: 2.3026 - val_accuracy: 0.1023
Epoch  7/30 - loss: 2.3037 - accuracy: 0.1026 - val_loss: 2.3029 - val_accuracy: 0.1010
Epoch  8/30 - loss: 2.3035 - accuracy: 0.1012 - val_loss: 2.3033 - val_accuracy: 0.1020
Epoch  9/30 - loss: 2.3024 - accuracy: 0.1029 - val_loss: 2.3035 - val_accuracy: 0.1008
Epoch 10/30 - loss: 2.3017 - accuracy: 0.1086 - val_loss: 2.3017 - val_accuracy: 0.1057
Epoch 11/30 - loss: 2.3025 - accuracy: 0.1066 - val_loss: 2.3018 - val_accuracy: 0.1076
Epoch 12/30 - loss: 2.3022 - accuracy: 0.1053 - val_loss: 2.3032 - val_accuracy: 0.0946
Epoch 13/30 - loss: 2.3025 - accuracy: 0.1031 - val_loss: 2.3024 - val_accuracy: 0.1008
Epoch 14/30 - loss: 2.3024 - accuracy: 0.1044 - val_loss: 2.3021 - val_accuracy: 0.1038
Epoch 15/30 - loss: 2.3028 - accuracy: 0.1046 - val_loss: 2.3013 - val_accuracy: 0.1057
Epoch 16/30 - loss: 2.3012 - accuracy: 0.1084 - val_loss: 2.3008 - val_accuracy: 0.1062
Epoch 17/30 - loss: 2.3009 - accuracy: 0.1087 - val_loss: 2.3005 - val_accuracy: 0.1052
Epoch 18/30 - loss: 2.3009 - accuracy: 0.1071 - val_loss: 2.3015 - val_accuracy: 0.1082
Epoch 19/30 - loss: 2.3017 - accuracy: 0.1060 - val_loss: 2.3010 - val_accuracy: 0.1031
Epoch 20/30 - loss: 2.3001 - accuracy: 0.1097 - val_loss: 2.2993 - val_accuracy: 0.1085
Epoch 21/30 - loss: 2.2988 - accuracy: 0.1168 - val_loss: 2.2973 - val_accuracy: 0.1207
Epoch 22/30 - loss: 2.2978 - accuracy: 0.1149 - val_loss: 2.3035 - val_accuracy: 0.1031
Epoch 23/30 - loss: 2.2965 - accuracy: 0.1138 - val_loss: 2.2927 - val_accuracy: 0.1290
Epoch 24/30 - loss: 2.2929 - accuracy: 0.1246 - val_loss: 2.2912 - val_accuracy: 0.1395
Epoch 25/30 - loss: 2.2902 - accuracy: 0.1279 - val_loss: 2.2864 - val_accuracy: 0.1420
Epoch 26/30 - loss: 2.2834 - accuracy: 0.1416 - val_loss: 2.2915 - val_accuracy: 0.1324
Epoch 27/30 - loss: 2.2729 - accuracy: 0.1552 - val_loss: 2.2865 - val_accuracy: 0.1548
Epoch 28/30 - loss: 2.2415 - accuracy: 0.1887 - val_loss: 2.2996 - val_accuracy: 0.0992
Epoch 29/30 - loss: 2.1780 - accuracy: 0.2356 - val_loss: 2.2040 - val_accuracy: 0.1932
Epoch 30/30 - loss: 2.0948 - accuracy: 0.2870 - val_loss: 2.1040 - val_accuracy: 0.2914
plt.plot(cnn_history_2.history['accuracy'])
plt.plot(cnn_history_2.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
Observations: The second CNN barely learns; accuracy hovers around 10% (chance level) for roughly the first 25 epochs and only reaches about 29% validation accuracy by epoch 30, far below the first CNN. A likely culprit is the ReLU(0.1) layer in the dense block, which caps activations at 0.1 (see the sketch below).
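To make the suspected issue concrete (my own diagnosis, not confirmed in the original), the snippet below shows that ReLU's first positional argument is max_value while LeakyReLU's is the negative slope; swapping ReLU(0.1) for LeakyReLU(0.1) inside cnn_model_2() was most likely the intent.
# Demonstrating the difference between ReLU(0.1) and LeakyReLU(0.1)
import tensorflow as tf
from tensorflow.keras.layers import LeakyReLU, ReLU

x = tf.constant([-2.0, -0.5, 0.5, 2.0])
print(ReLU(0.1)(x).numpy())       # max_value=0.1     -> [ 0.    0.    0.1   0.1 ]
print(LeakyReLU(0.1)(x).numpy())  # negative slope 0.1 -> [-0.2  -0.05  0.5   2.  ]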
y_pred_cnn = cnn_model_2.predict(X_test_normalized)
563/563 [==============================] - 3s 5ms/step
Note: Earlier, we noticed that each entry of the target variable is a one-hot encoded vector, but to print the classification report and confusion matrix, we must convert each entry of y_test to a single label.
# Converting each entry to single label from one-hot encoded vector
# Obtaining the categorical values from y_test_encoded and y_pred_cnn
y_test_arg=np.argmax(y_test_encoded,axis=1)
y_pred_arg=np.argmax(y_pred_cnn,axis=1)
from sklearn.metrics import classification_report
# Classification report for true and predicted values
print(classification_report(y_test_arg, y_pred_arg))
# Plotting the confusion matrix using tf.math.confusion_matrix() from TensorFlow
confusion_matrix = tf.math.confusion_matrix(y_test_arg, y_pred_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
confusion_matrix,
annot=True,
linewidths=.4,
fmt="d",
square=True,
ax=ax
)
precision recall f1-score support
0 0.16 0.13 0.15 1814
1 0.21 0.91 0.34 1828
2 0.58 0.19 0.29 1803
3 0.33 0.01 0.02 1719
4 0.29 0.34 0.32 1812
5 0.46 0.49 0.48 1768
6 0.46 0.14 0.21 1832
7 0.62 0.33 0.43 1808
8 0.32 0.02 0.04 1812
9 0.22 0.27 0.24 1804
accuracy 0.29 18000
macro avg 0.37 0.28 0.25 18000
weighted avg 0.37 0.29 0.25 18000
Final Observations: The classification report confirms the poor performance, with an overall test accuracy of about 0.29 and very low recall for several digits (for example, 0.01 for digit 3 and 0.02 for digit 8). Among the models trained here, the first CNN remains the clear winner at roughly 87% test accuracy, compared with about 76% for the best ANN and 29% for this second CNN, so CNN model 1 is the one we would select.
accuracy = cnn_model_2.evaluate(X_test_normalized, y_test_encoded, verbose=2)
563/563 - 3s - loss: 2.1041 - accuracy: 0.2861 - 3s/epoch - 5ms/step
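As a possible next step, not part of the original notebook, the selected model can be saved for later reuse; the file name below is just illustrative.
# Persisting the best-performing model (the first CNN) for later reuse
cnn_model_1.save('svhn_cnn_model_1.keras')

# It can later be reloaded with:
# from tensorflow.keras.models import load_model
# reloaded_model = load_model('svhn_cnn_model_1.keras')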