Custom Build Your Convolutional Neural Network (CNN) from Scratch in TensorFlow

Bauyrjan Jyenis
8 min read · Jun 4, 2021
Photo by United Nations COVID-19 Response on Unsplash

Is there another way to identify COVID cases besides traditional medical testing procedures such as nasal swab tests? Probably, but I just don't know enough about them, as I am not a doctor, a nurse, or a PhD in medicine. So what else can be done?

Hmm, is it possible to detect patterns in the CT scans of patients who tested positive for COVID? People get CT scans all the time for a variety of other medical reasons. Could the images from these CT scans be used to identify whether or not a patient has COVID?

Well, this is an interesting question worth exploring. At least, for me.

That said, in this article I will write about my recent attempt to build a convolutional neural network to classify COVID cases from CT scan images. First of all, I didn't get 99% accuracy! At least, not just yet! So don't keep your hopes too high for now. The intent of this article is to provide a step-by-step guide to implementing a CNN from scratch using TensorFlow's low-level API. The motivation stemmed from my earlier struggle to find similar instructions online when I first attempted to build deep neural networks with the low-level API, and I thought I would create one for those in the same boat.

Data

I am going to use a publicly available dataset from Kaggle, which can be found here.

The data contains labeled images for COVID-positive and COVID-negative cases: 397 images for COVID-negative and 349 images for COVID-positive cases. The dataset appears to be fairly balanced.

Approach

  1. Create Training Data
  2. Explore and Visualize
  3. Determine Data Dimensions and Network Architecture
  4. Construct the Network
  5. Compile and Train the Model
  6. Model Evaluation

Step 1: Create Training Data

I will use OpenCV, an open-source library for computer vision and image processing, to create my training data. We will read the images into a Python list first, separate features and labels, reshape the data as required by TensorFlow, convert it into NumPy arrays, and save them as pickle files. Below is the code that does just that; more detail can be found in my GitHub here.

# Import libraries
import pickle
import numpy as np
import matplotlib.pyplot as plt
import os
import cv2

# Set up directory path and the directory names
datadir = "C:\\....\\...\\....\\...."
directory = ["covid_negative", "covid_positive"]

# Function that creates the training data
training_data = []
img_size = 100

def create_training_data():
    for folder in directory:
        path = os.path.join(datadir, folder)
        label = directory.index(folder)
        for img in os.listdir(path):
            try:
                # Read each image as grayscale and resize to img_size x img_size
                img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
                new_array = cv2.resize(img_array, (img_size, img_size))
                training_data.append([new_array, label])
            except Exception:
                # Skip unreadable or corrupt image files
                pass

# Create the training data using the custom function above
create_training_data()

# Separate features and labels into separate arrays
X = []
y = []
for features, label in training_data:
    X.append(features)
    y.append(label)

# Reshape as required by TensorFlow/Keras: (samples, height, width, channels)
X = np.array(X).reshape(-1, img_size, img_size, 1)
y = np.array(y).reshape(-1, 1)

# Now save the data
pickle_out = open("X.pickle", "wb")
pickle.dump(X, pickle_out)
pickle_out.close()

pickle_out = open("y.pickle", "wb")
pickle.dump(y, pickle_out)
pickle_out.close()

Having done this, you should now have the training data exported in the format required for the next steps.

Step 2: Explore and Visualize

Now that we have the data in pickle files, we can easily read it into a Jupyter notebook and visualize it. The images are 100 x 100 pixels, and here is what they look like.
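If you are working in a fresh session, you can first reload the pickled arrays. A minimal sketch, using the filenames from Step 1:

X = pickle.load(open("X.pickle", "rb"))
y = pickle.load(open("y.pickle", "rb"))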

_ = plt.imshow(X[2].reshape(100, 100), cmap='gray')
plt.show()
CT scan of lung — kinda looks disgusting, doesn’t it?

Step 3: Determine Data Dimensions and Network Architecture

Let's determine the dimensions of the data and the architecture of the neural network. I will start with a brute-force type of architecture, which is not optimal and obviously will not give you the best accuracy, but it is a good starting point from which to optimize further. Please note that the image resolution, filter sizes, pooling window size, and strides can be initialized at whatever values you feel like trying. I'd be very happy to hear your secret sauce for getting these numbers right to achieve the best model performance.

# We know that the images are 100 pixels by 100 pixels
img_size = 100
# Images are stored in one-dimensional arrays of this length
img_size_flat = img_size*img_size
# Height and width of each image in Tuple
img_shape = (img_size, img_size)
# Number of color channels for the image: 1 channel for gray-scale
num_channels = 1
# Number of classes - two classes, covid positive or covid negative
num_classes = 2

The hyperparameters of the above network architecture are as follows. These are not trainable variables, but they can be tuned through various hyperparameter-tuning techniques to improve model performance; a sketch of what such tuning might look like follows the block below.

# Convolutional layer 1
filter1_size = 3   # Convolution filters are 3 x 3 pixels
num_filters1 = 16  # There are 16 of these filters

# Convolutional layer 2
filter2_size = 3   # Convolution filters are 3 x 3 pixels
num_filters2 = 32  # There are 32 of these filters

# Pooling
window_size = 2    # Pooling window 2 x 2
window_stride = 2  # Move by 2 strides

# Fully-connected layer
fc_size = 1024     # Number of nodes in the fully-connected layer

# Convolution stride
conv_stride = 1
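As an illustration of what such tuning might look like, here is a hedged sketch that sweeps a small grid over two of these hyperparameters. The train_and_evaluate function is hypothetical; it is assumed to rebuild the weights with the given settings, run the training loop from Step 5, and return the test accuracy:

# Hypothetical sweep: train_and_evaluate is not part of the original article
best_acc, best_params = 0.0, None
for n_filters1 in [8, 16, 32]:
    for fc_nodes in [256, 512, 1024]:
        acc = train_and_evaluate(num_filters1=n_filters1, fc_size=fc_nodes)
        if acc > best_acc:
            best_acc, best_params = acc, (n_filters1, fc_nodes)
print(f"Best test accuracy {best_acc:.4f} with num_filters1={best_params[0]}, fc_size={best_params[1]}")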

Step 4: Construct the Network

To build the above architecture and model, I will write a separate helper function for each of the following.

  • Trainable Variables (Weights)
  • Convolutional Layer 1
  • Convolutional Layer 2
  • Flat Layer
  • Fully Connected and Output layers

Helper-function to create weights

import tensorflow as tf

def weights(shape):
    # Draw initial weights from a normal distribution with a small standard deviation
    weights = tf.Variable(tf.random.normal(shape=shape, stddev=0.05))
    return weights
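The helper functions below, and the training loop in Step 5, reference variables such as conv1_weights and bias_1 whose creation is not shown here (the full code is in the GitHub repo). As a hedged sketch, they could be initialized with the weights helper like this, with shapes inferred from the architecture defined above; note that two rounds of 2 x 2 max-pooling shrink a 100 x 100 image to 25 x 25:

# Illustrative initialization; shapes inferred from the architecture above,
# not necessarily identical to the original repo's code
conv1_weights = weights([filter1_size, filter1_size, num_channels, num_filters1])
bias_1 = tf.Variable(tf.zeros([num_filters1]))

conv2_weights = weights([filter2_size, filter2_size, num_filters1, num_filters2])
bias_2 = tf.Variable(tf.zeros([num_filters2]))

# After two 2x2 max-pools: 100 -> 50 -> 25, so the flattened size is 25*25*num_filters2
fc_weights = weights([25 * 25 * num_filters2, fc_size])
bias_fc = tf.Variable(tf.zeros([fc_size]))

w_out = weights([fc_size, num_classes])
b_out = tf.Variable(tf.zeros([num_classes]))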

Helper-function to create Convolutional Layer 1

def ConvNet1(image):
    # Convolve, add bias, apply ReLU, then downsample with max-pooling
    conv1 = tf.nn.conv2d(input=image, filters=conv1_weights, strides=conv_stride, padding='SAME')
    conv1 += bias_1
    conv1 = tf.nn.relu(conv1)
    conv1 = tf.nn.max_pool(input=conv1, ksize=window_size, strides=window_stride, padding='SAME')
    return conv1

Helper-function to create Convolutional Layer 2

def ConvNet2(conv1):
    conv2 = tf.nn.conv2d(input=conv1, filters=conv2_weights, strides=conv_stride, padding='SAME')
    conv2 += bias_2
    conv2 = tf.nn.relu(conv2)
    conv2 = tf.nn.max_pool(input=conv2, ksize=window_size, strides=window_stride, padding='SAME')
    return conv2

Helper-function to Flatten the Layer

def flatten_layer(conv2):
    # Collapse the (height, width, channels) dimensions into a single feature axis
    layer_shape = conv2.get_shape()
    num_features = layer_shape[1:4].num_elements()
    flat_layer = tf.reshape(conv2, [-1, num_features])
    return flat_layer, num_features
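As a quick sanity check (an illustrative example, not from the original article), applying the flatten helper to a dummy batch with the shape conv2 would have in this network, (batch, 25, 25, 32), yields 25 * 25 * 32 = 20,000 features per image:

# Dummy batch with the spatial shape conv2 has after two max-pools
dummy = tf.zeros([4, 25, 25, num_filters2])
flat, n = flatten_layer(dummy)
print(flat.shape, n)  # (4, 20000) 20000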

Helper-function for Fully Connected Layer

def make_prediction(flat_layer, fc_weights, bias_fc, w_out, b_out, training=False):
    # Fully connected layer
    fc_product = tf.matmul(flat_layer, fc_weights)
    fully_connected = tf.keras.activations.relu(fc_product + bias_fc)
    # Apply dropout only during training so that inference uses the full network
    if training:
        fully_connected = tf.nn.dropout(fully_connected, rate=0.7, seed=1)

    # Output layer
    output = tf.matmul(fully_connected, w_out)
    prediction = tf.keras.activations.softmax(output + b_out)
    return prediction

Helper-function to wrap everything into a trainable model

def model(image, training=False):
    conv1 = ConvNet1(image)
    conv2 = ConvNet2(conv1)
    flat_layer, num_features = flatten_layer(conv2)
    predictions = make_prediction(flat_layer, fc_weights, bias_fc, w_out, b_out, training=training)
    return predictions
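Before training, it can help to run a single forward pass as a smoke test. An illustrative snippet, assuming the variable-initialization sketch above:

# One image through the untrained network: expect shape (1, 2),
# i.e. one probability per class; cast to float32 as conv2d requires
sample = tf.convert_to_tensor(X[:1].astype("float32") / 255.0)
print(model(sample).shape)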

Step 5: Compile and Train the Model

Now that we have all of the helper functions set up, we are ready to compile and train the model. Let's start by packaging the data into batches (a sketch of the train/test split these batches assume follows the list below) and use a context manager, tf.GradientTape, to train the model. During the training process, the trainable variables are updated through optimization; we will use the Adam optimizer to minimize the loss function and update the weights. The trainable variables are as follows:

  • Weights for convolutional layers
  • Biases for convolutional layers
  • Weights for fully connected and output layers
  • Biases for fully connected and output layers
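One thing the code below assumes is that X and y have already been split into X_train, y_train, X_test, y_test, and that pixel values have been scaled into the [0, 1] range as float32, which tf.nn.conv2d requires. A minimal sketch, using scikit-learn's train_test_split with a hypothetical 80/20 split:

from sklearn.model_selection import train_test_split

# Cast to float32 and scale pixel intensities from [0, 255] to [0, 1]
X = X.astype("float32") / 255.0

# An assumed 80/20 split; the article does not state the exact ratio
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)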
# Optimizer
optimizer = tf.keras.optimizers.Adam()

# Instantiate a loss function. The model's output layer already applies
# softmax, so it produces probabilities rather than logits.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)

# Prepare the metrics
train_acc_metric = tf.keras.metrics.SparseCategoricalAccuracy()
test_acc_metric = tf.keras.metrics.SparseCategoricalAccuracy()

# Package the data into batches
batch_size = 16
train_ds = tf.data.Dataset.from_tensor_slices((X_train, y_train)).shuffle(1024).batch(batch_size)
test_ds = tf.data.Dataset.from_tensor_slices((X_test, y_test)).shuffle(1024).batch(batch_size)
import time

start = time.time()
epochs = 100
for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))

    # Iterate over the batches of the dataset.
    for step, (x_batch_train, y_batch_train) in enumerate(train_ds):

        # Open a GradientTape to record the operations run
        # during the forward pass, which enables auto-differentiation.
        with tf.GradientTape() as tape:

            # Run the forward pass. The operations applied
            # to the inputs are recorded on the GradientTape.
            probs = model(x_batch_train, training=True)

            # Compute the loss value for this minibatch.
            loss_value = loss_fn(y_batch_train, probs)

        # Use the gradient tape to automatically retrieve the gradients
        # of the trainable variables with respect to the loss.
        trainable_vars = [conv1_weights, bias_1, conv2_weights, bias_2,
                          fc_weights, bias_fc, w_out, b_out]
        grads = tape.gradient(loss_value, trainable_vars)

        # Run one step of gradient descent by updating
        # the value of the variables to minimize the loss.
        optimizer.apply_gradients(zip(grads, trainable_vars))

        # Update training metric.
        train_acc_metric.update_state(y_batch_train, probs)

        # Log every 16 batches.
        if step % 16 == 0 and step != 0:
            print(
                "Training loss (for one batch) at step %d: %.4f"
                % (step, float(loss_value))
            )
            print("Seen so far: %s samples" % ((step + 1) * batch_size))
            print("Training acc: %.4f" % (float(train_acc_metric.result()),))

    # Reset training metrics at the end of each epoch
    train_acc_metric.reset_states()

    # Run a test loop at the end of each epoch.
    for x_batch_test, y_batch_test in test_ds:
        predictions = model(x_batch_test)
        # Update test metrics
        test_acc_metric.update_state(y_batch_test, predictions)

    test_acc = test_acc_metric.result()
    print("Test acc: %.4f" % (float(test_acc),))
    test_acc_metric.reset_states()

print("Time taken: %.2fs" % (time.time() - start))

Step 6: Model Evaluation

After running 10 epochs, the following are the accuracy and loss graphs for the training and test sets.
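To reproduce graphs like these, you can record the metrics during training and plot them afterwards. Here is a minimal sketch, assuming per-epoch accuracies were appended to two hypothetical lists, train_history and test_history, inside the epoch loop above:

# train_history and test_history are assumed to hold one accuracy
# value per epoch, collected inside the training loop
plt.plot(train_history, label="training accuracy")
plt.plot(test_history, label="test accuracy")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.show()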

As seen above, the accuracy for the test and training sets is not as high as we would like. That is, there is huuuge room for improvement, where you and I can continue experimenting with various techniques to improve the model's performance. In my GitHub repo, you will find an Excel file where I tracked the changes and tuning of various hyperparameters. I'd like to invite those of you who are interested to play with this model and improve its performance, and I'd really appreciate it if you could post your findings in the comments below. Other than that, I hope you have enjoyed reading this article. Happy deep learning!

Code Details

Please find my code for the image processing and CNN in my repo.

References:

  1. Different ways to tune model performance: https://towardsdatascience.com/the-quest-of-higher-accuracy-for-cnn-models-42df5d731faf
