It is well known that deep learning is data hungry and typically requires a lot of training data to generalise well. In this project, we compare two types of deep learning models to see which one performs better when the training set is very small.
The data consists of MRI scans of the brain. The task is to identify whether an MRI scan shows a brain tumor or not, a binary classification problem solved using deep learning. The images can be downloaded from here. We have a total of only 260 images, of which 155 images have a brain tumor, labelled 'YES', and 105 images do not, labelled 'NO'.
The data was split into a training and a validation set. The training set has 220 images and the validation set has 40 images. The ratio of 'YES' to 'NO' images is almost equal in both sets (1.472 in training and 1.5 in validation).
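As a sanity check, the class balance of this split can be verified with a quick sketch (the 24/16 validation breakdown is inferred from the stated ratios, not given explicitly in the text):

```python
# Class counts from the dataset: 155 'YES' (tumor) and 105 'NO' images.
total_yes, total_no = 155, 105

# A 220/40 train/validation split that preserves the class balance:
val_yes, val_no = 24, 16              # 24/16 = 1.5, the validation ratio
train_yes = total_yes - val_yes       # 131
train_no = total_no - val_no          # 89

train_ratio = train_yes / train_no    # ~1.472
val_ratio = val_yes / val_no          # 1.5

print(f"train: {train_yes}+{train_no}={train_yes + train_no}, ratio {train_ratio:.3f}")
print(f"val:   {val_yes}+{val_no}={val_yes + val_no}, ratio {val_ratio:.1f}")
```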
1. CNN Model
It is a sequential model with 2 convolutional blocks followed by a fully connected layer.
Stochastic gradient descent (SGD) with momentum is used as the optimizer with a very small learning rate, and the model is trained for 50 epochs to avoid overfitting.
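The exact layer sizes are not given in the text; a minimal Keras sketch of such a model (filter counts, dense units, input size, and the learning rate are illustrative assumptions, not the notebook's actual values) might look like:

```python
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers

def build_cnn(input_shape=(224, 224, 3)):
    """Sequential CNN: 2 convolutional blocks + a fully connected head.

    All hyperparameters here (filters, units, learning rate) are
    illustrative; the original notebook may use different values.
    """
    model = models.Sequential([
        # Convolutional block 1
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        # Convolutional block 2
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        # Fully connected head
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),   # binary: tumor / no tumor
    ])
    # SGD with momentum and a very small learning rate, as described above.
    model.compile(optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Training would then be a single `model.fit(train_data, epochs=50, validation_data=val_data)` call.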
2. Pre-Trained Model
I use VGG16, with weights pre-trained on the ImageNet dataset, as the base model. The fully connected layers at the top of VGG16 are omitted, and custom fully connected layers are added after the convolutional layers of the base model.
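A sketch of this setup in Keras (the head sizes are assumptions; `weights=None` can be passed to skip the ImageNet weight download when only inspecting the architecture):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_transfer_model(input_shape=(224, 224, 3), weights="imagenet"):
    """VGG16 base without its fully connected top, plus a custom head."""
    base = tf.keras.applications.VGG16(include_top=False,
                                       weights=weights,
                                       input_shape=input_shape)
    model = models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(64, activation="relu"),   # custom FC layers (sizes assumed)
        layers.Dense(1, activation="sigmoid"),
    ])
    return model
```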
The training is done in two steps. In Step 1, all the convolutional layers of VGG16 are frozen and only the custom fully connected layers at the top are trained. A small learning rate is used to avoid making drastic updates to the weights. The model is trained for 25 epochs with an SGD optimizer with momentum.
In Step 2, all the convolutional layers of VGG16 except the last convolutional block are frozen. The last convolutional block and the fully connected layers are trained with the same optimizer as in Step 1, again for 25 epochs. The total number of epochs for the transfer learning model is therefore 50, equal to the number of epochs for the CNN model.
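The two training steps can be sketched as freeze/unfreeze passes over the base model (the head sizes and learning rate are assumptions; the fold of VGG16's last block is identified by its `block5_` layer-name prefix in Keras):

```python
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers

def build_model(weights="imagenet"):
    """VGG16 base (no top) with a custom fully connected head (sizes assumed)."""
    base = tf.keras.applications.VGG16(include_top=False, weights=weights,
                                       input_shape=(224, 224, 3))
    model = models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    return model, base

def compile_for_step(model, base, step):
    """Step 1: freeze the whole base. Step 2: unfreeze only block 5."""
    for layer in base.layers:
        if step == 1:
            layer.trainable = False
        else:  # Step 2: train only the last convolutional block (block5)
            layer.trainable = layer.name.startswith("block5")
    # Same small-learning-rate SGD-with-momentum optimizer in both steps
    # (the exact values here are assumptions).
    model.compile(optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),
                  loss="binary_crossentropy", metrics=["accuracy"])

# Step 1: train the head for 25 epochs, then Step 2 for another 25:
# model, base = build_model()
# compile_for_step(model, base, step=1); model.fit(train_data, epochs=25, ...)
# compile_for_step(model, base, step=2); model.fit(train_data, epochs=25, ...)
```

Re-compiling after changing `trainable` flags is required for the change to take effect in training.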
The last convolutional block is unfrozen in Step 2 because our training images are very different from the ImageNet dataset on which VGG16 was trained, and this allows the model to learn the features present in the training images. Furthermore, this two-step process gives better accuracy than training Step 1 alone for 50 epochs.
The code for training both models is given in Brain Tumor Detection.ipynb. I ran the code for 10 different validation sets. All the sets were disjoint from each other, and the union of all the validation sets gives the entire dataset of 260 images.
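Such a family of disjoint validation sets whose union covers the whole dataset can be generated by partitioning the shuffled image indices (a sketch; the original notebook's splitting code may differ):

```python
import random

def make_validation_folds(n_images=260, n_folds=10, seed=0):
    """Partition image indices into n_folds disjoint validation sets
    whose union is the whole dataset."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    # Slice with stride n_folds so every index lands in exactly one fold.
    return [indices[i::n_folds] for i in range(n_folds)]
```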
Accuracy and loss graphs for the models are present in graphs.odt.
Clearly the pre-trained model outperforms the CNN model, winning on 9 out of the 10 validation sets and achieving a higher mean accuracy. Fine-tuning the pre-trained model with the two-step process also contributes to the higher validation accuracy.