It is well known that deep learning is data hungry and typically requires a lot of training data to generalise well. In this project, we compare two types of deep learning models to see which one performs better when the training set is very small.
The data consists of MRI scans of the brain. The task is to identify whether an MRI scan shows a brain tumor or not, a binary classification problem solved using deep learning. The images can be downloaded from here. We have a total of only 260 images, of which 155 images have a brain tumor, labelled 'YES', and 105 images do not, labelled 'NO'.
The data was split into a training and a validation set. The training set has 220 images and the validation set has 40 images. The ratio of 'YES' to 'NO' images is almost equal in both sets (1.472 in training and 1.5 in validation).
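As a sanity check, the class balance of this split can be verified with a quick sketch (the 24/16 validation breakdown is inferred from the stated ratios, not given explicitly in the text):

```python
# Class counts from the dataset: 155 'YES' (tumor) and 105 'NO' images.
total_yes, total_no = 155, 105

# A 220/40 train/validation split that preserves the class balance:
val_yes, val_no = 24, 16              # 24/16 = 1.5, the validation ratio
train_yes = total_yes - val_yes       # 131
train_no = total_no - val_no          # 89

train_ratio = train_yes / train_no    # ~1.472
val_ratio = val_yes / val_no          # 1.5

print(f"train: {train_yes}+{train_no}={train_yes + train_no}, ratio {train_ratio:.3f}")
print(f"val:   {val_yes}+{val_no}={val_yes + val_no}, ratio {val_ratio:.1f}")
```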
1. CNN Model
It is a sequential model with 2 convolutional blocks followed by a fully connected layer.
Stochastic gradient descent (SGD) with momentum is used as the optimizer with a very small learning rate, and the model is trained for 50 epochs to avoid overfitting.
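The exact layer sizes are not given in the text; a minimal Keras sketch of such a model (filter counts, dense units, input size, and the learning rate are illustrative assumptions, not the notebook's actual values) might look like:

```python
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers

def build_cnn(input_shape=(224, 224, 3)):
    """Sequential CNN: 2 convolutional blocks + a fully connected head.

    All hyperparameters here (filters, units, learning rate) are
    illustrative; the original notebook may use different values.
    """
    model = models.Sequential([
        # Convolutional block 1
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        # Convolutional block 2
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        # Fully connected head
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),   # binary: tumor / no tumor
    ])
    # SGD with momentum and a very small learning rate, as described above.
    model.compile(optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Training would then be a single `model.fit(train_data, epochs=50, validation_data=val_data)` call.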
2. Pre-Trained Model
I use VGG16, with weights pre-trained on the ImageNet dataset, as the base model. The fully connected layers at the top of VGG16 are omitted, and custom fully connected layers are added after the convolutional layers of the base model.
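A sketch of this setup in Keras (the head sizes are assumptions; `weights=None` can be passed to skip the ImageNet weight download when only inspecting the architecture):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_transfer_model(input_shape=(224, 224, 3), weights="imagenet"):
    """VGG16 base without its fully connected top, plus a custom head."""
    base = tf.keras.applications.VGG16(include_top=False,
                                       weights=weights,
                                       input_shape=input_shape)
    model = models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(64, activation="relu"),   # custom FC layers (sizes assumed)
        layers.Dense(1, activation="sigmoid"),
    ])
    return model
```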
The training is done in two steps. In Step 1, all the convolutional layers of VGG16 are frozen and only the custom fully connected layers at the top are trained. A small learning rate is used to avoid making drastic updates to the weights. The model is trained for 25 epochs with an SGD optimizer with momentum.
In Step 2, all the convolutional layers of VGG16 except the last convolutional block are frozen. The last convolutional block and the fully connected layers are trained with the same optimizer as in Step 1, again for 25 epochs. The total number of epochs for the transfer learning model is therefore 50, equal to the number of epochs for the CNN model.
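The two training steps can be sketched as freeze/unfreeze passes over the base model (the head sizes and learning rate are assumptions; the fold of VGG16's last block is identified by its `block5_` layer-name prefix in Keras):

```python
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers

def build_model(weights="imagenet"):
    """VGG16 base (no top) with a custom fully connected head (sizes assumed)."""
    base = tf.keras.applications.VGG16(include_top=False, weights=weights,
                                       input_shape=(224, 224, 3))
    model = models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    return model, base

def compile_for_step(model, base, step):
    """Step 1: freeze the whole base. Step 2: unfreeze only block 5."""
    for layer in base.layers:
        if step == 1:
            layer.trainable = False
        else:  # Step 2: train only the last convolutional block (block5)
            layer.trainable = layer.name.startswith("block5")
    # Same small-learning-rate SGD-with-momentum optimizer in both steps
    # (the exact values here are assumptions).
    model.compile(optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),
                  loss="binary_crossentropy", metrics=["accuracy"])

# Step 1: train the head for 25 epochs, then Step 2 for another 25:
# model, base = build_model()
# compile_for_step(model, base, step=1); model.fit(train_data, epochs=25, ...)
# compile_for_step(model, base, step=2); model.fit(train_data, epochs=25, ...)
```

Re-compiling after changing `trainable` flags is required for the change to take effect in training.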
The last convolutional block is unfrozen in Step 2 because our training images are very different from the ImageNet dataset on which VGG16 was trained, and this allows the model to learn the features present in the training images. Furthermore, this two-step process gives better accuracy than training Step 1 alone for 50 epochs.
The code for training both models is given in Brain Tumor Detection.ipynb. I ran the code for 10 different validation sets. All the sets were disjoint from each other, and the union of all the validation sets gives the entire dataset of 260 images.
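Such a family of disjoint validation sets whose union covers the whole dataset can be generated by partitioning the shuffled image indices (a sketch; the original notebook's splitting code may differ):

```python
import random

def make_validation_folds(n_images=260, n_folds=10, seed=0):
    """Partition image indices into n_folds disjoint validation sets
    whose union is the whole dataset."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    # Slice with stride n_folds so every index lands in exactly one fold.
    return [indices[i::n_folds] for i in range(n_folds)]
```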
Accuracy and loss graphs for the models are present in graphs.odt.
Clearly the pre-trained model outperforms the CNN model, winning on 9 out of the 10 validation sets and achieving a higher mean accuracy. Fine-tuning the pre-trained model with the two-step process also contributes to the higher validation accuracy.