
Comparison of a CNN and a Pre-Trained Model When Training Data Is Very Small

It is well known that deep learning is data-hungry and requires a lot of training data to generalise well. In this project, we compare two types of deep learning models and see which one performs better when the training set is very small.

Data

The data consists of MRI scans of the brain, and the task is to classify whether a scan shows a brain tumour or not. The images can be downloaded from here. There are only 260 images in total: 155 contain a brain tumour (labelled 'YES') and 105 do not (labelled 'NO').

Splitting the data

The data was split into a training set and a validation set. The training set has 220 images and the validation set has 40. The ratio of 'YES' to 'NO' images is nearly the same in both sets (1.472 in training and 1.5 in validation).
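A minimal sketch of how such a stratified split could be done, assuming the scans have already been loaded into arrays (the placeholder arrays and the random seed below are illustrative assumptions, not the repository's actual loading code):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for the 260 MRI scans (155 'YES', 105 'NO');
# in practice these would be loaded and resized from the image files.
images = np.zeros((260, 224, 224, 3), dtype="float32")
labels = np.array([1] * 155 + [0] * 105)

X_train, X_val, y_train, y_val = train_test_split(
    images, labels,
    test_size=40,        # 40 validation images, 220 for training
    stratify=labels,     # keeps the YES/NO ratio nearly equal in both sets
    random_state=42,
)
```

With this split, stratification reproduces the ratios stated above: the validation set gets 24 'YES' and 16 'NO' images (1.5), leaving 131 and 89 in training (1.472).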

Models used

1. CNN Model

It is a sequential model with two convolutional blocks followed by a fully connected layer.

[Figure: CNN model architecture summary]

Stochastic gradient descent with momentum is used as the optimizer, with a very small learning rate, and the model is trained for 50 epochs to avoid overfitting.
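A hedged reconstruction of the described setup in Keras, reusing `X_train`/`y_train` from the split sketch above. The filter counts, dense layer size, and exact learning rate are assumptions; only the two-conv-block structure, the SGD-with-momentum optimizer, and the 50 epochs come from the description:

```python
from tensorflow.keras import layers, models, optimizers

# Two convolutional blocks (Conv2D + MaxPooling2D) followed by a
# fully connected layer, ending in a sigmoid for binary classification.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # tumour vs. no tumour
])

model.compile(
    optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),  # small LR
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=50)
```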

2. Pre-Trained Model

I use VGG16 as the pre-trained model, with weights trained on the ImageNet dataset. The fully connected layers at the top of VGG16 are omitted, and custom fully connected layers are added on top of the convolutional base.

[Figure: VGG16 architecture]

The training is done in two steps. In Step 1, all the convolutional layers of VGG16 are frozen and only the custom fully connected layers at the top are trained. A small learning rate is used to avoid making drastic updates to the weights. The model is trained for 25 epochs with an SGD optimizer with momentum.

[Figure: Step 1 model summary]
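A sketch of Step 1 in Keras, again reusing the split from earlier. The size of the custom head and the learning rate are assumptions; freezing the full convolutional base, the ImageNet weights, and the 25 epochs follow the description:

```python
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16

# VGG16 convolutional base with ImageNet weights, top FC layers omitted.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # Step 1: freeze every convolutional layer

# Custom fully connected head on top of the frozen base.
tl_model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

tl_model.compile(
    optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
tl_model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=25)
```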

In Step 2, all the convolutional layers of VGG16 except the last convolutional block are frozen. The last convolutional block and the fully connected layers are trained with the same optimizer as in Step 1, again for 25 epochs. The total number of epochs for the transfer learning model is therefore 50, the same as for the CNN model.
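Continuing the sketch, Step 2 could look like this. It relies on Keras' layer naming for VGG16, where the last convolutional block's layers are named `block5_*`; the recompile is needed so the new trainable state takes effect:

```python
# Step 2: unfreeze only the last convolutional block ('block5_*' layers).
base.trainable = True
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")

# Recompile with the same optimizer as Step 1, then train 25 more epochs.
tl_model.compile(
    optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
tl_model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=25)
```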

Why a two-step process?

This is because our training images are very different from the ImageNet dataset on which VGG16 was trained. Unfreezing the last convolutional block in Step 2 lets the model adapt its higher-level features to the training images. Moreover, this two-step process gives better accuracy than running Step 1 alone for 50 epochs.

Results

The code for training both models is given in Brain Tumor Detection.ipynb. I ran the code on 10 different validation sets. The sets were distinct from each other, and their union covers the entire dataset of 260 images.

[Table: validation accuracy of both models on the 10 validation sets]

Accuracy and loss graphs for the models are present in graphs.odt.

Conclusion

The pre-trained model outperforms the CNN model on 9 of the 10 validation sets and also has a higher mean accuracy. Fine-tuning the pre-trained model with the two-step process also contributes to the higher validation accuracy.
