


NamanMakkar/UoE-INFR11031-Advanced-Vision-CW-21-22


UoE-INFR11031-Advanced-Vision-CW-21-22

This repository contains the code and the report for the coursework of INFR11031 Advanced Vision, a postgraduate course offered at The University of Edinburgh. The task was to train on limited data and improve the accuracy of the ResNet-50 classifier on a small subset of the ImageNet dataset containing 50K training images and 50K test images.

Deep learning depends heavily on large-scale datasets for model pretraining, and learning with limited data is a challenging task. For example, the ResNet-50 convolutional neural network achieves a Top-1 accuracy of 76.3% on the original ImageNet dataset, which contains 1 million training images, 50,000 validation images and 100,000 test images. In contrast, using standard training techniques and cross entropy loss, the best Top-1 accuracy achieved on the minimized ImageNet dataset is only 45%. To reduce the dependency on large datasets, a combination of training methods is needed that can effectively improve CNN performance on limited data.

Here, the ResNet-50 backbone is used for image classification on a subset of the ImageNet dataset. Ablation studies were carried out using various data augmentation techniques (such as RandAugment and AutoAugment, in addition to CutMix and Mixup), loss functions such as label smoothing cross entropy and soft target cross entropy, activation functions like Mish, Swish and GELU, and adversarial training techniques, along with a slight improvement in the model architecture through the addition of 3 Spatial Pyramid Pooling layers, in order to improve Top-1 accuracy on the subset to 40.8%.
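The two loss functions named above are closely related, and Mixup is one reason a soft target loss is needed: it blends two labels into a probability vector rather than a single class index. The following is a minimal sketch in dependency-free Python, not the code from this repository. The function names, the smoothing factor `eps=0.1` and the Beta parameter `alpha=0.2` are assumptions chosen for illustration; the README does not state the values used in the coursework.

```python
import math
import random


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def soft_target_cross_entropy(logits, target):
    """Cross entropy against a full probability vector `target`."""
    return -sum(t * lp for t, lp in zip(target, log_softmax(logits)))


def smooth_one_hot(label, num_classes, eps=0.1):
    """One-hot target with eps of its mass spread uniformly over all classes."""
    base = eps / num_classes
    target = [base] * num_classes
    target[label] += 1.0 - eps
    return target


def label_smoothing_cross_entropy(logits, label, eps=0.1):
    """Label smoothing is soft-target cross entropy on a smoothed one-hot."""
    target = smooth_one_hot(label, len(logits), eps)
    return soft_target_cross_entropy(logits, target)


def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two examples and their target vectors with lam ~ Beta(alpha, alpha).

    x_a, x_b are flat feature lists; y_a, y_b are probability vectors.
    The value of alpha is an illustrative assumption.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0, 0.1]
    print("label smoothing CE:", label_smoothing_cross_entropy(logits, 0))

    # Mix a class-0 example with a class-2 example; the result is a soft target.
    y0 = smooth_one_hot(0, 4, eps=0.0)
    y2 = smooth_one_hot(2, 4, eps=0.0)
    _, y_mix, lam = mixup([1.0, 0.0], y0, [0.0, 1.0], y2)
    print("lam:", lam, "mixed target:", y_mix)
    print("soft target CE:", soft_target_cross_entropy(logits, y_mix))
```

With `eps=0.0` the smoothed loss reduces to ordinary cross entropy, which makes the relationship between the three losses easy to check. In a real training loop the same blending would be applied to image tensors and batched targets in whichever deep learning framework is in use.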
