The recent pb (probability of boundary) detection algorithm significantly outperforms classical edge detectors such as Canny and Sobel by considering texture and color discontinuities in addition to intensity discontinuities. We implement a simplified version of this algorithm: the pb-lite method.
This package was built using Python 3.7, PyTorch and OpenCV on Ubuntu 20.04. Follow the instructions on this page to set up PyTorch and this page to set up OpenCV for Ubuntu. Other dependencies include matplotlib, scipy and scikit-learn, all of which are easy to install using `pip install *package_name*`.
Download the package:
git clone git@github.com:latent-pixel/Alohomora.git
Then, from the package's root directory, use the following command to run it:
python3 phase1/code/Wrapper.py
The results can then be found in a separate `results` folder in the package.
A filter bank is a collection of different filters; here it consists of oriented Difference of Gaussian (DoG), Leung-Malik (LM), and Gabor filters. This filter bank helps us capture the texture properties of an image.
*Figure: oriented DoG filters, Leung-Malik filters, and Gabor filters.*
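As a minimal sketch of how an oriented DoG bank can be generated (the filter size, scales and orientation count here are illustrative, not necessarily the package's exact values): a 2-D Gaussian is convolved with a Sobel kernel to get a derivative-of-Gaussian, which is then rotated to each orientation.

```python
import numpy as np
from scipy import ndimage, signal

def gaussian_2d(size, sigma):
    """Isotropic 2-D Gaussian kernel of the given size and sigma."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def dog_filter_bank(scales=(3, 5), n_orients=16, size=31):
    """Oriented DoG bank: convolve a Gaussian with a Sobel kernel,
    then rotate the result over n_orients orientations per scale."""
    sobel = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    bank = []
    for sigma in scales:
        base = signal.convolve2d(gaussian_2d(size, sigma), sobel, mode="same")
        for i in range(n_orients):
            angle = 360.0 * i / n_orients
            bank.append(ndimage.rotate(base, angle, reshape=False))
    return bank
```

With the defaults above, the bank contains 2 scales x 16 orientations = 32 filters.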
*Figure: texture map, brightness map, and color map.*
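The texture map is built by clustering per-pixel filter responses. A hedged sketch of that step (`texton_map` is a hypothetical helper, not the package's actual function; the texton count is illustrative):

```python
import numpy as np
from scipy import ndimage
from sklearn.cluster import KMeans

def texton_map(image, bank, n_textons=64):
    """Filter a grayscale image with every filter in the bank, then cluster
    the per-pixel response vectors with K-means; each pixel's cluster id
    is its texton, yielding a discrete texture map."""
    responses = np.stack(
        [ndimage.convolve(image.astype(float), f) for f in bank], axis=-1)
    h, w, d = responses.shape
    km = KMeans(n_clusters=n_textons, n_init=4, random_state=0)
    return km.fit_predict(responses.reshape(-1, d)).reshape(h, w)
```

The brightness and color maps are obtained the same way, but by clustering raw intensity and color values instead of filter responses.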
*Figure: texture gradients, brightness gradients, and color gradients.*
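Each gradient map measures how much the label distribution changes at a pixel, via the chi-square distance between histograms on the two halves of a disc. A sketch of the standard trick, assuming pairs of opposing half-disc masks (the masks and epsilon are illustrative):

```python
import numpy as np
from scipy import ndimage

def chi_square_gradient(label_map, n_bins, left_mask, right_mask):
    """Chi-square distance between the label histograms on either side of
    each pixel; per-bin counts come from filtering binary indicator images
    with a pair of opposing half-disc masks."""
    chi = np.zeros(label_map.shape, dtype=float)
    for b in range(n_bins):
        ind = (label_map == b).astype(float)
        g = ndimage.convolve(ind, left_mask)   # counts of bin b on one side
        h = ndimage.convolve(ind, right_mask)  # counts on the opposite side
        chi += (g - h) ** 2 / (g + h + 1e-10)  # per-bin chi-square term
    return chi / 2.0
```

In the full pipeline this is repeated for every mask orientation and scale, and the responses are aggregated into the final texture, brightness, and color gradients.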
Output of the pb-lite algorithm compared with the Canny and Sobel baselines:
*Figure: Canny baseline, Sobel baseline, and pb-lite output.*
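The final combination step can be sketched as follows, assuming the common pb-lite recipe of a Hadamard product between the mean feature gradient and a weighted average of the baselines (the weights here are placeholders):

```python
import numpy as np

def pb_lite(tg, bg, cg, canny, sobel, w=(0.5, 0.5)):
    """Element-wise product of the mean of the texture/brightness/color
    gradients with a weighted average of the Canny and Sobel baselines,
    rescaled to [0, 1]."""
    mean_grad = (tg + bg + cg) / 3.0
    baseline = w[0] * canny + w[1] * sobel
    pb = mean_grad * baseline
    return (pb - pb.min()) / (pb.max() - pb.min() + 1e-10)
```

Intuitively, the baselines propose candidate edges and the feature gradients suppress those that show no texture, brightness, or color change.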
In this section, several important neural network architectures are implemented and tested. We compare them on criteria such as the number of parameters and train-test accuracies, and examine why one architecture works better than another.
Download the data required for the package (the CIFAR10 dataset) from here. With the package already downloaded, run the following commands from its root directory for training and inference, respectively:
python phase2/code/train.py --ModelArch=<ModelArchitecture> --DataPath=<DataDirectory> --CheckPointPath=<YourCheckPointDir> --LogsPath=<YourLogsDir> --NumEpochs=<Epochs> --DivTrain=<int> --MiniBatchSize=<int> --LoadCheckPoint=<bool>
python phase2/code/test.py --ModelArch=<ModelArchitecture> --DataPath=<DataDirectory> --ModelPath=<ModelPath>
Use the `--help` argument for further explanation of the arguments.
To start out, we build a basic CNN inspired by Yann LeCun's LeNet. We then move on to more recent seminal works in the field, such as ResNet, ResNeXt and DenseNet. Below is a brief summary of the results obtained.
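The key idea separating ResNet (and its descendants) from a plain CNN is the identity shortcut. A conceptual NumPy sketch, with `f1`/`f2` standing in for the block's convolution layers (not the package's actual model code):

```python
import numpy as np

def residual_block(x, f1, f2):
    """Core ResNet idea: the block learns a residual F(x) = f2(relu(f1(x)))
    and the identity shortcut adds x back before the final ReLU, so the
    block outputs relu(F(x) + x)."""
    out = np.maximum(f1(x), 0.0)     # first conv + ReLU
    out = f2(out)                    # second conv, before the addition
    return np.maximum(out + x, 0.0)  # shortcut addition, then ReLU
```

Because the shortcut passes `x` through unchanged, the block only needs to learn a correction to the identity, which makes very deep networks easier to optimize; ResNeXt and DenseNet build on the same principle with grouped and concatenated connections.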
All the models were trained using the same train-validation-test split, data augmentation and training regime, details of which are given below:
- Dataset: CIFAR10+ (random horizontal flipping, and random 32x32 cropping from the 40x40 padded image)
- Train-Validation-Test Split: 45000-5000-10000
- Optimizer: SGD with momentum of 0.9 and learning rate of 0.1
- Mini-batch size: 32 (varies in accordance with the "hybrid method" proposed in this paper)
- Epochs: 30
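The CIFAR10+ augmentation listed above can be sketched in NumPy as follows (the training code presumably uses PyTorch transforms; the padding amount and flip probability here follow the standard recipe):

```python
import numpy as np

def augment(img, pad=4, rng=None):
    """CIFAR10+ sketch: zero-pad a 32x32 image to 40x40, take a random
    32x32 crop, and flip it horizontally with probability 0.5."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)))
    y = rng.integers(0, 2 * pad + 1)   # crop offsets in 0..2*pad
    x = rng.integers(0, 2 * pad + 1)
    crop = padded[y:y + h, x:x + w]
    if rng.random() < 0.5:
        crop = crop[:, ::-1]           # horizontal flip
    return crop
```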
*Figure: train-test accuracies for ResNet-18, ResNeXt-29 and DenseNet-40.*
*Figure: validation loss curves and results.*
Although significant changes have been made, the code for Phase II of this project was based on and inspired by the starter code provided in WPI's RBE549 (Computer Vision). This is the link to their project page.