A collection of deep learning architectures (a PyTorch implementation).
- Python 3.6
- PyTorch 0.4
- torchvision
- visdom
If you have more NVIDIA graphics cards and CPU cores available, adjust the batch size (BSZ), number of GPUs (gpus), and number of threads (cpus) accordingly in ./ImageNet.sh. Next, select an available architecture and update your training and validation folder locations (trainDataPath and testDataPath). Finally, run ./ImageNet.sh.
To replicate Hinton's paper on MNIST, run the following:
python Capsule.py -A capsule -B 256 -E 500 --optimizer adam --gpus 2 --cpus 6 --trainDataPath ./data --testDataPath ./data --replicate_paper
Omit the --replicate_paper argument to create a deeper architecture (with a few residual blocks before the primary capsule). You can add essentially any block available in NeuralLayers to create a deeper architecture, which is then followed by a primary capsule and a secondary capsule. However, keep two things in mind: 1. if you use reconstruction, update the reconstruction network to match tensor_size; 2. capsule networks require a large amount of GPU memory.
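The first caveat above (matching the reconstruction network to tensor_size) can be illustrated with a small sketch. The function name reconstruction_sizes is hypothetical and not part of this repository. The hidden widths 512 and 1024 are those of the MNIST decoder in the capsule paper and are used here only as an assumed default.

```python
from functools import reduce
from operator import mul


def reconstruction_sizes(tensor_size, hidden=(512, 1024)):
    """Return decoder layer widths for an input of shape tensor_size.

    tensor_size is (batch, ...); the batch entry is ignored. The hidden
    widths are an assumption, not values read from Capsule.py.
    """
    # The last layer must emit one value per input element.
    n_outputs = reduce(mul, tensor_size[1:], 1)
    return list(hidden) + [n_outputs]


print(reconstruction_sizes((1, 1, 28, 28)))  # MNIST-sized input
print(reconstruction_sizes((1, 3, 32, 32)))  # CIFAR10-sized input
```

The point of the sketch is only that the final decoder width follows the product of the non-batch dimensions, so it changes whenever the input size or channel count changes.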
Trained on CIFAR10
- ResidualNet -- use type = r18/r34/r50/r101/r152 for ResNet, rn50/rn101/rn152 for ResNeXt, ser50/ser101/ser152 for Squeeze-and-Excitation Networks, or sern50/sern101/sern152 for ResNeXt + Squeeze-and-Excitation Networks
- InceptionV4
- MobileNetV1
- MobileNetV2
- ShuffleNet (type = g1, g2, g3, g4, g8 from Table 1 of the ShuffleNet paper)
- CapsuleNet (Hinton's version, and an example deep network)
- LinearVAE
- ConvolutionalVAE
- SimpleNet
- BaseModel -- A base class that contains networks (embedding, loss, or any other), meters (loss, accuracy, etc.), fileName, and isCUDA
- CudaModel -- Converts any model (PyTorch module) to run on a single GPU, multiple GPUs, or the CPU
- LoadModel -- Loads pretrained models (usually from ./models)
- SaveModel -- Saves models (usually the state_dict of anything that starts with net in BaseModel, and the rest as is)
- MakeModel -- Builds a model using the base class
- MakeCNN -- Creates a CNN (netEmbedding) and a loss layer (netLoss)
- MakeAE -- Creates an autoencoder/VAE in netAE
- FolderITTR -- PyTorch image folder iterator with a few extras.
- MNIST -- MNIST train and test dataset loader.
- CIFAR10 -- CIFAR10 train and test dataset loader.
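The SaveModel convention listed above (state_dict for attributes whose names start with net, everything else kept as is) can be sketched without PyTorch. TinyNet, TinyModel, and to_saveable are hypothetical stand-ins written for this illustration; they are not the repository's classes, and the real SaveModel may differ in details.

```python
class TinyNet:
    """Stand-in for a torch.nn.Module; only state_dict() is mimicked."""

    def __init__(self, weights):
        self.weights = weights

    def state_dict(self):
        return {"weights": list(self.weights)}


class TinyModel:
    """Stand-in for a BaseModel-like container (hypothetical)."""

    def __init__(self):
        self.netEmbedding = TinyNet([0.1, 0.2])
        self.netLoss = TinyNet([0.3])
        self.fileName = "example"
        self.isCUDA = False


def to_saveable(model):
    """Collect attributes: state_dict for net*, everything else unchanged."""
    saveable = {}
    for name, value in vars(model).items():
        if name.startswith("net"):
            saveable[name] = value.state_dict()
        else:
            saveable[name] = value
    return saveable


snapshot = to_saveable(TinyModel())
print(sorted(snapshot))
```

With real PyTorch modules, a dictionary like snapshot is the kind of object one would typically pass to torch.save.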
- Convolution -- A convolution layer with the following parameters:
  - tensor_size = a list/tuple of length 4 (BxWxHxC - any B should work)
  - filter_size = int/list/tuple (if list/tuple, length must be 2)
  - out_channels = int
  - strides = int/list/tuple (if list/tuple, length must be 2)
  - pad = True/False (True yields an output of the same size as the input when strides = 1; False performs a valid convolution)
  - activation = relu/relu6/lklu (leaky relu)/tanh/sigm/maxo/swish
  - dropout = 0. to 1. (adds a dropout layer)
  - batch_nm = True/False (adds batch normalization when True)
  - pre_nm = True/False (when True along with batch_nm, the order is batch normalization + activation + convolution; otherwise it is convolution + batch normalization + activation)
  - groups = 1 (default)
  - weight_nm = True/False
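The effect of pad and strides on the output size can be checked with a little arithmetic. The helper below is a sketch for one spatial dimension. It assumes that pad = True adds filter_size // 2 zeros on each side (which gives a same-size output for odd filter sizes at stride 1) and that pad = False adds none. The function name is hypothetical, and the padding rule is an assumption rather than something read from the Convolution source.

```python
def conv_output_size(in_size, filter_size, stride=1, pad=True):
    """Output length along one spatial dimension of a convolution.

    Assumes pad=True means filter_size // 2 zeros per side and
    pad=False means no padding ("valid" convolution).
    """
    padding = filter_size // 2 if pad else 0
    return (in_size + 2 * padding - filter_size) // stride + 1


print(conv_output_size(32, 3, stride=1, pad=True))   # 32 (same size)
print(conv_output_size(32, 3, stride=1, pad=False))  # 30 (valid)
print(conv_output_size(32, 3, stride=2, pad=True))   # 16
```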
- ConvolutionTranspose -- A transposed convolution layer with the same parameters as the Convolution layer
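For a transposed convolution, the size relationship runs in the opposite direction. The sketch below uses the standard output-size formula for a transposed convolution with dilation 1, as documented for torch.nn.ConvTranspose2d. The function name is hypothetical, and how ConvolutionTranspose maps pad to the padding and output_padding values is not shown here.

```python
def conv_transpose_output_size(in_size, filter_size, stride=1,
                               padding=0, output_padding=0):
    """Output length along one spatial dimension (dilation = 1)."""
    return ((in_size - 1) * stride - 2 * padding
            + filter_size + output_padding)


# Undo a valid 3x3 convolution that mapped 32 -> 30.
print(conv_transpose_output_size(30, 3))                      # 32
# Upsample 16 -> 32 with stride 2.
print(conv_transpose_output_size(16, 3, stride=2, padding=1,
                                 output_padding=1))           # 32
```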
- CarryResidue -- Contains several layers that require residual connections or concatenation:
  - ResidualOriginal
  - ResidualComplex
  - ResidualComplex2
  - ResidualNeXt
  - SEResidualComplex
  - SEResidualNeXt
  - ResidualInverted
  - ResidualShuffle
  - SimpleFire
  - Stem2
  - InceptionA
  - InceptionB
  - InceptionC
  - ReductionA
  - ReductionB
- DetailPooling -- Use asymmetric and lite to switch between different implementations.
  - asymmetric = True (equation 6) / False (equation 5)
  - lite = True (trainable weights - full-DPP) / False (linearly downscale)
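As a rough illustration of the asymmetric switch, the sketch below assumes that equations 5 and 6 refer to the symmetric and asymmetric reward functions of the detail-preserving pooling paper, and it pools a single 1-D window. The function names, the default alpha, lam, and eps values, and the use of a plain window average as the downscaled reference are all assumptions made for this example; this is not the DetailPooling implementation.

```python
import math


def reward(x, lam, asymmetric, eps=1e-3):
    """Assumed reward: sqrt(x^2 + eps^2)^lam, with x clipped at 0 if asymmetric."""
    if asymmetric:
        x = max(0.0, x)
    return math.sqrt(x * x + eps * eps) ** lam


def detail_pool_window(values, alpha=0.0, lam=1.0, asymmetric=False):
    """Weighted average of one window; larger deviations get larger weights."""
    # A plain average stands in for the linearly downscaled reference value.
    reference = sum(values) / len(values)
    weights = [alpha + reward(v - reference, lam, asymmetric) for v in values]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


window = [0.0, 0.0, 0.0, 1.0]  # one bright detail in a dark window
print(detail_pool_window(window, asymmetric=False))
print(detail_pool_window(window, asymmetric=True))
```

Under these assumptions both variants return a value above the plain average of 0.25, and the asymmetric variant favors the bright pixel more strongly because pixels darker than the reference receive almost no weight.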
- LossFunctions
  - CapsuleLoss
  - CategoricalLoss -- Cross entropy / softmax / taylor softmax / large margin cosine loss
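Two of the losses named above can be sketched in plain Python. margin_loss follows the margin loss from the capsule paper (m+ = 0.9, m- = 0.1, down-weighting factor 0.5), and taylor_softmax uses a second-order Taylor expansion of exp. Both function names are hypothetical. Whether CapsuleLoss and CategoricalLoss use exactly these forms and constants is an assumption, and any reconstruction term is left out.

```python
def margin_loss(lengths, target, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Margin loss for one sample, given capsule lengths and the true class."""
    loss = 0.0
    for k, length in enumerate(lengths):
        if k == target:
            # Penalize a short capsule for the present class.
            loss += max(0.0, m_plus - length) ** 2
        else:
            # Penalize long capsules for absent classes, down-weighted by lam.
            loss += lam * max(0.0, length - m_minus) ** 2
    return loss


def taylor_softmax(logits):
    """Normalize 1 + z + z^2 / 2 (always positive) instead of exp(z)."""
    terms = [1.0 + z + 0.5 * z * z for z in logits]
    total = sum(terms)
    return [t / total for t in terms]


print(margin_loss([0.95, 0.05], target=0))  # confident and correct: 0.0
print(margin_loss([0.5, 0.5], target=0))
print(taylor_softmax([0.0, 0.0]))           # [0.5, 0.5]
```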