This is a mini-repository for training a ResNet101 model on the CIFAR10 dataset using distributed training. A link to the main article can be found here.
- Linux (only tested on Linux)
- PyTorch
- NVIDIA GPU and CuDNN
- Clone this repository:
git clone https://github.com/naga-karthik/ddp-resnet-cifar
cd ddp-resnet-cifar
- Install the required packages:
pip install -r requirements.txt
- If you will be running on a remote server, it is usually better to download the dataset beforehand rather than on the fly.
- Create a folder named "data" and move the downloaded dataset into it.
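One possible way to pre-download the dataset is sketched below. It assumes the training script reads CIFAR10 through `torchvision.datasets.CIFAR10` with `root="data"`, which this README does not confirm. The helper name `predownload_cifar10` is hypothetical and not part of the repository.

```python
from pathlib import Path


def predownload_cifar10(root: str = "data") -> Path:
    """Fetch the CIFAR10 train and test splits into `root`.

    Assumption: the training script loads the dataset with torchvision
    from this same folder. Adjust `root` if it expects another location.
    """
    # Imported lazily so the helper can be defined without torchvision installed.
    from torchvision.datasets import CIFAR10

    root_path = Path(root)
    root_path.mkdir(parents=True, exist_ok=True)
    # torchvision skips the download when the files are already present and verified.
    for train in (True, False):
        CIFAR10(root=str(root_path), train=train, download=True)
    return root_path
```

Calling `predownload_cifar10()` once on a machine with internet access should leave the extracted files under `data/`, which can then be copied to the remote server.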
From the terminal, use the following commands to run the model:
- With default settings:
python mainCIFAR10.py
- With other options:
python mainCIFAR10.py --n_epochs=100 --lr=0.001 --batch_size=32
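For readers wondering how flags such as `--n_epochs`, `--lr`, and `--batch_size` are typically handled, here is a minimal `argparse` sketch. It is not the parser from `mainCIFAR10.py`: the function name `build_parser` and all default values are placeholders chosen for illustration, and only the three flag names come from the command above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Defaults below are illustrative placeholders, not the repository's real defaults.
    parser = argparse.ArgumentParser(description="Train ResNet101 on CIFAR10")
    parser.add_argument("--n_epochs", type=int, default=10, help="number of training epochs")
    parser.add_argument("--lr", type=float, default=0.01, help="learning rate")
    parser.add_argument("--batch_size", type=int, default=64, help="mini-batch size")
    return parser


# Parse the same options used in the example command above.
args = build_parser().parse_args(["--n_epochs=100", "--lr=0.001", "--batch_size=32"])
print(args.n_epochs, args.lr, args.batch_size)  # prints: 100 0.001 32
```

Any flag left out of the command line falls back to the default declared in `add_argument`, which is how the "default settings" invocation can run without arguments.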