CIFAR 10 Horovod Example

This example trains a Deep Layer Aggregation (DLA) model on the CIFAR-10 dataset, using Horovod for distributed training.

Installation with Pipenv

Install OpenMPI if you want to run a distributed workload locally.
Install Pipenv, a dependency management tool with a locking mechanism (similar to Anaconda).

Clone this repository and run:

export HOROVOD_WITH_PYTORCH=1
export HOROVOD_WITH_MPI=1
export HOROVOD_WITHOUT_GLOO=1

# For a GPU build, uncomment the following two lines:
# export HOROVOD_CUDA_HOME=/usr/local/cuda
# export HOROVOD_GPU=CUDA
pipenv install

This command creates a virtualenv based on the Pipfile and Pipfile.lock.
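
Before running `pipenv install`, it can help to confirm that the tools mentioned above are actually on the `PATH`. The following is a minimal sketch, not part of this repository; the helper name `missing_tools` is hypothetical.

```shell
# Hypothetical helper: print each named command that is not on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# mpirun (OpenMPI) is only needed for distributed runs; pipenv manages the virtualenv.
missing="$(missing_tools mpirun pipenv)"
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names print on a single line.
  echo "Missing tools:" $missing
else
  echo "All prerequisites found."
fi
```

Once the install has finished, `pipenv run horovodrun --check-build` should report which frameworks and controllers Horovod was built with, which is a quick way to confirm that the exported `HOROVOD_WITH_*` variables took effect.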

Usage

With Docker

Prepare the directories:

mkdir -p "$(pwd)/data"
# Download CIFAR-10 dataset
curl -fsSL https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz -o "$(pwd)/data/cifar-10-python.tar.gz"
tar -C "$(pwd)/data/" -xvzf "$(pwd)/data/cifar-10-python.tar.gz"
mkdir -p "$(pwd)/checkpoint"
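
To confirm that the download and extraction succeeded before starting a run, a check along these lines may be useful. It is a sketch only: the helper `check_cifar10` is hypothetical, and it assumes the training script reads the standard `cifar-10-batches-py` directory that the archive unpacks into.

```shell
# Hypothetical helper: return 0 if DIR contains the extracted CIFAR-10 python batches.
check_cifar10() {
  dir="$1/cifar-10-batches-py"
  for f in data_batch_1 data_batch_2 data_batch_3 data_batch_4 data_batch_5 \
           test_batch batches.meta; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      return 1
    fi
  done
  return 0
}

if check_cifar10 "$(pwd)/data"; then
  echo "CIFAR-10 data looks complete."
else
  echo "CIFAR-10 data is incomplete; re-run the download and extraction."
fi
```

The same check applies to the Pipenv workflow, since both use the same `data` directory layout.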

Run the model:

docker run \
  --rm \
  -v "$(pwd)/data:/data" \
  -v "$(pwd)/checkpoint:/checkpoint" \
  -u 1000:1000 \
  --entrypoint /bin/sh \
  ghcr.io/deepsquare-io/cifar-10-example:latest \
  -c '\
  mpirun \
  -np 4 \
  /.venv/bin/python3 \
  /app/main.py \
  --no-cuda \
  --horovod \
  --checkpoint_in=/checkpoint/ckpt.pth \
  --checkpoint_out=/checkpoint/ckpt.pth \
  --dataset=/data
'

With Pipenv

Prepare the directories:

mkdir -p "$(pwd)/data"
# Download CIFAR-10 dataset
curl -fsSL https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz -o "$(pwd)/data/cifar-10-python.tar.gz"
tar -C "$(pwd)/data/" -xvzf "$(pwd)/data/cifar-10-python.tar.gz"
mkdir -p "$(pwd)/checkpoint"

Run the model:

pipenv shell
mpirun \
  -np 4 \
  python3 \
  main.py \
  --no-cuda \
  --horovod \
  --checkpoint_in="$(pwd)/checkpoint/ckpt.pth" \
  --checkpoint_out="$(pwd)/checkpoint/ckpt.pth" \
  --dataset="$(pwd)/data"
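
The commands above hard-code `-np 4`. On a machine with fewer CPUs, Open MPI may refuse to start more processes than it has slots for unless oversubscription is allowed. The sketch below caps the process count at the number of online CPUs; the helper `pick_np` is hypothetical, and the online CPU count is only a rough upper bound for the slots Open MPI will detect.

```shell
# Hypothetical helper: echo the requested process count, capped at the online CPU count.
pick_np() {
  wanted="$1"
  # Fall back to 1 if getconf cannot report the CPU count.
  cpus="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
  if [ "$wanted" -gt "$cpus" ]; then
    echo "$cpus"
  else
    echo "$wanted"
  fi
}

NP="$(pick_np 4)"
echo "Would launch with: mpirun -np $NP ..."
```

The resulting `$NP` can then be substituted for the literal `4` in the `mpirun` invocation.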
