
Distributed deep learning with RAPIDS

This is a small repo with a demo of distributed training of a random forest using NVIDIA RAPIDS.

It includes some example job submission scripts for different scheduling systems, principally targeted at ARC4 and Bede.
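
For context, the core of such a demo typically looks something like the sketch below (this is an illustration, not code taken from this repo): cuML's Dask estimators train a random forest across the GPUs attached to a Dask cluster. The file name data.csv and the label column are placeholders.

from dask_cuda import LocalCUDACluster
from dask.distributed import Client
import dask_cudf
from cuml.dask.ensemble import RandomForestClassifier

# One Dask worker per visible GPU on the node
cluster = LocalCUDACluster()
client = Client(cluster)

# Load a CSV into a GPU dataframe partitioned across the workers
# (the path and the "label" column are placeholders)
df = dask_cudf.read_csv("data.csv")
X = df.drop(columns=["label"]).astype("float32")
y = df["label"].astype("int32")

# Keep the partitions resident on the workers before fitting
X, y = X.persist(), y.persist()

# Each worker builds a share of the trees on its own GPU
model = RandomForestClassifier(n_estimators=100)
model.fit(X, y)
predictions = model.predict(X)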

Usage

First, create the conda environment (you will need to install Miniconda if you haven't already):

$ conda env create -f environment.yml

Then check the submission script you wish to use to make sure it activates the conda environment named rapids, using a line such as:

source activate rapids

Then submit your job using the appropriate job submission command:

# for SGE systems
$ qsub submit-sgeRapids.sh

# for slurm systems
$ sbatch submit-slurmRapids.sh

TODO

  • Actually do some distributed deep learning, following the ideas in this page from the Keras docs (see the sketch below).
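
For reference, here is a minimal sketch of the kind of multi-GPU Keras training that TODO refers to, assuming TensorFlow's MirroredStrategy as described in the Keras/TensorFlow distributed-training guides. The model and the synthetic data are placeholders, not anything from this repo.

import numpy as np
import tensorflow as tf

# Replicate the model on all GPUs visible to the job
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # The model and optimizer must be created inside the strategy scope
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(20,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Synthetic data just to make the sketch self-contained
X = np.random.rand(1024, 20).astype("float32")
y = np.random.randint(0, 2, size=(1024, 1)).astype("float32")

# Gradients are averaged across the replicas at each step
model.fit(X, y, batch_size=64, epochs=2)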
