This container provides an environment to compile and run SYCL 2020 code on NVIDIA GPUs, AMD GPUs, and any CPU, with two SYCL compilers: Intel DPC++ and AdaptiveCpp.
- Documentation for GPU Support
```
singularity pull docker://ghcr.io/maison-de-la-simulation/sycl-complete

# Optional: mount a directory
export SINGULARITY_BIND="/path/to/local/dir:/path/to/container/dir"

# Run the container in a shell (here with OpenMP settings for the CPU backend)
singularity shell --env OMP_NUM_THREADS=64 --env OMP_PLACES=cores sycl-complete.sif
# Or with NVIDIA GPU support
singularity shell --nv sycl-complete.sif

# Or execute a command inside the container
singularity exec --nv sycl-complete.sif acpp-info
singularity exec --rocm sycl-complete.sif sycl-ls
```
```
docker pull ghcr.io/maison-de-la-simulation/sycl-complete

# Run the container in interactive mode
docker run -it ghcr.io/maison-de-la-simulation/sycl-complete
```
Compilers are installed in `/opt/sycl`. The environment is set up in `$PATH`, so just type `sycl-ls` (DPC++) or `acpp-info` (AdaptiveCpp) to list the SYCL devices. The associated runtimes are installed, so once your code is compiled you should be able to run it within the container.
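If you want to query the devices from code instead, the following minimal sketch (a hypothetical `list_devices.cpp`, not shipped with the container) enumerates the platforms and devices the SYCL runtime can see, similar to the information `sycl-ls` prints:

```
// list_devices.cpp -- hypothetical example: enumerate the SYCL platforms and
// devices visible to the runtime, similar to the output of sycl-ls.
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    for (const auto &platform : sycl::platform::get_platforms()) {
        std::cout << platform.get_info<sycl::info::platform::name>() << "\n";
        for (const auto &device : platform.get_devices()) {
            std::cout << "  " << device.get_info<sycl::info::device::name>()
                      << "\n";
        }
    }
    return 0;
}
```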
Below are examples showing how to compile a SYCL code with the two SYCL compilers. We assume the code is mounted inside `/mnt/program`.
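As a concrete test case, here is a minimal SYCL 2020 program (a hypothetical `myprog.cpp`; any standard SYCL 2020 code is handled the same way) that both compilers below can build:

```
// myprog.cpp -- hypothetical minimal SYCL 2020 example: fill a vector on the
// default device, then check the result on the host.
#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
    sycl::queue q;  // selects the default SYCL device
    std::cout << "Running on: "
              << q.get_device().get_info<sycl::info::device::name>() << "\n";

    std::vector<int> data(1024, 0);
    {
        sycl::buffer<int> buf(data.data(), sycl::range<1>(data.size()));
        q.submit([&](sycl::handler &h) {
            sycl::accessor acc(buf, h, sycl::write_only);
            h.parallel_for(sycl::range<1>(data.size()),
                           [=](sycl::id<1> i) { acc[i] = static_cast<int>(i[0]); });
        });
    }  // buffer goes out of scope: results are copied back to data

    std::cout << "data[42] = " << data[42] << "\n";  // expected: 42
    return 0;
}
```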
- Documentation for using acpp.
With AdaptiveCpp, set the `ACPP_TARGETS` environment variable before using the `syclcc` compiler.
```
# Build for CPUs (OpenMP), NVIDIA A100, and AMD MI250X GPUs
export ACPP_TARGETS="omp;cuda:sm_80;hip:gfx90a"

cd /mnt/program/build  # go into the build folder

# Use /opt/sycl/acpp/syclcc as the CXX compiler
cmake -DCMAKE_CXX_COMPILER=syclcc ..
make -j 64
```
- Intel DPC++ Users manual
With this compiler, use the `-fsycl` and `-fsycl-targets` compilation flags to enable ahead-of-time compilation. We use the CMake `M_FLAGS` variable to pass these parameters to the `clang++` compiler.
```
# For a NVIDIA A100 GPU
cmake \
  -DCMAKE_CXX_COMPILER=clang++ \
  -DM_FLAGS="-fsycl -fsycl-targets=nvidia_gpu_sm_80" \
  ..

# For an Intel CPU with AVX-512 capabilities
clang++ -fsycl -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx512" myprog.cpp -o myprog
```
Versions of the backends used inside the container:
- NVIDIA GPUs: CUDA 11.8 via NVIDIA base container
- AMD GPUs: ROCm 5.5.1
- OpenMP CPUs (used for AdaptiveCpp CPU compilation and runtime): LLVM 17.0.0
- OpenCL devices (used for the DPC++ CPU runtime): OpenCL via the oneAPI DPC++ Get Started Guide
Tools:
- CMake 3.27
- vim 8.1
- git 2.25.1
- wget, unzip, python, nano...
- `clang++-17` (LLVM) and `clang++` (Intel LLVM, the SYCL compiler) are not the same.
- On some Singularity configurations, `clang++` cannot write temporary files during the compilation phase. You need to set `TMPDIR` to point to a writable directory inside the container.
- On some Singularity configurations you need to mount `/sys`, `/dev`, and `/proc` inside the container (`SINGULARITY_BIND="/sys,/dev,/proc"`) to be able to see the GPU devices.