This repository contains the open-source code for the bachelor thesis 'GPU-Berechnung von genetischen Verwandtschaftsmatrizen durch Warp-Level-Matrixmultiplikation' ("GPU computation of genetic relationship matrices through warp-level matrix multiplication"), submitted on 27 June 2022.
- R 4.0 or higher
- gcc 11.0 or higher
- CUDA 11.6 or higher
- NVIDIA GPU of compute capability 8.0 or higher (Ampere architecture or newer)
- A local copy of CUTLASS (
- `R CMD INSTALL RandomFieldsUtils --configure-args="USE_GPU=yes USE_AVX=yes"`
- `R CMD INSTALL miraculix --configure-args="CXX_FLAGS='-mavx2 -DGPU_DEV' USE_GPU='yes'"`
- The miraculix package contains all functions for calculating the genomic relationship matrix. The GPU code is in the files whose names start with the 'mma' prefix.
- The R files in the main folder provide benchmarks, use cases and syntax hints.
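
As a rough CPU reference for the kind of computation involved, the sketch below builds a relationship matrix from a SNP matrix coded 0/1/2. It assumes a VanRaden-style definition, G = Z Z^T / sum_k 2 p_k (1 - p_k), with Z the column-centered genotype matrix. This README does not state which definition miraculix uses, and the names `Genotypes`, `cross_product` and `relationship_matrix` are hypothetical and not part of the package. The sketch illustrates that the expensive step is the integer product M M^T, while centering and scaling are cheap corrections applied afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container (not part of miraculix): row-major genotype matrix
// with n individuals (rows) and m SNPs (columns), entries coded 0, 1 or 2.
struct Genotypes {
    std::size_t n;
    std::size_t m;
    std::vector<std::uint8_t> data;  // size n * m
};

// Integer cross-product C = M * M^T, stored row-major as an n x n matrix.
std::vector<std::int32_t> cross_product(const Genotypes& g) {
    assert(g.data.size() == g.n * g.m);
    std::vector<std::int32_t> c(g.n * g.n, 0);
    for (std::size_t i = 0; i < g.n; ++i) {
        for (std::size_t j = i; j < g.n; ++j) {
            std::int32_t sum = 0;
            for (std::size_t k = 0; k < g.m; ++k) {
                sum += g.data[i * g.m + k] * g.data[j * g.m + k];
            }
            c[i * g.n + j] = sum;
            c[j * g.n + i] = sum;  // the result is symmetric
        }
    }
    return c;
}

// Assumed VanRaden-style scaling: G = Z Z^T / sum_k 2 p_k (1 - p_k),
// where Z = M - 1 c^T and c_k = 2 p_k is the mean of SNP column k.
// Expanding Z Z^T gives C_ij - r_i - r_j + s with r_i = sum_k c_k M_ik
// and s = sum_k c_k^2, so only C needs a full matrix product.
std::vector<double> relationship_matrix(const Genotypes& g) {
    const std::vector<std::int32_t> c = cross_product(g);
    const double n = static_cast<double>(g.n);

    std::vector<double> col_mean(g.m, 0.0);
    for (std::size_t i = 0; i < g.n; ++i) {
        for (std::size_t k = 0; k < g.m; ++k) {
            col_mean[k] += g.data[i * g.m + k];
        }
    }
    double sum_sq = 0.0;
    double denom = 0.0;
    for (std::size_t k = 0; k < g.m; ++k) {
        col_mean[k] /= n;
        sum_sq += col_mean[k] * col_mean[k];
        denom += col_mean[k] * (1.0 - col_mean[k] / 2.0);  // equals 2 p (1 - p)
    }
    assert(denom > 0.0);  // fails if every SNP is monomorphic

    std::vector<double> row_dot(g.n, 0.0);
    for (std::size_t i = 0; i < g.n; ++i) {
        for (std::size_t k = 0; k < g.m; ++k) {
            row_dot[i] += col_mean[k] * g.data[i * g.m + k];
        }
    }

    std::vector<double> out(g.n * g.n);
    for (std::size_t i = 0; i < g.n; ++i) {
        for (std::size_t j = 0; j < g.n; ++j) {
            out[i * g.n + j] =
                (c[i * g.n + j] - row_dot[i] - row_dot[j] + sum_sq) / denom;
        }
    }
    return out;
}
```

Because the genotype entries are small integers, the cross-product can be accumulated exactly in 32-bit integers; a result computed this way on the CPU can serve as a check for the output of a GPU implementation on small inputs.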
The custom CUDA kernel does not work properly yet.
Explanation and example of tensor core usage and warp-level GEMM
Overview of the functionality provided by CUTLASS
Template for 1-bit matrix multiplication (XOR)
Template for custom kernel
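
To make the idea behind 1-bit matrix multiplication concrete, here is a minimal CPU sketch. It assumes that the operation is an XOR of bit-packed operands followed by a population count, as the "(XOR)" hint above suggests. The function name `xor_popc_gemm` and the storage layout are hypothetical and are not taken from CUTLASS or from this repository.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference routine: each row of A and each row of B holds
// `words` 32-bit words of packed bits. B is stored by rows here, which
// corresponds to a column-major right-hand operand in a GEMM.
// Result: C[i][j] = sum over w of popcount(A[i][w] XOR B[j][w]).
std::vector<std::int32_t> xor_popc_gemm(const std::vector<std::uint32_t>& a,
                                        const std::vector<std::uint32_t>& b,
                                        std::size_t rows_a,
                                        std::size_t rows_b,
                                        std::size_t words) {
    assert(a.size() == rows_a * words);
    assert(b.size() == rows_b * words);
    std::vector<std::int32_t> c(rows_a * rows_b, 0);
    for (std::size_t i = 0; i < rows_a; ++i) {
        for (std::size_t j = 0; j < rows_b; ++j) {
            std::int32_t acc = 0;
            for (std::size_t w = 0; w < words; ++w) {
                const std::uint32_t diff = a[i * words + w] ^ b[j * words + w];
                // Count the differing bits in this word.
                acc += static_cast<std::int32_t>(std::bitset<32>(diff).count());
            }
            c[i * rows_b + j] = acc;
        }
    }
    return c;
}
```

Each output entry is the Hamming distance between a row of A and a row of B. If the bits encode values of +1 and -1, an ordinary dot product over K bits can be recovered as K minus twice that distance.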
"Did you write the GPU implementation yet?" – "Not one bit!"