Fast, reproducible, and portable software development environments
Updated Dec 8, 2021 - Dockerfile
Remote development on HPC clusters with VS Code
Matrix multiplication example implemented with OpenMP, OpenACC, BLAS, cuBLAS, and CUDA
Accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques.
A simple and understandable CUDA kernel for the batch-matmul operation
Repository for the Architecture of Computers and Parallel Systems course at VŠB
University Project for "Computer Architecture" course (MSc Computer Engineering @ University of Pisa). Implementation of a Parallelized Nearest Neighbor Upscaler using CUDA.
K-Means written from scratch in CUDA