Artifact for SC21: APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores.
Updated Aug 26, 2021 - Cuda
Compare the runtime of CNN computation on CPU and GPU
Experiments in accelerating PyTorch training on GPU devices
Fast SGEMM emulation on Tensor Cores
Simple examples of tools and libraries
An extension library of the WMMA API for single-precision matrix operations using Tensor Cores and an error-correction technique
Fast kernel SVM on Tensor Core-enabled GPUs
FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme
Artifact for PPoPP22 QGTC: Accelerating Quantized GNN via GPU Tensor Core.
An extension library of the WMMA API (Tensor Core API)
Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.