aminekhelif/QCudaKernel

Quantum Based CUDA Kernel

This project demonstrates advanced CUDA kernels for deep learning workloads and provides:

  • High-performance GEMM (classical and Tensor Core)
  • Mixed precision (FP16) support
  • Fused GEMM+ReLU kernel
  • Quantum-inspired search kernel
  • Auto-tuning logic
  • CUDA Graphs for reduced launch overhead
  • Comprehensive unit tests (GoogleTest)
  • Performance logging and plotting (CSV and matplotlib)
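To make the fused GEMM+ReLU item concrete, here is a minimal sketch of what such a kernel computes. It is not taken from the repository, whose kernels are not shown in this README. The names `gemm_relu_naive` and `gemm_relu` are hypothetical, and the row-major float layout and the one-thread-per-output-element mapping are assumptions. A tuned version would normally add shared-memory tiling or Tensor Core instructions on top of this.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel (not the repository's): computes C = max(A * B, 0)
// for row-major A (M x K), B (K x N), C (M x N), one thread per output element.
__global__ void gemm_relu_naive(const float* A, const float* B, float* C,
                                int M, int N, int K) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= M || col >= N) return;  // guard threads past the matrix edge

    float acc = 0.0f;
    for (int k = 0; k < K; ++k) {
        acc += A[row * K + k] * B[k * N + col];
    }
    // Fusion: ReLU is applied while the sum is still in a register, so no
    // second kernel launch or extra pass over C is needed.
    C[row * N + col] = fmaxf(acc, 0.0f);
}

// Hypothetical host helper: copies inputs to the device, launches the kernel,
// and copies the result back. Returns the first CUDA error encountered.
cudaError_t gemm_relu(const std::vector<float>& A, const std::vector<float>& B,
                      std::vector<float>& C, int M, int N, int K) {
    const size_t bytesA = static_cast<size_t>(M) * K * sizeof(float);
    const size_t bytesB = static_cast<size_t>(K) * N * sizeof(float);
    const size_t bytesC = static_cast<size_t>(M) * N * sizeof(float);
    C.resize(static_cast<size_t>(M) * N);

    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaError_t err = cudaMalloc(&dA, bytesA);
    if (err == cudaSuccess) err = cudaMalloc(&dB, bytesB);
    if (err == cudaSuccess) err = cudaMalloc(&dC, bytesC);
    if (err == cudaSuccess) err = cudaMemcpy(dA, A.data(), bytesA, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(dB, B.data(), bytesB, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(16, 16);
        dim3 grid((N + block.x - 1) / block.x, (M + block.y - 1) / block.y);
        gemm_relu_naive<<<grid, block>>>(dA, dB, dC, M, N, K);
        err = cudaGetLastError();  // reports launch-configuration errors
    }
    // The blocking copy waits for the kernel to finish before reading dC.
    if (err == cudaSuccess) err = cudaMemcpy(C.data(), dC, bytesC, cudaMemcpyDeviceToHost);

    cudaFree(dA);  // cudaFree on a null pointer is a no-op
    cudaFree(dB);
    cudaFree(dC);
    return err;
}
```

Multiplying a small matrix by the identity is a quick sanity check: the output should equal the input with negative entries clamped to zero.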

Requirements

  • NVIDIA GPU with CUDA support (Compute Capability >= 7.0 for Tensor Cores)
  • CUDA Toolkit installed
  • CMake >= 3.10
  • Python 3 and matplotlib (pip install matplotlib)
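The compute capability requirement can be checked at runtime before a Tensor Core path is selected. The sketch below uses the CUDA runtime call `cudaGetDeviceProperties`. The helper names `meets_capability` and `device_supports_tensor_cores` are hypothetical, and whether the project performs a check like this is an assumption.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical helper: true if capability major.minor is at least
// reqMajor.reqMinor.
bool meets_capability(int major, int minor, int reqMajor, int reqMinor) {
    return major > reqMajor || (major == reqMajor && minor >= reqMinor);
}

// Hypothetical helper: queries the given device, prints its name and compute
// capability, and reports whether it meets the 7.0 threshold listed above.
// Returns false if the query fails (for example, when no GPU is present).
bool device_supports_tensor_cores(int device) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        return false;
    }
    std::printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);
    return meets_capability(prop.major, prop.minor, 7, 0);
}
```

Calling `device_supports_tensor_cores(0)` at startup would let a program fall back to a classical GEMM path on older GPUs.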

Usage

./run_all.sh
