Quantum Based Cuda Kernel

This project demonstrates advanced CUDA kernels for deep learning workloads and provides:

- High-performance GEMM (classical and Tensor Core)
- Mixed precision (FP16) support
- Fused GEMM+ReLU kernel
- Quantum-inspired search kernel
- Auto-tuning logic
- CUDA Graphs for reduced launch overhead
- Comprehensive unit tests (GoogleTest)
- Performance logging and plotting (CSV and matplotlib)
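
To make the "fused GEMM+ReLU" item concrete, here is a minimal CPU reference in pure Python. It is not the project's CUDA kernel. The function name `gemm_relu_reference` and the list-of-lists layout are assumptions made for illustration. The point it shows is the fusion itself: the ReLU clamp is applied to the accumulator before the single store, so the intermediate product matrix is never written out and read back.

```python
def gemm_relu_reference(a, b):
    """Return relu(a @ b) for row-major matrices given as lists of lists.

    Hypothetical helper for illustration only; not part of this repository.
    """
    m, k = len(a), len(a[0])
    if len(b) != k:
        raise ValueError("inner dimensions do not match")
    n = len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            # Fusion: clamp the accumulator, then store once.
            out[i][j] = acc if acc > 0.0 else 0.0
    return out


if __name__ == "__main__":
    a = [[1, -2], [3, 4]]
    identity = [[1, 0], [0, 1]]
    print(gemm_relu_reference(a, identity))
```

A slow reference like this can serve as an oracle when checking GPU output on small matrices, ideally with a tolerance when the GPU path runs in FP16.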

Requirements

- NVIDIA GPU with CUDA support (Compute Capability >= 7.0 for Tensor Cores)
- CUDA Toolkit installed
- CMake >= 3.10
- Python 3 and matplotlib (pip install matplotlib)
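
The version minimums above are easy to get wrong with plain string comparison, where "3.9" sorts after "3.10". The sketch below compares versions numerically and reads the installed CMake version from `cmake --version`. The helper names (`parse_version`, `meets_minimum`, `installed_cmake_version`) are hypothetical and not part of this repository.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Extract the first dotted number in text, e.g. 'cmake version 3.22.1' -> (3, 22, 1)."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(found, required):
    """Compare two version strings numerically, padding the shorter one with zeros."""
    f, r = parse_version(found), parse_version(required)
    width = max(len(f), len(r))
    f += (0,) * (width - len(f))
    r += (0,) * (width - len(r))
    return f >= r


def installed_cmake_version():
    """Return the first line printed by `cmake --version`, or None if cmake is not on PATH."""
    if shutil.which("cmake") is None:
        return None
    result = subprocess.run(["cmake", "--version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    line = installed_cmake_version()
    if line is None:
        print("cmake not found on PATH")
    else:
        print(line, "->", "ok" if meets_minimum(line, "3.10") else "too old")
```

The same `meets_minimum` call works for the Compute Capability threshold, for example `meets_minimum("8.6", "7.0")`.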

Usage

./run_all.sh
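
This README does not document the CSV layout used for performance logging, so the following is only a sketch under assumed column names (`kernel`, `m`, `n`, `k`, `time_ms`); the actual output consumed by `plot_results.py` may differ. It shows how GEMM timings are usually converted to throughput: a GEMM of size m×n×k performs 2·m·n·k floating-point operations, and dividing by the elapsed time gives GFLOP/s.

```python
import csv
import io


def gflops(m, n, k, time_ms):
    """Throughput of an m x n x k GEMM in GFLOP/s (2*m*n*k FLOPs per GEMM)."""
    return 2.0 * m * n * k / (time_ms * 1e6)


def load_results(csv_text):
    """Parse CSV text with the assumed columns kernel,m,n,k,time_ms."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        m, n, k = int(row["m"]), int(row["n"]), int(row["k"])
        rows.append((row["kernel"], gflops(m, n, k, float(row["time_ms"]))))
    return rows


def plot(rows, path="gflops.png"):
    """Bar chart of throughput per kernel; matplotlib is imported only when plotting."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    names = [name for name, _ in rows]
    values = [value for _, value in rows]
    plt.bar(names, values)
    plt.ylabel("GFLOP/s")
    plt.savefig(path)


if __name__ == "__main__":
    # Made-up sample values, purely to exercise the parser.
    sample = "kernel,m,n,k,time_ms\nexample_a,1024,1024,1024,2.0\nexample_b,1024,1024,1024,0.5\n"
    for name, value in load_results(sample):
        print(f"{name}: {value:.1f} GFLOP/s")
```

The sample rows are invented placeholders, not measurements from this project.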
