DTC-SpMM: Accelerating General Sparse Matrix Multiplication with Tensor Cores

This repository contains the code for DTC-SpMM, a recent work that improves the performance of general-purpose Sparse Matrix-Matrix Multiplication (SpMM) on GPUs equipped with Tensor Cores. The paper was accepted at ASPLOS'24.
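For readers new to the operation, here is a minimal reference sketch of what SpMM computes: a sparse matrix A in CSR form multiplied by a dense matrix B. It is plain Python written for clarity and is not the DTC-SpMM kernel; the function name `csr_spmm` and the toy matrices are made up for illustration.

```python
# Reference SpMM: C = A @ B, with A sparse (CSR) and B dense.
# Illustrative only; this is NOT how DTC-SpMM is implemented.

def csr_spmm(indptr, indices, data, dense, n_cols_out):
    """Multiply a CSR matrix by a dense matrix given as a list of rows."""
    n_rows = len(indptr) - 1
    out = [[0.0] * n_cols_out for _ in range(n_rows)]
    for row in range(n_rows):
        # Non-zeros of this row are stored at positions indptr[row]..indptr[row + 1] - 1.
        for k in range(indptr[row], indptr[row + 1]):
            col, val = indices[k], data[k]
            b_row = dense[col]
            for j in range(n_cols_out):
                out[row][j] += val * b_row[j]
    return out


# A = [[1, 0, 2],
#      [0, 0, 3]]  stored in CSR form
indptr = [0, 2, 3]
indices = [0, 2, 2]
data = [1.0, 2.0, 3.0]
B = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]

C = csr_spmm(indptr, indices, data, B, n_cols_out=2)
print(C)  # [[11.0, 14.0], [15.0, 18.0]]
```

A result like this can serve as a small correctness baseline when comparing the outputs of different SpMM implementations.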

  • If you find this work useful, please cite this project and our paper.

    @inproceedings{fan2024dtc,
      title={DTC-SpMM: Bridging the Gap in Accelerating General Sparse Matrix Multiplication with Tensor Cores},
      author={Fan, Ruibo and Wang, Wei and Chu, Xiaowen},
      booktitle={Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3},
      pages={253--267},
      year={2024}
    }

1. Prepare your environment

# RTX 4090 (preferred) or RTX 3090, with CUDA 12.1 installed.
export PATH=/usr/local/cuda-12.1/bin:$PATH
export CUDA_HOME="/usr/local/cuda-12.1/"
export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64:$LD_LIBRARY_PATH

# Create and activate virtual env
conda create -n DTCSpMM python=3.9
conda activate DTCSpMM

# install PyTorch (required)
conda install pytorch pytorch-cuda=12.1 -c pytorch -c nvidia

# install cmake, numpy, and scipy (required)
conda install cmake
pip install numpy
pip install scipy

# install cugraph (Optional for TCA-reordering)
pip install cugraph-cu12 --extra-index-url=https://pypi.nvidia.com

# install datasketch (Optional for TCA-reordering)
pip install datasketch

# install cupy (Optional for TCA-reordering)
pip install cupy-cuda12x

# install cudf (Optional for TCA-reordering)
pip install --extra-index-url=https://pypi.nvidia.com cudf-cu12
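After the installs above, a quick way to see which packages are visible in the active environment is a small check script. This is a sketch, not part of the repository: the helper `check_packages` is hypothetical, and the module names are the usual import names for the packages listed in this step.

```python
import importlib.util

# Import names for the packages installed in step 1.
REQUIRED = ["torch", "numpy", "scipy"]
OPTIONAL = ["cugraph", "datasketch", "cupy", "cudf"]  # only needed for TCA-reordering


def check_packages(names):
    """Return {name: True/False} depending on whether the module can be found."""
    # find_spec locates a top-level module without importing it.
    return {name: importlib.util.find_spec(name) is not None for name in names}


for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
    for name, found in check_packages(names).items():
        print(f"{label:8s} {name:12s} {'found' if found else 'MISSING'}")
```

Run it inside the `DTCSpMM` conda environment; a missing optional package only matters if you plan to use TCA-reordering.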

2. Clone DTC-SpMM

git clone --recursive git@github.com:fan1997/DTC-SpMM-ASPLOS24.git
cd DTC-SpMM-ASPLOS24 && source init_dtc.sh

3. Prepare Sputnik (dependency)

cd third_party/
source ./build_sputnik.sh

4. Build DTC-SpMM

cd ${DTC_HOME}/DTC-SpMM && source build.sh

5. Download datasets

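# Requires Git LFS. Newer Git LFS versions deprecate `git lfs clone`; a plain `git clone` also fetches the LFS files.
git lfs clone https://github.com/fan1997/dtc_datasets.git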
cd dtc_datasets
tar -zxvf reordered_matrices.tar.gz
tar -zxvf origin_matrices.tar.gz
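If `tar` fails on the archives, one possible cause is that Git LFS was not installed, so the clone contains small text pointer files instead of the real archives. The sketch below checks for that case. The helper names are hypothetical, and the `dtc_datasets` directory is assumed to be in the current working directory; the pointer prefix is the first line of the Git LFS pointer format.

```python
import os

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

ARCHIVES = ("reordered_matrices.tar.gz", "origin_matrices.tar.gz")


def is_lfs_pointer(path):
    """True if the file is an LFS pointer rather than the real content."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def check_archives(dataset_dir, names=ARCHIVES):
    """Classify each expected archive as 'missing', 'lfs-pointer', or 'ok'."""
    report = {}
    for name in names:
        path = os.path.join(dataset_dir, name)
        if not os.path.isfile(path):
            report[name] = "missing"
        elif is_lfs_pointer(path):
            report[name] = "lfs-pointer"
        else:
            report[name] = "ok"
    return report


# Assumed location: the clone directory created by the command above.
print(check_archives("dtc_datasets"))
```

An `lfs-pointer` result suggests installing Git LFS and running `git lfs pull` inside the clone before extracting.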

6. Run tests

# Run DTCSpMM
cd ${DTC_HOME}/scripts/DTCSpMM 
# modify the dataset path in run_DTC_SpMM.py to your own path.
source run_DTC_SpMM.sh

# Run cuSPARSE
cd ${DTC_HOME}/scripts/cusparse 
# modify the dataset path in run_cuSPARSE.py to your own path.
source run_cuSPARSE_SpMM.sh

# Run Sputnik
cd ${DTC_HOME}/scripts/Sputnik 
# modify the dataset path in run_Sputnik.py to your own path.
source run_Sputnik.sh

# Run SparseTIR (install SparseTIR first: https://sampl.cs.washington.edu/SparseTIR/install.html)
cd ${DTC_HOME}/scripts/SparseTIR 
# modify the dataset path in run_sparsetir.py to your own path.
source run_SparseTIR.sh

# Run TCGNN-SpMM
cd ${DTC_HOME}/scripts/TCGNN 
# modify the dataset path in run_TCGNN_SpMM.py to your own path.
source run_TCGNN_SpMM.sh

7. Use TCA-reordering

cd TCA-reordering

# install minhashcuda
git clone https://github.com/src-d/minhashcuda.git
mv minhashcuda/* ./ && rm -rf minhashcuda
cmake -DCMAKE_BUILD_TYPE=Release . && make && python setup.py install

# Run an example that reorders the reddit dataset
python TCA_reorder.py --dataset reddit --thres 16
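The dependencies for this step (datasketch, minhashcuda) are MinHash libraries. As background, the sketch below shows the general idea of using MinHash signatures to place rows with similar column patterns next to each other. It is a pure-Python illustration under stated assumptions: it is not the algorithm in `TCA_reorder.py`, and the function names, hash scheme, and parameters are all hypothetical.

```python
import random

PRIME = 2_147_483_647  # modulus for the toy hash family h(c) = (a*c + b) % PRIME


def minhash_signature(cols, hash_params):
    """MinHash signature of a set of column indices (one value per hash function)."""
    if not cols:
        # Empty rows get a sentinel signature so they sort last.
        return tuple(PRIME for _ in hash_params)
    return tuple(min((a * c + b) % PRIME for c in cols) for a, b in hash_params)


def reorder_rows(rows, num_hashes=4, seed=0):
    """Return a row permutation that sorts rows by their MinHash signature.

    `rows` is a list of sets, each holding the non-zero column indices of a row.
    Rows with the same column set get the same signature and end up adjacent.
    """
    rng = random.Random(seed)
    hash_params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
                   for _ in range(num_hashes)]
    signatures = [minhash_signature(cols, hash_params) for cols in rows]
    return sorted(range(len(rows)), key=lambda r: signatures[r])


# Toy sparsity pattern: rows 0 and 2 share columns, as do rows 1 and 3.
rows = [{0, 1, 2}, {7, 8, 9}, {0, 1, 2}, {7, 8, 9}]
order = reorder_rows(rows)
print(order)  # rows 0 and 2 are adjacent, and so are rows 1 and 3
```

Sorting by signature is the simplest possible grouping; a practical reordering would typically cluster on estimated similarity rather than rely on exact signature matches.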

Related work