GitHub - pjyi2147/GPU_Computing_project: Parallelize Quick Hull algorithm for Convex Hull problem using CUDA and compare the performance between the serial (CPU) version of the algorithm.

CSED 490C Heterogeneous Computing - Final Project

Fall 2023, POSTECH

Details

This project is to parallelize Quick Hull algorithm for Convex Hull problem using CUDA and compare the performance between the serial (CPU) version of the algorithm.

How to run

The project is built using Nvidia GPU teaching kit.

CUDA-enabled GPU, CUDA Toolkit and CMake (CCMake) must be installed to run the project.

Build libgputk The following procedure will build libgputk (the support library) that will be linked with your template file for processing command-line arguments, logging time, and checking the correctness of your solution.

Create the target build directory

mkdir build-dir
cd build-dir

We will use ccmake

ccmake ..

You will see the following screen

Pressing c would configure the build to your system (in the process detecting the compiler, the CUDA Toolkit location, etc...).

BUILD_LIBgpuTK_LIBRARY          *ON
BUILD_LOGTIME                   *ON

If you have modified the above, then you should type g to regenerate the Makefile and then q to quit out of ccmake. You can then use the make command to build the labs.

Build the data generator and template

The following will compile the template file, and the data generator that will generate input files.

cd sources 
make template 
make daataset_generator

You can generate input data with

~~ ./dataset_generator ~~

This will create a directory that contains multiple pairs of input data. You can modify the file to generate input data of different sizes.

Change the file name in Makefile to change the template file version to the version that you want to build.

Version changelogs

v1:

initial version

v2:

changed pt_idx structure
Does not use pt_idx pointer vectors anymore (free from deep copy!!)

v3:

Apply threshold to apply CUDA kernel
Best perf at 32 * BLOCKSIZE

v4:

Apply parallel reduction on the max dist calculation
Best perf at 32 * BLOCKSIZE (85ms on 2M points)

v5:

Apply Unified memory
Does not work well

v6:

Sort using thrust
Works really well

Benchmark results

Optimization flag includes -O3 -Xptxas -O3.

References

Computing the convex hull on GPU by Timur Iskhakov

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
libgputk		libgputk
sources		sources
support		support
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

CSED 490C Heterogeneous Computing - Final Project

Fall 2023, POSTECH

Details

How to run

Version changelogs

Benchmark results

References

About

Uh oh!

Uh oh!

Languages

pjyi2147/GPU_Computing_project

Folders and files

Latest commit

History

Repository files navigation

CSED 490C Heterogeneous Computing - Final Project

Fall 2023, POSTECH

Details

How to run

Version changelogs

Benchmark results

References

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Uh oh!

Languages