Implementing GPU-Accelerated Algorithms and Optimizations in CUDA
This implementation computes the product of two matrices.
Key Optimizations:
-> Shared Memory Tiling: Instead of fetching operands from global memory for every multiply-add, threads cooperatively load blocks (tiles) of the input matrices into shared memory. Each loaded element is then reused TILE_SIZE times, cutting global memory traffic by roughly a factor of TILE_SIZE (see the kernel sketch after this list).
-> Memory Coalescing: Data is loaded into shared memory using a pattern where threads in a warp access contiguous memory addresses, ensuring the hardware can combine multiple requests into a single transaction.
-> Bank Conflict Avoidance: TILE_SIZE matches the GPU warp size (32), so consecutive threads in a warp access consecutive 32-bit words that fall in distinct shared memory banks, avoiding serialized accesses.
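
A minimal sketch of such a kernel is shown below, assuming square N x N row-major matrices; the kernel name matMulTiled and the TILE_SIZE macro are illustrative stand-ins, not necessarily the exact identifiers used in this code.

    #include <cuda_runtime.h>

    #define TILE_SIZE 32  // matches the warp size, as noted above

    // C = A * B for square N x N row-major matrices.
    __global__ void matMulTiled(const float *A, const float *B, float *C, int N) {
        // Each tile element staged here is reused TILE_SIZE times below.
        __shared__ float tileA[TILE_SIZE][TILE_SIZE];
        __shared__ float tileB[TILE_SIZE][TILE_SIZE];

        int row = blockIdx.y * TILE_SIZE + threadIdx.y;
        int col = blockIdx.x * TILE_SIZE + threadIdx.x;
        float acc = 0.0f;

        for (int t = 0; t < (N + TILE_SIZE - 1) / TILE_SIZE; ++t) {
            // Coalesced loads: threadIdx.x varies fastest, so a warp reads
            // TILE_SIZE consecutive floats from global memory.
            int aCol = t * TILE_SIZE + threadIdx.x;
            int bRow = t * TILE_SIZE + threadIdx.y;
            tileA[threadIdx.y][threadIdx.x] =
                (row < N && aCol < N) ? A[row * N + aCol] : 0.0f;
            tileB[threadIdx.y][threadIdx.x] =
                (bRow < N && col < N) ? B[bRow * N + col] : 0.0f;
            __syncthreads();

            // The inner product for this tile runs entirely out of shared memory.
            for (int k = 0; k < TILE_SIZE; ++k)
                acc += tileA[threadIdx.y][k] * tileB[k][threadIdx.x];
            __syncthreads();
        }

        if (row < N && col < N)
            C[row * N + col] = acc;
    }

A typical launch uses dim3 block(TILE_SIZE, TILE_SIZE) and dim3 grid((N + TILE_SIZE - 1) / TILE_SIZE, (N + TILE_SIZE - 1) / TILE_SIZE). Note that a 32 x 32 block is 1024 threads, the per-block maximum on most current NVIDIA GPUs.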
This implementation approximates the definite integral of a function using the composite Trapezoidal Rule.
Key Optimizations:
-> Grid-Stride Loops: Rather than assuming one thread per interval, each thread strides through the index range by the total number of threads in the grid. This makes the kernel robust to any input size and launch configuration.
-> Two-Stage Reduction: To avoid atomic contention, threads first accumulate their partial sums in registers, then perform a tree-based reduction in shared memory to produce a single per-block sum.
-> Atomic Aggregation: Only one thread per block (the thread holding the reduced block sum) performs an atomicAdd to global memory, minimizing contention on the final result variable (see the sketch after this list).
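
The rule being computed is: integral over [a, b] of f(x) dx ~ h * ((f(a) + f(b)) / 2 + sum of f(a + i*h) for i = 1 .. n-1), with h = (b - a) / n. A minimal sketch of the three-stage pipeline follows; the kernel name trapezoidSum, the block size of 256, and the x*x integrand are assumptions made for illustration, not this repository's actual API.

    #include <cuda_runtime.h>

    // Hypothetical integrand used as a stand-in; replace with the real f(x).
    __device__ float f(float x) { return x * x; }

    // Sums the interior points f(a + i*h), i = 1 .. n-1, into *result.
    // Assumes blockDim.x == 256 and *result zero-initialized on the device.
    __global__ void trapezoidSum(float a, float h, long n, float *result) {
        __shared__ float partial[256];

        // Stage 1: grid-stride loop, accumulating in a register so any
        // (grid, block) launch configuration covers any n.
        float sum = 0.0f;
        long stride = (long)gridDim.x * blockDim.x;
        for (long i = (long)blockIdx.x * blockDim.x + threadIdx.x + 1;
             i < n; i += stride)
            sum += f(a + i * h);

        partial[threadIdx.x] = sum;
        __syncthreads();

        // Stage 2: tree-based reduction in shared memory down to partial[0].
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (threadIdx.x < s)
                partial[threadIdx.x] += partial[threadIdx.x + s];
            __syncthreads();
        }

        // Stage 3: one atomicAdd per block instead of one per thread.
        if (threadIdx.x == 0)
            atomicAdd(result, partial[0]);
    }

On the host side, *result must be zeroed before launch; after the kernel finishes, the host adds the endpoint term (f(a) + f(b)) / 2 and multiplies the total by h to obtain the final estimate.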