Project 1: Ishan Ranade #20

Open: wants to merge 6 commits into base: master
38 changes: 32 additions & 6 deletions README.md
@@ -1,11 +1,37 @@
**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 1 - Flocking**

-* (TODO) YOUR NAME HERE
-* (TODO) [LinkedIn](), [personal website](), [twitter](), etc.
-* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
+* Ishan Ranade
+* Tested on personal computer: Gigabyte Aero 14, Windows 10, i7-7700HQ, GTX 1060

-### (TODO: Your README)
+# Boid Simulation

-Include screenshots, analysis, etc. (Remember, this is public, so don't put
-anything here that you don't want to share with the world.)
+![](demo2.gif)

## Features

- Naive implementation
- Uniform grid without coherence
- Uniform grid with coherence
- Performance analysis with various blocksizes and boid counts

## Performance Analysis

Below are performance graphs for the three implemented algorithms at blocksizes of 128, 256, and 512.

![](blocksize128.JPG)

![](blocksize256.JPG)

![](blocksize512.JPG)


For the analysis I recorded the framerate of the simulation as both the blocksize and the number of boids changed, for each of the three algorithms. I used FPS as the metric for deciding which combination performed best.

The baseline was the naive method, and the goal was for each of the other two methods to beat it. Beating the baseline is a good test of whether the uniform grid methods actually produce better results.
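
The comparison against the baseline can be made concrete as a speedup ratio. The sketch below copies the numbers from `data.txt` and assumes they are the recorded FPS values; the array layout and the `speedup` helper are illustrative names, not part of the project code.

```cpp
#include <cassert>

// Values copied from data.txt (assumed to be FPS).
// Rows: boid counts {1000, 5000, 10000}. Columns: blocksizes {128, 256, 512}.
static const double kNaive[3][3]    = {{460, 509, 464}, {253, 240, 242}, {121, 117, 114}};
static const double kUniform[3][3]  = {{555, 553, 496}, {382, 404, 392}, {444, 277, 242}};
static const double kCoherent[3][3] = {{450, 469, 564}, {398, 395, 396}, {485, 378, 320}};

// Speedup of a method over the baseline: values above 1.0 mean the
// method ran at a higher framerate than the baseline.
double speedup(double fps, double baselineFps) {
    return fps / baselineFps;
}
```

For example, `speedup(kCoherent[2][0], kNaive[2][0])` (10000 boids, blocksize 128) is about 4.0, while `speedup(kCoherent[0][0], kNaive[0][0])` (1000 boids, blocksize 128) is slightly below 1.0.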

For the naive method, changing the number of boids clearly changed the performance: the framerate fell from 460 to 509 FPS at 1000 boids to 114 to 121 FPS at 10000 boids. This method depends entirely on the boid count, since every boid is checked against every other boid. For the uniform method, the framerate at 5000 and 10000 boids was lower than at 1000 boids for all three blocksizes. This may be because of the lack of caching when looping through the boids in the surrounding cells. The coherent method held up best as the boid count increased, which may be due to better caching.

In general the naive method produced the worst results, especially as the number of boids increased: at 5000 and 10000 boids it was the slowest method at every blocksize. The blocksize had little effect on this method (for example 121, 117, and 114 FPS at 10000 boids).
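
To put the blocksizes in context, it helps to see how many blocks each configuration launches. This is a minimal sketch that assumes the usual one-thread-per-boid launch with ceiling division; the README does not confirm the launch configuration, and both function names are hypothetical.

```cpp
#include <cassert>

// Number of thread blocks needed so that every boid gets one thread
// (ceiling division; the last block may be only partly used).
constexpr int blocksForBoids(int numBoids, int blockSize) {
    return (numBoids + blockSize - 1) / blockSize;
}

// Threads in the final block that have no boid to work on.
constexpr int idleThreads(int numBoids, int blockSize) {
    return blocksForBoids(numBoids, blockSize) * blockSize - numBoids;
}
```

Under that assumption, 1000 boids at blocksize 512 launch only 2 blocks, and 10000 boids at blocksize 128 launch 79 blocks. The total number of threads is nearly the same across blocksizes, so the blocksize mainly changes how the same work is grouped.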

At 1000 boids the uniform method did better than the coherent method except when the blocksize was 512. The coherent grid showed the best performance as the boid count increased, and at 10000 boids it was the fastest method at every blocksize. I was surprised, though, that at low blocksizes and boid counts it did worse than the uniform method.
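
The caching difference between the two grid methods can be sketched on the CPU. This is an illustration under assumptions, not the project's CUDA code: it assumes boids are sorted by grid cell index, that the uniform method reads positions through the sorted index array, and that the coherent method first copies positions into cell order. All names (`SortedGrid`, `buildGrid`, `sumCellScattered`, `makeCoherent`, `sumCellCoherent`) are hypothetical, and positions are reduced to a single float.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct SortedGrid {
    std::vector<int> boidIndex;  // boid ids ordered by grid cell
    std::vector<int> cellStart;  // first slot of each cell in boidIndex, -1 if empty
    std::vector<int> cellEnd;    // one past the last slot of each cell, -1 if empty
};

// Sort boid ids by cell and record where each cell starts and ends.
SortedGrid buildGrid(const std::vector<int>& cellOfBoid, int numCells) {
    SortedGrid g;
    g.boidIndex.resize(cellOfBoid.size());
    std::iota(g.boidIndex.begin(), g.boidIndex.end(), 0);
    std::stable_sort(g.boidIndex.begin(), g.boidIndex.end(),
                     [&](int a, int b) { return cellOfBoid[a] < cellOfBoid[b]; });
    g.cellStart.assign(numCells, -1);
    g.cellEnd.assign(numCells, -1);
    for (int slot = 0; slot < static_cast<int>(g.boidIndex.size()); ++slot) {
        int cell = cellOfBoid[g.boidIndex[slot]];
        if (g.cellStart[cell] == -1) g.cellStart[cell] = slot;
        g.cellEnd[cell] = slot + 1;
    }
    return g;
}

// Without coherence: slot -> boid id -> position, so reads jump around in pos.
float sumCellScattered(const SortedGrid& g, const std::vector<float>& pos, int cell) {
    float sum = 0.0f;
    for (int s = g.cellStart[cell]; s < g.cellEnd[cell]; ++s) sum += pos[g.boidIndex[s]];
    return sum;
}

// With coherence: copy positions into cell order once per step...
std::vector<float> makeCoherent(const SortedGrid& g, const std::vector<float>& pos) {
    std::vector<float> out(pos.size());
    for (std::size_t s = 0; s < pos.size(); ++s) out[s] = pos[g.boidIndex[s]];
    return out;
}

// ...so that each cell's boids are read from one contiguous range.
float sumCellCoherent(const SortedGrid& g, const std::vector<float>& coherentPos, int cell) {
    float sum = 0.0f;
    for (int s = g.cellStart[cell]; s < g.cellEnd[cell]; ++s) sum += coherentPos[s];
    return sum;
}
```

Both lookups return the same result; the coherent version pays for one extra copy per step in exchange for contiguous reads. That trade-off could explain why it only pulls ahead at higher boid counts.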
Binary file added blocksize128.JPG
Binary file added blocksize256.JPG
Binary file added blocksize512.JPG
41 changes: 41 additions & 0 deletions data.txt
@@ -0,0 +1,41 @@
Coherent
Boids 1000
Blocksize 128 - 450
Blocksize 256 - 469
Blocksize 512 - 564
Boids 5000
Blocksize 128 - 398
Blocksize 256 - 395
Blocksize 512 - 396
Boids 10000
Blocksize 128 - 485
Blocksize 256 - 378
Blocksize 512 - 320

Uniform
Boids 1000
Blocksize 128 - 555
Blocksize 256 - 553
Blocksize 512 - 496
Boids 5000
Blocksize 128 - 382
Blocksize 256 - 404
Blocksize 512 - 392
Boids 10000
Blocksize 128 - 444
Blocksize 256 - 277
Blocksize 512 - 242

Naive
Boids 1000
Blocksize 128 - 460
Blocksize 256 - 509
Blocksize 512 - 464
Boids 5000
Blocksize 128 - 253
Blocksize 256 - 240
Blocksize 512 - 242
Boids 10000
Blocksize 128 - 121
Blocksize 256 - 117
Blocksize 512 - 114
Binary file added data.xlsx
Binary file not shown.
Binary file added demo.gif
Binary file added demo2.gif
2 changes: 1 addition & 1 deletion src/CMakeLists.txt
@@ -10,5 +10,5 @@ set(SOURCE_FILES

 cuda_add_library(src
     ${SOURCE_FILES}
-    OPTIONS -arch=sm_20
+    OPTIONS -arch=sm_60
     )