44 changes: 39 additions & 5 deletions README.md
@@ -1,10 +1,44 @@
**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 1 - Flocking**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Akshay Shah
* Tested on: Windows 10, i7-5700HQ @ 2.70GHz 16GB, GTX 970M 6GB (Personal Computer)

### (TODO: Your README)
### Screenshots

Include screenshots, analysis, etc. (Remember, this is public, so don't put
anything here that you don't want to share with the world.)
Here is an implementation of the coherent grid boid simulation with 5000 and 120,000 boids respectively:

![](images/perf_128_std.gif)
5000 boids

![](images/perf_256_std.gif)
120,000 boids

### Analysis

* For each implementation, how does changing the number of boids affect performance? Why do you think this is?

A: Increasing the number of boids lowers the FPS, because each boid has more neighbors to examine when computing its velocity change.
Here is an image comparing the different implementations with an increasing number of boids:

![](images/128vs256boidsvsfpsnaivevscoherentvsuniform.png)
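
To make the scaling argument concrete, here is a rough sketch of how many neighbor checks each approach performs per step. The helper names `naiveChecks` and `gridChecksEstimate` are hypothetical and not part of the project code, and the grid estimate assumes boids are spread uniformly over the cells, which a real flock will not satisfy exactly.

```cpp
#include <cstdint>

// Hypothetical helper: in the naive approach every boid examines every
// other boid, so the work grows quadratically with the boid count.
std::uint64_t naiveChecks(std::uint64_t n) {
    return n * (n - 1);
}

// Hypothetical helper: rough estimate for a grid-based search.
// Assumption: boids are uniformly distributed over `cellCount` cells and
// each boid inspects `cellsSearched` neighboring cells.
std::uint64_t gridChecksEstimate(std::uint64_t n,
                                 std::uint64_t cellCount,
                                 std::uint64_t cellsSearched) {
    // n boids, each looking at cellsSearched cells holding about n / cellCount boids.
    return n * cellsSearched * n / cellCount;
}
```

Under these assumptions the grid versions still slow down as boids are added, but far more gently than the naive version, since each boid only looks at the occupants of a few cells.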


* For each implementation, how does changing the block count and block size affect performance? Why do you think this is?

A: Increasing the block size slightly increases the FPS for the same number of boids. This may be due to the number of blocks launched per kernel call, which changes with the block size.
![](images/128vs256boidsvsfpsnaive.png)
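
As a small illustration of how block size and block count are linked, the sketch below computes the number of blocks needed to cover every boid with one thread each. The function name `blocksForBoids` is hypothetical; the rounding-up division is the usual way to size a one-dimensional CUDA launch, but it is shown here as an assumption about how the kernels are launched.

```cpp
// Hypothetical helper: number of blocks needed so that
// blocks * blockSize >= numBoids (ceiling division).
int blocksForBoids(int numBoids, int blockSize) {
    return (numBoids + blockSize - 1) / blockSize;
}

// Hypothetical helper: threads in the last block that have no boid to
// process and simply exit after a bounds check.
int idleThreads(int numBoids, int blockSize) {
    return blocksForBoids(numBoids, blockSize) * blockSize - numBoids;
}
```

For the same boid count, doubling the block size from 128 to 256 roughly halves the number of blocks per launch.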

* For the coherent uniform grid: did you experience any performance improvements with the more coherent uniform grid? Was this the outcome you expected? Why or why not?

A: The coherent grid had a lower average FPS at smaller boid counts, but it holds a stable 45 FPS at 80,000 boids, whereas the uniform grid drops over time to 30 FPS for the same number of boids. The following graph shows an example:

![](images/128vs256boidsvsfpsnaivevscoherentvsuniform.png)
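
The idea behind the coherent grid can be sketched on the CPU as follows: after sorting boids by grid cell, the position and velocity data are copied into that same order, so boids in one cell sit next to each other in memory. This is a minimal sketch, not the project's kernels; `Vec3`, `sortBoidsByCell`, and `gatherCoherent` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

struct Vec3 { float x, y, z; };  // hypothetical stand-in for a 3D vector type

// Returns boid indices ordered by the grid cell each boid falls in.
std::vector<int> sortBoidsByCell(const std::vector<int>& cellOfBoid) {
    std::vector<int> order(cellOfBoid.size());
    std::iota(order.begin(), order.end(), 0);  // 0, 1, 2, ...
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return cellOfBoid[a] < cellOfBoid[b]; });
    return order;
}

// Coherent step: gather per-boid data into cell order so that neighbors
// in a cell are contiguous, removing one level of indirection on reads.
std::vector<Vec3> gatherCoherent(const std::vector<Vec3>& data,
                                 const std::vector<int>& order) {
    std::vector<Vec3> out(data.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        out[i] = data[order[i]];
    }
    return out;
}
```

The extra gather costs time on every step, which is one plausible reason the coherent version only pays off at larger boid counts.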

The following is an analysis of the average time spent in each function:
![](images/perf_analysis.png)
Notice that 94% of the average time is spent updating velocities.
This is the naive implementation of the simulation.

![](images/perf_analysis_20k_std.png)
Notice that updating velocities takes 20% of the average time, with the rest spread across updating positions and computing the start and end cell grid indices.
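
The start and end cell index step mentioned above can be sketched as follows. Given the cell index of each boid after sorting, every array position compares itself with its neighbors to find where a cell's run begins and ends. This is a CPU sketch under assumptions: `CellRanges` and `identifyCellStartEnd` are hypothetical names, and `-1` is an assumed sentinel for empty cells.

```cpp
#include <vector>

struct CellRanges {
    std::vector<int> start;  // first sorted position of each cell, -1 if empty
    std::vector<int> end;    // last sorted position of each cell, -1 if empty
};

// `sortedCells[i]` is the grid cell of the i-th boid after sorting by cell.
CellRanges identifyCellStartEnd(const std::vector<int>& sortedCells, int cellCount) {
    CellRanges r{std::vector<int>(cellCount, -1), std::vector<int>(cellCount, -1)};
    const int n = static_cast<int>(sortedCells.size());
    // Each iteration only reads its neighbors, so on the GPU one thread
    // per position could do the same work independently.
    for (int i = 0; i < n; ++i) {
        const int c = sortedCells[i];
        if (i == 0 || sortedCells[i - 1] != c) r.start[c] = i;
        if (i == n - 1 || sortedCells[i + 1] != c) r.end[c] = i;
    }
    return r;
}
```

With these ranges, the velocity update only has to walk the boids between `start` and `end` of each nearby cell.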
Binary file added images/128vs256boidsvsfpsnaive.png
Binary file added images/perf_128_std.gif
Binary file added images/perf_256_std.gif
Binary file added images/perf_analysis.png
Binary file added images/perf_analysis_20k_std.png
2 changes: 1 addition & 1 deletion src/CMakeLists.txt
@@ -10,5 +10,5 @@ set(SOURCE_FILES

cuda_add_library(src
${SOURCE_FILES}
OPTIONS -arch=sm_20
OPTIONS -arch=sm_52
)