diff --git a/README.md b/README.md index 3b51372..0231d76 100644 --- a/README.md +++ b/README.md @@ -3,271 +3,60 @@ CIS 565 Project3 : CUDA Pathtracer Fall 2014 -Due Wed, 10/8 (submit without penalty until Sun, 10/12) -## INTRODUCTION -In this project, you will implement a CUDA based pathtracer capable of -generating pathtraced rendered images extremely quickly. Building a pathtracer can be viewed as a generalization of building a raytracer, so for those of you who have taken 460/560, the basic concept should not be very new to you. For those of you that have not taken -CIS460/560, raytracing is a technique for generating images by tracing rays of -light through pixels in an image plane out into a scene and following the way -the rays of light bounce and interact with objects in the scene. More -information can be found here: -http://en.wikipedia.org/wiki/Ray_tracing_(graphics). Pathtracing is a generalization of this technique by considering more than just the contribution of direct lighting to a surface. - -Since in this class we are concerned with working in generating actual images -and less so with mundane tasks like file I/O, this project includes basecode -for loading a scene description file format, described below, and various other -things that generally make up the render "harness" that takes care of -everything up to the rendering itself. The core renderer is left for you to -implement. Finally, note that while this basecode is meant to serve as a -strong starting point for a CUDA pathtracer, you are not required to use this -basecode if you wish, and you may also change any part of the basecode -specification as you please, so long as the final rendered result is correct. - -## CONTENTS -The Project3 root directory contains the following subdirectories: - -* src/ contains the source code for the project. 
Both the Windows Visual Studio - solution and the OSX and Linux makefiles reference this folder for all - source; the base source code compiles on Linux, OSX and Windows without - modification. If you are building on OSX, be sure to uncomment lines 4 & 5 of - the CMakeLists.txt in order to make sure CMake builds against clang. -* data/scenes/ contains an example scene description file. -* renders/ contains an example render of the given example scene file. -* windows/ contains a Windows Visual Studio 2010 project and all dependencies - needed for building and running on Windows 7. If you would like to create a - Visual Studio 2012 or 2013 projects, there are static libraries that you can - use for GLFW that are in external/bin/GLFW (Visual Studio 2012 uses msvc110, - and Visual Studio 2013 uses msvc120) -* external/ contains all the header, static libraries and built binaries for - 3rd party libraries (i.e. glm, GLEW, GLFW) that we use for windowing and OpenGL - extensions - -## RUNNING THE CODE -The main function requires a scene description file (that is provided in data/scenes). -The main function reads in the scene file by an argument as such : -'scene=[sceneFileName]' - -If you are using Visual Studio, you can set this in the Debugging > Command Arguments section -in the Project properties. - -## REQUIREMENTS -In this project, you are given code for: - -* Loading, reading, and storing the TAKUAscene scene description format -* Example functions that can run on both the CPU and GPU for generating random - numbers, spherical intersection testing, and surface point sampling on cubes -* A class for handling image operations and saving images -* Working code for CUDA-GL interop - -You will need to implement the following features: +## PROJECT DESCRIPTION +In this project, I implement a CUDA based pathtracer capable of +generating pathtraced rendered images extremely quickly. 
+## FEATURES +Basic features: * Raycasting from a camera into a scene through a pixel grid * Diffuse surfaces * Perfect specular reflective surfaces -* Cube intersection testing +* Cube and sphere intersection testing * Sphere surface point sampling -* Stream compaction optimization +* Stream compaction optimization -You are also required to implement at least 2 of the following features: - -* Texture mapping -* Bump mapping +Additional features: * Depth of field -* Refraction, i.e. glass -* OBJ Mesh loading and rendering -* Interactive camera -* Motion blur -* Subsurface scattering - -The 'extra features' list is not comprehensive. If you have a particular feature -you would like to implement (e.g. acceleration structures, etc.) please contact us -first! - -For each 'extra feature' you must provide the following analysis : -* overview write up of the feature -* performance impact of the feature -* if you did something to accelerate the feature, why did you do what you did -* compare your GPU version to a CPU version of this feature (you do NOT need to - implement a CPU version) -* how can this feature be further optimized (again, not necessary to implement it, but - should give a roadmap of how to further optimize and why you believe this is the next - step) +* Fresnel Refraction and Reflection +* Supersampled antialiasing -## BASE CODE TOUR -You will be working in three files: raytraceKernel.cu, intersections.h, and -interactions.h. Within these files, areas that you need to complete are marked -with a TODO comment. Areas that are useful to and serve as hints for optional -features are marked with TODO (Optional). Functions that are useful for -reference are marked with the comment LOOK. - -* raytraceKernel.cu contains the core raytracing CUDA kernel.
You will need to - complete: - * cudaRaytraceCore() handles kernel launches and memory management; this - function already contains example code for launching kernels, - transferring geometry and cameras from the host to the device, and transferring - image buffers from the host to the device and back. You will have to complete - this function to support passing materials and lights to CUDA. - * raycastFromCameraKernel() is a function that you need to implement. This - function once correctly implemented should handle camera raycasting. - * raytraceRay() is the core raytracing CUDA kernel; all of your pathtracing - logic should be implemented in this CUDA kernel. raytraceRay() should - take in a camera, image buffer, geometry, materials, and lights, and should - trace a ray through the scene and write the resultant color to a pixel in the - image buffer. - -* intersections.h contains functions for geometry intersection testing and - point generation. You will need to complete: - * boxIntersectionTest(), which takes in a box and a ray and performs an - intersection test. This function should work in the same way as - sphereIntersectionTest(). - * getRandomPointOnSphere(), which takes in a sphere and returns a random - point on the surface of the sphere with an even probability distribution. - This function should work in the same way as getRandomPointOnCube(). You can - (although do not necessarily have to) use this to generate points on a sphere - to use a point lights, or can use this for area lighting. - -* interactions.h contains functions for ray-object interactions that define how - rays behave upon hitting materials and objects. You will need to complete: - * getRandomDirectionInSphere(), which generates a random direction in a - sphere with a uniform probability. This function works in a fashion - similar to that of calculateRandomDirectionInHemisphere(), which generates a - random cosine-weighted direction in a hemisphere. 
- * calculateBSDF(), which takes in an incoming ray, normal, material, and - other information, and returns an outgoing ray. You can either implement - this function for ray-surface interactions, or you can replace it with your own - function(s). - -You will also want to familiarize yourself with: - -* sceneStructs.h, which contains definitions for how geometry, materials, - lights, cameras, and animation frames are stored in the renderer. -* utilities.h, which serves as a kitchen-sink of useful functions - -## NOTES ON GLM -This project uses GLM, the GL Math library, for linear algebra. You need to -know two important points on how GLM is used in this project: - -* In this project, indices in GLM vectors (such as vec3, vec4), are accessed - via swizzling. So, instead of v[0], v.x is used, and instead of v[1], v.y is - used, and so on and so forth. -* GLM Matrix operations work fine on NVIDIA Fermi cards and later, but - pre-Fermi cards do not play nice with GLM matrices. As such, in this project, - GLM matrices are replaced with a custom matrix struct, called a cudaMat4, found - in cudaMat4.h. A custom function for multiplying glm::vec4s and cudaMat4s is - provided as multiplyMV() in intersections.h. - -## Scene FORMAT -This project uses a custom scene description format, called TAKUAscene. -TAKUAscene files are flat text files that describe all geometry, materials, -lights, cameras, render settings, and animation frames inside of the scene. -Items in the format are delimited by new lines, and comments can be added at -the end of each line preceded with a double-slash. - -Materials are defined in the following fashion: +## RUNNING THE CODE +This project was tested with Visual Studio 2010 and CUDA 6.5. +The main function requires a scene description file (examples are provided in data/scenes). +You can change the scene file in the command arguments section of the project properties.
-* MATERIAL (material ID) //material header -* RGB (float r) (float g) (float b) //diffuse color -* SPECX (float specx) //specular exponent -* SPECRGB (float r) (float g) (float b) //specular color -* REFL (bool refl) //reflectivity flag, 0 for - no, 1 for yes -* REFR (bool refr) //refractivity flag, 0 for - no, 1 for yes -* REFRIOR (float ior) //index of refraction - for Fresnel effects -* SCATTER (float scatter) //scatter flag, 0 for - no, 1 for yes -* ABSCOEFF (float r) (float b) (float g) //absorption - coefficient for scattering -* RSCTCOEFF (float rsctcoeff) //reduced scattering - coefficient -* EMITTANCE (float emittance) //the emittance of the - material. Anything >0 makes the material a light source. -Cameras are defined in the following fashion: +## IMPLEMENTATION +* Color Accumulation -* CAMERA //camera header -* RES (float x) (float y) //resolution -* FOVY (float fovy) //vertical field of - view half-angle. the horizonal angle is calculated from this and the - reslution -* ITERATIONS (float interations) //how many - iterations to refine the image, only relevant for supersampled antialiasing, - depth of field, area lights, and other distributed raytracing applications -* FILE (string filename) //file to output - render to upon completion -* frame (frame number) //start of a frame -* EYE (float x) (float y) (float z) //camera's position in - worldspace -* VIEW (float x) (float y) (float z) //camera's view - direction -* UP (float x) (float y) (float z) //camera's up vector +I had trouble accumulating color when testing my diffuse surfaces; here is what I got on my first diffuse render: +![alt tag](https://raw.githubusercontent.com/XJMa/Project3-Pathtracer/master/screenShoots/bug.jpg) +Later I realized that each iteration's contribution has to be averaged in, so I divide the accumulated color by the iteration count, which gives a much more reasonable result: +![alt tag](https://raw.githubusercontent.com/XJMa/Project3-Pathtracer/master/screenShoots/diffuse.jpg) +* 
Stream Compaction -Objects are defined in the following fashion: -* OBJECT (object ID) //object header -* (cube OR sphere OR mesh) //type of object, can - be either "cube", "sphere", or "mesh". Note that cubes and spheres are unit - sized and centered at the origin. -* material (material ID) //material to - assign this object -* frame (frame number) //start of a frame -* TRANS (float transx) (float transy) (float transz) //translation -* ROTAT (float rotationx) (float rotationy) (float rotationz) //rotation -* SCALE (float scalex) (float scaley) (float scalez) //scale +This path tracer is parallelized by ray. When parallelizing by pixel, some ray paths die out before the maximum ray depth is reached (either from absorption or lack of intersection), and we can get rid of these "dead" rays using stream compaction. -An example TAKUAscene file setting up two frames inside of a Cornell Box can be -found in the scenes/ directory. +This implementation uses a pool of rays; each ray carries a flag indicating whether it is alive or dead. With each new wave of raycasts, the currently active rays are pulled from the pool and cast into the scene. Depending on whether or not geometry was intersected, each ray is marked active or inactive. After each wave of raycasts, I use the Thrust scan and scatter functions to cull dead rays and produce a smaller ray pool. -For meshes, note that the base code will only read in .obj files. For more -information on the .obj specification see http://en.wikipedia.org/wiki/Wavefront_.obj_file. +* Fresnel Transparency -An example of a mesh object is as follows: +I implemented Fresnel transparency to make the glass material look more realistic. The implementation is essentially the same as in my CIS 560 ray tracer: first compute the Fresnel coefficients, then use a random number to decide, in proportion to those coefficients, whether the ray is reflected or refracted.
+![alt tag](https://raw.githubusercontent.com/XJMa/Project3-Pathtracer/master/screenShoots/fract.jpg) -OBJECT 0 -mesh tetra.obj -material 0 -frame 0 -TRANS 0 5 -5 -ROTAT 0 90 0 -SCALE .01 10 10 +* Depth of Field -Check the Google group for some sample .obj files of varying complexity. +Depth of field is implemented by selecting a focal plane at a fixed distance from the camera and jittering the camera position during each iteration. +![alt tag](https://raw.githubusercontent.com/XJMa/Project3-Pathtracer/master/screenShoots/dof.jpg) -## THIRD PARTY CODE POLICY -* Use of any third-party code must be approved by asking on our Google Group. - If it is approved, all students are welcome to use it. Generally, we approve - use of third-party code that is not a core part of the project. For example, - for the ray tracer, we would approve using a third-party library for loading - models, but would not approve copying and pasting a CUDA function for doing - refraction. -* Third-party code must be credited in README.md. -* Using third-party code without its approval, including using another - student's code, is an academic integrity violation, and will result in you - receiving an F for the semester. -## SELF-GRADING -* On the submission date, email your grade, on a scale of 0 to 100, to Harmony, - harmoli+cis565@seas.upenn.com, with a one paragraph explanation. Be concise and - realistic. Recall that we reserve 30 points as a sanity check to adjust your - grade. Your actual grade will be (0.7 * your grade) + (0.3 * our grade). We - hope to only use this in extreme cases when your grade does not realistically - reflect your work - it is either too high or too low. In most cases, we plan - to give you the exact grade you suggest. -* Projects are not weighted evenly, e.g., Project 0 doesn't count as much as - the path tracer. We will determine the weighting at the end of the semester - based on the size of each project.
+## PERFORMANCE ANALYSIS -## SUBMISSION -Please change the README to reflect the answers to the questions we have posed -above. Remember: -* this is a renderer, so include images that you've made! -* be sure to back your claims for optimization with numbers and comparisons -* if you reference any other material, please provide a link to it -* you wil not e graded on how fast your path tracer runs, but getting close to - real-time is always nice -* if you have a fast GPU renderer, it is good to show case this with a video to - show interactivity. If you do so, please include a link. +Here is a comparison of the path tracer with and without stream compaction (running time over the first 10 iterations): +![alt tag](https://raw.githubusercontent.com/XJMa/Project3-Pathtracer/master/screenShoots/p.jpg) -Be sure to open a pull request and to send Harmony your grade and why you -believe this is the grade you should get. +When the max depth is small, the non-compacted version is more efficient because the overhead of the scan and scatter operations outweighs the savings. But as the max depth grows, the advantage of stream compaction shows: discarding dead rays saves substantial computation, since the majority of rays terminate within about 10 bounces.
diff --git a/data/scenes/sampleScene.txt b/data/scenes/sampleScene.txt index 6a9f5cc..5593285 100644 --- a/data/scenes/sampleScene.txt +++ b/data/scenes/sampleScene.txt @@ -34,11 +34,11 @@ ABSCOEFF 0 0 0 RSCTCOEFF 0 EMITTANCE 0 -MATERIAL 3 //red glossy -RGB .63 .06 .04 +MATERIAL 3 //golden +RGB .53 .56 .04 SPECEX 0 SPECRGB 1 1 1 -REFL 0 +REFL 0.5 REFR 0 REFRIOR 2 SCATTER 0 @@ -50,7 +50,7 @@ MATERIAL 4 //white glossy RGB 1 1 1 SPECEX 0 SPECRGB 1 1 1 -REFL 0 +REFL 0.5 REFR 0 REFRIOR 2 SCATTER 0 @@ -59,10 +59,10 @@ RSCTCOEFF 0 EMITTANCE 0 MATERIAL 5 //glass -RGB 0 0 0 +RGB 1 1 1 SPECEX 0 SPECRGB 1 1 1 -REFL 0 +REFL 1 REFR 1 REFRIOR 2.2 SCATTER 0 @@ -158,7 +158,7 @@ SCALE .01 10 10 OBJECT 5 sphere -material 4 +material 5 frame 0 TRANS 0 2 0 ROTAT 0 180 0 @@ -166,7 +166,7 @@ SCALE 3 3 3 OBJECT 6 sphere -material 3 +material 5 frame 0 TRANS 2 5 2 ROTAT 0 180 0 @@ -174,17 +174,24 @@ SCALE 2.5 2.5 2.5 OBJECT 7 sphere -material 6 +material 5 frame 0 TRANS -2 5 -2 ROTAT 0 180 0 SCALE 3 3 3 - OBJECT 8 +sphere +material 3 +frame 0 +TRANS -2 5 -2 +ROTAT 0 180 0 +SCALE 3 3 3 + +OBJECT 9 cube material 8 frame 0 -TRANS 0 10 0 +TRANS 0 10 -4 ROTAT 0 0 90 SCALE .3 3 3 \ No newline at end of file diff --git a/data/scenes/sampleScene2.txt b/data/scenes/sampleScene2.txt new file mode 100644 index 0000000..8555d10 --- /dev/null +++ b/data/scenes/sampleScene2.txt @@ -0,0 +1,190 @@ +MATERIAL 0 //white diffuse +RGB 1 1 1 +SPECEX 0 +SPECRGB 1 1 1 +REFL 0 +REFR 0 +REFRIOR 0 +SCATTER 0 +ABSCOEFF 0 0 0 +RSCTCOEFF 0 +EMITTANCE 0 + +MATERIAL 1 //red diffuse +RGB .63 .06 .04 +SPECEX 0 +SPECRGB 1 1 1 +REFL 0 +REFR 0 +REFRIOR 0 +SCATTER 0 +ABSCOEFF 0 0 0 +RSCTCOEFF 0 +EMITTANCE 0 + +MATERIAL 2 //green diffuse +RGB .15 .48 .09 +SPECEX 0 +SPECRGB 1 1 1 +REFL 0 +REFR 0 +REFRIOR 0 +SCATTER 0 +ABSCOEFF 0 0 0 +RSCTCOEFF 0 +EMITTANCE 0 + +MATERIAL 3 //red glossy +RGB .63 .06 .04 +SPECEX 0 +SPECRGB 1 1 1 +REFL 0.5 +REFR 0 +REFRIOR 2 +SCATTER 0 +ABSCOEFF 0 0 0 +RSCTCOEFF 0 +EMITTANCE 0 + +MATERIAL 4 //white 
glossy +RGB 1 1 1 +SPECEX 0 +SPECRGB 1 1 1 +REFL 0.5 +REFR 0 +REFRIOR 2 +SCATTER 0 +ABSCOEFF 0 0 0 +RSCTCOEFF 0 +EMITTANCE 0 + +MATERIAL 5 //glass +RGB 1 1 1 +SPECEX 0 +SPECRGB 1 1 1 +REFL 1 +REFR 1 +REFRIOR 2.2 +SCATTER 0 +ABSCOEFF .02 5.1 5.7 +RSCTCOEFF 13 +EMITTANCE 0 + +MATERIAL 6 //green glossy +RGB .15 .48 .09 +SPECEX 0 +SPECRGB 1 1 1 +REFL 0 +REFR 0 +REFRIOR 2.6 +SCATTER 0 +ABSCOEFF 0 0 0 +RSCTCOEFF 0 +EMITTANCE 0 + +MATERIAL 7 //light +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +SCATTER 0 +ABSCOEFF 0 0 0 +RSCTCOEFF 0 +EMITTANCE 1 + +MATERIAL 8 //light +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +SCATTER 0 +ABSCOEFF 0 0 0 +RSCTCOEFF 0 +EMITTANCE 15 + +CAMERA +RES 800 800 +FOVY 25 +ITERATIONS 5000 +FILE test.bmp +frame 0 +EYE 0 4.5 12 +VIEW 0 0 -1 +UP 0 1 0 + +OBJECT 0 +cube +material 0 +frame 0 +TRANS 0 0 0 +ROTAT 0 0 90 +SCALE .01 10 10 + +OBJECT 1 +cube +material 0 +frame 0 +TRANS 0 5 -5 +ROTAT 0 90 0 +SCALE .01 10 10 + +OBJECT 2 +cube +material 0 +frame 0 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .01 10 10 + +OBJECT 3 +cube +material 1 +frame 0 +TRANS -5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +OBJECT 4 +cube +material 2 +frame 0 +TRANS 5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +OBJECT 5 +sphere +material 1 +frame 0 +TRANS 0 2 0 +ROTAT 0 180 0 +SCALE 3 3 3 + +OBJECT 6 +sphere +material 0 +frame 0 +TRANS 2 5 2 +ROTAT 0 180 0 +SCALE 2.5 2.5 2.5 + +OBJECT 7 +sphere +material 2 +frame 0 +TRANS -2 5 -2 +ROTAT 0 180 0 +SCALE 3 3 3 + + +OBJECT 8 +cube +material 8 +frame 0 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .3 3 3 \ No newline at end of file diff --git a/performance.docx b/performance.docx new file mode 100644 index 0000000..e69de29 diff --git a/performance.pdf b/performance.pdf new file mode 100644 index 0000000..49a011f Binary files /dev/null and b/performance.pdf differ diff --git a/screenShoots/bug.jpg b/screenShoots/bug.jpg new file mode 100644 index 0000000..0e02867 Binary files /dev/null and b/screenShoots/bug.jpg differ diff --git 
a/screenShoots/diffuse.jpg b/screenShoots/diffuse.jpg new file mode 100644 index 0000000..c6f9c3d Binary files /dev/null and b/screenShoots/diffuse.jpg differ diff --git a/screenShoots/dof.jpg b/screenShoots/dof.jpg new file mode 100644 index 0000000..57af99d Binary files /dev/null and b/screenShoots/dof.jpg differ diff --git a/screenShoots/fract.jpg b/screenShoots/fract.jpg new file mode 100644 index 0000000..4cc58b2 Binary files /dev/null and b/screenShoots/fract.jpg differ diff --git a/screenShoots/p.jpg b/screenShoots/p.jpg new file mode 100644 index 0000000..e84eb43 Binary files /dev/null and b/screenShoots/p.jpg differ diff --git a/src/intersections.h b/src/intersections.h index c9eafb6..cd25853 100644 --- a/src/intersections.h +++ b/src/intersections.h @@ -13,6 +13,7 @@ #include "cudaMat4.h" #include "utilities.h" +#define THRESHOLD 0.0001 // Some forward declarations __host__ __device__ glm::vec3 getPointOnRay(ray r, float t); __host__ __device__ glm::vec3 multiplyMV(cudaMat4 m, glm::vec4 v); @@ -20,6 +21,8 @@ __host__ __device__ glm::vec3 getSignOfRay(ray r); __host__ __device__ glm::vec3 getInverseDirectionOfRay(ray r); __host__ __device__ float boxIntersectionTest(staticGeom sphere, ray r, glm::vec3& intersectionPoint, glm::vec3& normal); __host__ __device__ float sphereIntersectionTest(staticGeom sphere, ray r, glm::vec3& intersectionPoint, glm::vec3& normal); +__host__ __device__ void IntersectionTest(staticGeom *geoms, ray r, glm::vec3& intersectionPoint, glm::vec3& normal, int &matId, int numberOfGeoms); +__host__ __device__ float geomIntersectionTest(staticGeom geom, ray r, glm::vec3& intersectionPoint, glm::vec3& normal); __host__ __device__ glm::vec3 getRandomPointOnCube(staticGeom cube, float randomSeed); // Handy dandy little hashing function that provides seeds for random number generation @@ -69,51 +72,172 @@ __host__ __device__ glm::vec3 getSignOfRay(ray r){ return glm::vec3((int)(inv_direction.x < 0), (int)(inv_direction.y < 0), 
(int)(inv_direction.z < 0)); } +__host__ __device__ void IntersectionTest(staticGeom *geoms, ray r, glm::vec3& intersectionPoint, glm::vec3& normal, int &matId, int numberOfGeoms) { + float minDist = FLT_MAX; + glm::vec3 intersection, minNormal; + float dist; + + for (int i = 0; i < numberOfGeoms; i++) { + dist = geomIntersectionTest(geoms[i], r, intersection, minNormal); + if (dist > THRESHOLD && dist < minDist) { + minDist = dist; // keep the nearest hit + matId = geoms[i].materialid; + intersectionPoint = intersection; + normal = minNormal; + } + } + +} +__host__ __device__ float geomIntersectionTest(staticGeom geom, ray r, glm::vec3& intersectionPoint, glm::vec3& normal) { + switch (geom.type) { + case GEOMTYPE::SPHERE: + return sphereIntersectionTest(geom, r, intersectionPoint, normal); + case GEOMTYPE::CUBE: + return boxIntersectionTest(geom, r, intersectionPoint, normal); + case GEOMTYPE::MESH: + // TODO: optimize this, maybe remove meshes from the geometry list + return -1; // since mesh intersection test is done by triangle intersection tests + default: + return -1; + } +} // TODO: IMPLEMENT THIS FUNCTION // Cube intersection test, return -1 if no intersection, otherwise, distance to intersection __host__ __device__ float boxIntersectionTest(staticGeom box, ray r, glm::vec3& intersectionPoint, glm::vec3& normal){ + glm::vec3 ro = multiplyMV(box.inverseTransform, glm::vec4(r.origin,1.0f)); + glm::vec3 rd = glm::normalize(multiplyMV(box.inverseTransform, glm::vec4(r.direction,0.0f))); // why not transpose(inverse)?
+ + // ray box intersection + bool intersect = true; + float tnear = -FLT_MAX; + float tfar = FLT_MAX; + float t1, t2; + // left and right faces + if (abs(rd[0]) > THRESHOLD) { + t1 = (-0.5 - ro[0]) / rd[0]; + t2 = (0.5 - ro[0]) / rd[0]; + if (t1 > t2) { + float temp = t1; + t1 = t2; + t2 = temp; + } + if (tnear < t1) { + tnear = t1; + } + if (tfar > t2) { + tfar = t2; + } + } + else if (ro[0] <= -0.5 || ro[0] >= 0.5) { + intersect = false; + } + // top and bottom faces + if (abs(rd[1]) > THRESHOLD) { + t1 = (-0.5 - ro[1]) / rd[1]; + t2 = (0.5 - ro[1]) / rd[1]; + if (t1 > t2) { + float temp = t1; + t1 = t2; + t2 = temp; + } + if (tnear < t1) { + tnear = t1; + } + if (tfar > t2) { + tfar = t2; + } + } + else if (ro[1] <= -0.5 || ro[1] >= 0.5) { + intersect = false; + } + // front and rear faces + if (abs(rd[2]) > THRESHOLD) { + t1 = (-0.5 - ro[2]) / rd[2]; + t2 = (0.5 - ro[2]) / rd[2]; + if (t1 > t2) { + float temp = t1; + t1 = t2; + t2 = temp; + } + if (tnear < t1) { + tnear = t1; + } + if (tfar > t2) { + tfar = t2; + } + } + else if (ro[2] <= -0.5 || ro[2] >= 0.5) { + intersect = false; + } + if (!intersect || tnear > tfar + THRESHOLD || tnear < THRESHOLD) { + return -1; + } + else { + glm::vec3 p = ro + rd * tnear; + glm::vec3 n; + if (abs(p[0]+0.5) < THRESHOLD) { + n = glm::vec3(-1, 0, 0); + } + else if (abs(p[0]-0.5) < THRESHOLD) { + n = glm::vec3(1, 0, 0); + } + else if (abs(p[1]+0.5) < THRESHOLD) { + n = glm::vec3(0, -1, 0); + } + else if (abs(p[1]-0.5) < THRESHOLD) { + n = glm::vec3(0, 1, 0); + } + else if (abs(p[2]+0.5) < THRESHOLD) { + n = glm::vec3(0, 0, -1); + } + else { + n = glm::vec3(0, 0, 1); + } + + intersectionPoint = multiplyMV(box.transform, glm::vec4(p, 1.0f)); + normal = glm::normalize(multiplyMV(box.transform, glm::vec4(n, 0.0f))); - return -1; + return glm::length(r.origin - intersectionPoint); + } } // LOOK: Here's an intersection test example from a sphere. Now you just need to figure out cube and, optionally, triangle. 
// Sphere intersection test, return -1 if no intersection, otherwise, distance to intersection __host__ __device__ float sphereIntersectionTest(staticGeom sphere, ray r, glm::vec3& intersectionPoint, glm::vec3& normal){ - float radius = .5; + float radius = .5; - glm::vec3 ro = multiplyMV(sphere.inverseTransform, glm::vec4(r.origin,1.0f)); - glm::vec3 rd = glm::normalize(multiplyMV(sphere.inverseTransform, glm::vec4(r.direction,0.0f))); + glm::vec3 ro = multiplyMV(sphere.inverseTransform, glm::vec4(r.origin,1.0f)); + glm::vec3 rd = glm::normalize(multiplyMV(sphere.inverseTransform, glm::vec4(r.direction,0.0f))); - ray rt; rt.origin = ro; rt.direction = rd; + ray rt; rt.origin = ro; rt.direction = rd; - float vDotDirection = glm::dot(rt.origin, rt.direction); - float radicand = vDotDirection * vDotDirection - (glm::dot(rt.origin, rt.origin) - pow(radius, 2)); - if (radicand < 0){ - return -1; - } + float vDotDirection = glm::dot(rt.origin, rt.direction); + float radicand = vDotDirection * vDotDirection - (glm::dot(rt.origin, rt.origin) - pow(radius, 2)); + if (radicand < 0){ + return -1; + } - float squareRoot = sqrt(radicand); - float firstTerm = -vDotDirection; - float t1 = firstTerm + squareRoot; - float t2 = firstTerm - squareRoot; + float squareRoot = sqrt(radicand); + float firstTerm = -vDotDirection; + float t1 = firstTerm + squareRoot; + float t2 = firstTerm - squareRoot; - float t = 0; - if (t1 < 0 && t2 < 0) { - return -1; - } else if (t1 > 0 && t2 > 0) { - t = min(t1, t2); - } else { - t = max(t1, t2); - } - - glm::vec3 realIntersectionPoint = multiplyMV(sphere.transform, glm::vec4(getPointOnRay(rt, t), 1.0)); - glm::vec3 realOrigin = multiplyMV(sphere.transform, glm::vec4(0,0,0,1)); - - intersectionPoint = realIntersectionPoint; - normal = glm::normalize(realIntersectionPoint - realOrigin); + float t = 0; + if (t1 < 0 && t2 < 0) { + return -1; + } else if (t1 > 0 && t2 > 0) { + t = t1 > t2 ? t2: t1; + } else { + t = t1 > t2 ? 
t1: t2; + } + + glm::vec3 realIntersectionPoint = multiplyMV(sphere.transform, glm::vec4(getPointOnRay(rt, t), 1.0)); + glm::vec3 realOrigin = multiplyMV(sphere.transform, glm::vec4(0,0,0,1)); + + intersectionPoint = realIntersectionPoint; + normal = glm::normalize(realIntersectionPoint - realOrigin); - return glm::length(r.origin - realIntersectionPoint); + return glm::length(r.origin - realIntersectionPoint); } // Returns x,y,z half-dimensions of tightest bounding box @@ -178,9 +302,7 @@ __host__ __device__ glm::vec3 getRandomPointOnCube(staticGeom cube, float random // Generates a random point on a given sphere __host__ __device__ glm::vec3 getRandomPointOnSphere(staticGeom sphere, float randomSeed){ - return glm::vec3(0,0,0); + return glm::vec3(0,0,0); } #endif - - diff --git a/src/main.cpp b/src/main.cpp index 7a1098f..3559c4c 100644 --- a/src/main.cpp +++ b/src/main.cpp @@ -4,7 +4,7 @@ // Rob Farber for CUDA-GL interop, from CUDA Supercomputing For The Masses: http://www.drdobbs.com/architecture-and-design/cuda-supercomputing-for-the-masses-part/222600097 // Varun Sampath and Patrick Cozzi for GLSL Loading, from CIS565 Spring 2012 HW5 at the University of Pennsylvania: http://cis565-spring-2012.github.com/ // Yining Karl Li's TAKUA Render, a massively parallel pathtracing renderer: http://www.yiningkarlli.com - +#include <ctime> #include "main.h" #define GLEW_STATIC @@ -12,6 +12,8 @@ //-------------MAIN-------------- //------------------------------- +std::clock_t beginc; +std::clock_t endc; int main(int argc, char** argv){ #ifdef __APPLE__ // Needed in OSX to force use of OpenGL3.2 @@ -98,6 +100,14 @@ void runCuda(){ // No data is moved (Win & Linux).
When mapped to CUDA, OpenGL should not use this buffer if(iterations < renderCam->iterations){ + if(iterations == 1){ + beginc = clock(); + } + if(iterations == 10){ + endc = clock(); + double time = (endc - beginc)/(CLOCKS_PER_SEC / 1000.0); + printf(" %.4f ms \n", time); + } uchar4 *dptr=NULL; iterations++; cudaGLMapBufferObject((void**)&dptr, pbo); diff --git a/src/raytraceKernel.cu b/src/raytraceKernel.cu index 9c7bc7d..625273b 100644 --- a/src/raytraceKernel.cu +++ b/src/raytraceKernel.cu @@ -8,6 +8,9 @@ #include #include #include +#include +#include +#include #include "sceneStructs.h" #include "glm/glm.hpp" @@ -16,6 +19,10 @@ #include "intersections.h" #include "interactions.h" +#define ANTI_ALIAS 1 +#define MAX_DEPTH 3 + +#define DOFLENGTH 12 void checkCUDAError(const char *msg) { cudaError_t err = cudaGetLastError(); if( cudaSuccess != err) { @@ -23,7 +30,13 @@ void checkCUDAError(const char *msg) { exit(EXIT_FAILURE); } } - +struct isDead{ + __host__ __device__ + bool operator()(const ray r) + { + return r.active == false; + } +}; // LOOK: This function demonstrates how to use thrust for random number generation on the GPU! // Function that generates static. 
__host__ __device__ glm::vec3 generateRandomNumberFromThread(glm::vec2 resolution, float time, int x, int y){ @@ -38,10 +51,36 @@ __host__ __device__ glm::vec3 generateRandomNumberFromThread(glm::vec2 resolutio // TODO: IMPLEMENT THIS FUNCTION // Function that does the initial raycast from the camera __host__ __device__ ray raycastFromCameraKernel(glm::vec2 resolution, float time, int x, int y, glm::vec3 eye, glm::vec3 view, glm::vec3 up, glm::vec2 fov){ - ray r; - r.origin = glm::vec3(0,0,0); - r.direction = glm::vec3(0,0,-1); + ray r; + r.origin = eye; + r.active = true; + + int index = y * resolution.x + x; + float phi = glm::radians(fov.y); + float theta = glm::radians(fov.x); + glm::vec3 A = glm::normalize(glm::cross(view, up)); + glm::vec3 B = glm::normalize(glm::cross(A, view)); + glm::vec3 M = eye + view; + glm::vec3 V = B * glm::length(view) * tan(phi); + glm::vec3 H = A * glm::length(view) * tan(theta); + + // super sampling for anti-aliasing + thrust::default_random_engine rng(hash(time*index)); + thrust::uniform_real_distribution<float> u01(0, 1); + float fx = x + (float)u01(rng); + float fy = y + (float)u01(rng); + + glm::vec3 P = M + (2*fx/(resolution.x-1)-1) * H + (2*(1-fy/(resolution.y-1))-1) * V; + r.direction = glm::normalize(P-eye); + //depth of field + /*thrust::uniform_real_distribution<float> u02(-0.3,0.3); + glm::vec3 aimPoint = r.origin + (float)DOFLENGTH * r.direction; + r.origin += glm::vec3(u02(rng),u02(rng),u02(rng)); + r.direction = aimPoint - r.origin; + r.direction = glm::normalize(r.direction);*/ + return r; + } //Kernel that blacks out a given image buffer @@ -55,7 +94,7 @@ __global__ void clearImage(glm::vec2 resolution, glm::vec3* image){ } //Kernel that writes the image to the OpenGL PBO directly.
-__global__ void sendImageToPBO(uchar4* PBOpos, glm::vec2 resolution, glm::vec3* image){
+__global__ void sendImageToPBO(uchar4* PBOpos, glm::vec2 resolution, glm::vec3* image, float i){

   int x = (blockIdx.x * blockDim.x) + threadIdx.x;
   int y = (blockIdx.y * blockDim.y) + threadIdx.y;
@@ -64,9 +103,9 @@ __global__ void sendImageToPBO(uchar4* PBOpos, glm::vec2 resolution, glm::vec3*
   if(x<=resolution.x && y<=resolution.y){
     glm::vec3 color;
-    color.x = image[index].x*255.0;
-    color.y = image[index].y*255.0;
-    color.z = image[index].z*255.0;
+    color.x = image[index].x*255.0/i;
+    color.y = image[index].y*255.0/i;
+    color.z = image[index].z*255.0/i;

     if(color.x>255){
       color.x = 255;
@@ -88,18 +127,147 @@ __global__ void sendImageToPBO(uchar4* PBOpos, glm::vec2 resolution, glm::vec3*
   }
 }

+__global__ void generateRay(cameraData cam, float time, ray* raypool) {
+  int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+  int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+  int index = x + (y * cam.resolution.x);
+
+  if((x<=cam.resolution.x && y<=cam.resolution.y)){
+    raypool[index] = raycastFromCameraKernel(cam.resolution, time, x, y, cam.position, cam.view, cam.up, cam.fov);
+    raypool[index].index = index;
+    raypool[index].color = glm::vec3(1, 1, 1);
+  }
+}
+
+__host__ __device__ glm::vec3 getReflectedRay(glm::vec3 d, glm::vec3 n) {
+  glm::vec3 VR; // reflected ray direction
+  if (glm::length(-d - n) < EPSILON) {
+    VR = n;
+  }
+  else if (abs(glm::dot(-d, n)) < EPSILON) {
+    VR = d;
+  }
+  else {
+    VR = glm::normalize(d - 2.0f * glm::dot(d, n) * n);
+  }
+  return VR;
+}
+
+// Get the refracted ray direction from ray direction, normal and index of refraction (IOR)
+__host__ __device__ glm::vec3 getRefractedRay(glm::vec3 d, glm::vec3 n, float IOR) {
+  glm::vec3 VT; // refracted ray direction
+  float t = 1 / IOR;
+  float base = 1 - t * t * (1 - pow(glm::dot(n, d), 2));
+  if (base < 0) {
+    VT = glm::vec3(0, 0, 0); // total internal reflection
+  }
+  else {
+    VT = (-t * glm::dot(n, d) - sqrt(base)) * n + t * d; // refracted ray
+    VT = glm::normalize(VT);
+  }
+  return VT;
+}
+
+__host__ __device__ bool notDiffuseRay(float randomSeed, float hasReflect) {
+  // determine if ray is reflected according to the proportion
+  thrust::default_random_engine rng(hash(randomSeed));
+  thrust::uniform_real_distribution<float> u01(0,1);
+  if (u01(rng) > hasReflect) {
+    return true;
+  }
+  else {
+    return false;
+  }
+}
+
+// Determine if the randomly generated ray is a refracted ray or a reflected ray
+__host__ __device__ bool isRefractedRay(float randomSeed, float IOR, glm::vec3 d, glm::vec3 n, glm::vec3 t) {
+  float rpar  = (IOR * glm::dot(n, d) - glm::dot(n, t)) / (IOR * glm::dot(n, d) + glm::dot(n, t));
+  float rperp = (glm::dot(n, d) - IOR * glm::dot(n, t)) / (glm::dot(n, d) + IOR * glm::dot(n, t));
+
+  // compute proportion of the light reflected
+  float fr = 0.5 * (rpar * rpar + rperp * rperp);
+
+  // determine if ray is reflected according to the proportion
+  thrust::default_random_engine rng(hash(randomSeed));
+  thrust::uniform_real_distribution<float> u01(0,1);
+  if (u01(rng) <= fr) {
+    return false;
+  }
+  else {
+    return true;
+  }
+}
+
 // TODO: IMPLEMENT THIS FUNCTION
 // Core raytracer kernel
 __global__ void raytraceRay(glm::vec2 resolution, float time, cameraData cam, int rayDepth, glm::vec3* colors,
-                            staticGeom* geoms, int numberOfGeoms){
+                            staticGeom* geoms, int numberOfGeoms, material *materials, ray* rays, int numberOfRays){

-  int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+  /*int x = (blockIdx.x * blockDim.x) + threadIdx.x;
   int y = (blockIdx.y * blockDim.y) + threadIdx.y;
-  int index = x + (y * resolution.x);
+  int index = x + (y * resolution.x);*/
+  int rayIdx = (blockIdx.x * blockDim.x) + threadIdx.x;
+  if(rayIdx < numberOfRays){
+    int pixelIdx = rays[rayIdx].index;
+    float seed = time * rayIdx * (rayDepth+1);
+    if(rays[rayIdx].active){
+      int matIdx = -1;
+      glm::vec3 interPoint, normal;
+      IntersectionTest(geoms, rays[rayIdx], interPoint, normal, matIdx, numberOfGeoms);

-  if((x<=resolution.x && y<=resolution.y)){
+      if(matIdx != -1){
+        material mat1 = materials[matIdx];
+        if(mat1.emittance > EPSILON){
+          glm::vec3 color = rays[rayIdx].color * mat1.color * mat1.emittance;
+          colors[pixelIdx] += color;
+          rays[rayIdx].active = false;
+          //colors[pixelIdx] = normal;
+        }
+        else{
+
+          if(mat1.hasReflective > EPSILON || mat1.hasRefractive > EPSILON){
+            //if (notDiffuseRay(seed, mat1.hasReflective)) {
+            float IOR = mat1.indexOfRefraction; // Index of Refraction
+            if (glm::dot(rays[rayIdx].direction, normal) > 0) { // reverse normal and index of refraction if ray inside the object
+              normal *= -1;
+              IOR = 1/(IOR + EPSILON);
+            }
+            if (mat1.hasRefractive > EPSILON) { // if the surface has refraction
+              glm::vec3 dir = getRefractedRay(rays[rayIdx].direction, normal, IOR);
+              if (glm::length(dir) > EPSILON && (mat1.hasReflective < EPSILON || isRefractedRay(seed, IOR, rays[rayIdx].direction, normal, dir))) {
+                rays[rayIdx].direction = dir;
+                rays[rayIdx].origin = interPoint + dir * (float)EPSILON;
+                rays[rayIdx].color *= mat1.color;
+                return;
+              }
+            }
+            // if the surface only has reflection
+            glm::vec3 dir2 = getReflectedRay(rays[rayIdx].direction, normal);
+            rays[rayIdx].origin = interPoint + dir2 * (float)EPSILON;
+            rays[rayIdx].direction = dir2;
+            rays[rayIdx].color *= mat1.color;
+            return;
+            //}
+          }
+          if (glm::dot(rays[rayIdx].direction, normal) > 0) { // reverse normal if we are inside the object
+            normal *= -1;
+          }
+          // diffuse
+          thrust::default_random_engine rng(hash(seed));
+          thrust::uniform_real_distribution<float> u01(0, 1);

-    colors[index] = generateRandomNumberFromThread(resolution, time, x, y);
+          rays[rayIdx].direction = glm::normalize(calculateRandomDirectionInHemisphere(normal, (float)u01(rng), (float)u01(rng)));
+          rays[rayIdx].origin = interPoint + rays[rayIdx].direction * (float)EPSILON;
+          rays[rayIdx].color = rays[rayIdx].color * mat1.color;
+
+        }
+      }
+      else{
+        //rays[rayIdx].color = glm::vec3(0,0,0);
+        rays[rayIdx].active = false;
+      }
+    }
+  }
 }

@@ -107,59 +275,81 @@
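As an aside on the hunk above: `isRefractedRay` uses the Fresnel equations for unpolarized light, averaging the parallel and perpendicular amplitude coefficients into a reflectance `fr` and then rolling a uniform random number against it to pick reflection versus refraction. The reflectance computation in isolation looks like this (`fresnelReflectance` is a hypothetical host-side name; `cosI` stands in for `dot(n, d)` and `cosT` for `dot(n, t)`):

```cpp
#include <cassert>
#include <cmath>

// Fresnel reflectance for unpolarized light, as in isRefractedRay:
// rpar/rperp are the parallel and perpendicular amplitude
// coefficients; ior is the relative index of refraction.
float fresnelReflectance(float ior, float cosI, float cosT) {
    float rpar  = (ior * cosI - cosT) / (ior * cosI + cosT);
    float rperp = (cosI - ior * cosT) / (cosI + ior * cosT);
    return 0.5f * (rpar * rpar + rperp * rperp); // average of the two polarizations
}
```

At normal incidence (`cosI = cosT = 1`) with `ior = 1.5` this yields the familiar ~4% reflectance of glass, which is why most rays through a glass sphere refract and only a few mirror-reflect.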
 // Wrapper for the __global__ call that sets up the kernel calls and does a ton of memory management
 void cudaRaytraceCore(uchar4* PBOpos, camera* renderCam, int frame, int iterations, material* materials, int numberOfMaterials, geom* geoms, int numberOfGeoms){

-  int traceDepth = 1; //determines how many bounces the raytracer traces
+
+  int traceDepth; //determines how many bounces the raytracer traces

-  // set up crucial magic
-  int tileSize = 8;
-  dim3 threadsPerBlock(tileSize, tileSize);
-  dim3 fullBlocksPerGrid((int)ceil(float(renderCam->resolution.x)/float(tileSize)), (int)ceil(float(renderCam->resolution.y)/float(tileSize)));
-
-  // send image to GPU
-  glm::vec3* cudaimage = NULL;
-  cudaMalloc((void**)&cudaimage, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3));
-  cudaMemcpy( cudaimage, renderCam->image, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3), cudaMemcpyHostToDevice);
+  // set up crucial magic
+  int tileSize = 8;
+  dim3 threadsPerBlock(tileSize, tileSize);
+  dim3 fullBlocksPerGrid((int)ceil(float(renderCam->resolution.x)/float(tileSize)), (int)ceil(float(renderCam->resolution.y)/float(tileSize)));

-  // package geometry and materials and sent to GPU
-  staticGeom* geomList = new staticGeom[numberOfGeoms];
-  for(int i=0; i<numberOfGeoms; i++){
-    staticGeom newStaticGeom;
-    newStaticGeom.type = geoms[i].type;
-    newStaticGeom.materialid = geoms[i].materialid;
-    newStaticGeom.translation = geoms[i].translations[frame];
-    newStaticGeom.rotation = geoms[i].rotations[frame];
-    newStaticGeom.scale = geoms[i].scales[frame];
-    newStaticGeom.transform = geoms[i].transforms[frame];
-    newStaticGeom.inverseTransform = geoms[i].inverseTransforms[frame];
-    geomList[i] = newStaticGeom;
-  }
-
-  staticGeom* cudageoms = NULL;
-  cudaMalloc((void**)&cudageoms, numberOfGeoms*sizeof(staticGeom));
-  cudaMemcpy( cudageoms, geomList, numberOfGeoms*sizeof(staticGeom), cudaMemcpyHostToDevice);
+  int numberOfRays = (int)renderCam->resolution.x*(int)renderCam->resolution.y;
+  ray *raypool1;
+  cudaMalloc((void**)&raypool1, numberOfRays * sizeof(ray));
+
+  // send image to GPU
+  glm::vec3* cudaimage = NULL;
+  cudaMalloc((void**)&cudaimage, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3));
+  cudaMemcpy( cudaimage, renderCam->image, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3), cudaMemcpyHostToDevice);
+
+  // package geometry and materials and sent to GPU
+  staticGeom* geomList = new staticGeom[numberOfGeoms];
+
+  for(int i=0; i<numberOfGeoms; i++){
+    staticGeom newStaticGeom;
+    newStaticGeom.type = geoms[i].type;
+    newStaticGeom.materialid = geoms[i].materialid;
+    newStaticGeom.translation = geoms[i].translations[frame];
+    newStaticGeom.rotation = geoms[i].rotations[frame];
+    newStaticGeom.scale = geoms[i].scales[frame];
+    newStaticGeom.transform = geoms[i].transforms[frame];
+    newStaticGeom.inverseTransform = geoms[i].inverseTransforms[frame];
+    geomList[i] = newStaticGeom;
+  }
+
+  staticGeom* cudageoms = NULL;
+  cudaMalloc((void**)&cudageoms, numberOfGeoms*sizeof(staticGeom));
+  cudaMemcpy( cudageoms, geomList, numberOfGeoms*sizeof(staticGeom), cudaMemcpyHostToDevice);

-  // package camera
-  cameraData cam;
-  cam.resolution = renderCam->resolution;
-  cam.position = renderCam->positions[frame];
-  cam.view = renderCam->views[frame];
-  cam.up = renderCam->ups[frame];
-  cam.fov = renderCam->fov;
-
-  // kernel launches
-  raytraceRay<<<fullBlocksPerGrid, threadsPerBlock>>>(renderCam->resolution, (float)iterations, cam, traceDepth, cudaimage, cudageoms, numberOfGeoms);
+  // package camera
+  cameraData cam;
+  cam.resolution = renderCam->resolution;
+  cam.position = renderCam->positions[frame];
+  cam.view = renderCam->views[frame];
+  cam.up = renderCam->ups[frame];
+  cam.fov = renderCam->fov;
+
+  //cache material
+  material* cudamtls;
+  cudaMalloc((void**)&cudamtls, numberOfMaterials*sizeof(material));
+  cudaMemcpy(cudamtls, materials, numberOfMaterials*sizeof(material), cudaMemcpyHostToDevice);

-  sendImageToPBO<<<fullBlocksPerGrid, threadsPerBlock>>>(PBOpos, renderCam->resolution, cudaimage);
+  generateRay<<<fullBlocksPerGrid, threadsPerBlock>>>(cam, (float)iterations, raypool1);
+
+  // kernel launches
+  int threadPerBlock = 128; //TODO tweak
+  int blockPerGrid = (int)ceil((float)numberOfRays/threadPerBlock);
+  for(traceDepth = 0; traceDepth < MAX_DEPTH; traceDepth++){
+    raytraceRay<<<blockPerGrid, threadPerBlock>>>(renderCam->resolution, (float)iterations, cam, traceDepth, cudaimage, cudageoms, numberOfGeoms, cudamtls, raypool1, numberOfRays);
+    cudaDeviceSynchronize();
+    /*thrust::device_ptr<ray> rayPoolStart = thrust::device_pointer_cast(raypool1);
+    thrust::device_ptr<ray> rayPoolEnd = thrust::remove_if(rayPoolStart, rayPoolStart+numberOfRays, isDead());
+    numberOfRays = (int)( rayPoolEnd - rayPoolStart );*/
+    if(numberOfRays <= 0) break;
+  }
+  sendImageToPBO<<<fullBlocksPerGrid, threadsPerBlock>>>(PBOpos, renderCam->resolution, cudaimage, (float)iterations);

-  // retrieve image from GPU
-  cudaMemcpy( renderCam->image, cudaimage, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3), cudaMemcpyDeviceToHost);
+  // retrieve image from GPU
+  cudaMemcpy( renderCam->image, cudaimage, (int)renderCam->resolution.x*(int)renderCam->resolution.y*sizeof(glm::vec3), cudaMemcpyDeviceToHost);

-  // free up stuff, or else we'll leak memory like a madman
-  cudaFree( cudaimage );
-  cudaFree( cudageoms );
-  delete geomList;
+  // free up stuff
+  cudaFree( cudaimage );
+  cudaFree( cudageoms );
+  cudaFree( raypool1 );
+  cudaFree( cudamtls );
+  delete[] geomList;

-  // make certain the kernel has completed
-  cudaThreadSynchronize();
+
+  cudaThreadSynchronize();

-  checkCUDAError("Kernel failed!");
+  checkCUDAError("Kernel failed!");
 }
diff --git a/src/sceneStructs.h b/src/sceneStructs.h
index 5e0c853..d865c4d 100644
--- a/src/sceneStructs.h
+++ b/src/sceneStructs.h
@@ -16,6 +16,9 @@ enum GEOMTYPE{ SPHERE, CUBE, MESH };
 struct ray {
   glm::vec3 origin;
   glm::vec3 direction;
+  int index;
+  glm::vec3 color;
+  bool active;
 };

 struct geom {
diff --git a/windows/Project3-Pathtracer/Project3-Pathtracer/Project3-Pathtracer.vcxproj b/windows/Project3-Pathtracer/Project3-Pathtracer/Project3-Pathtracer.vcxproj
index c45dd79..f05f54d 100644
--- a/windows/Project3-Pathtracer/Project3-Pathtracer/Project3-Pathtracer.vcxproj
+++ b/windows/Project3-Pathtracer/Project3-Pathtracer/Project3-Pathtracer.vcxproj
@@ -28,7 +28,7 @@
-
+
@@ -95,6 +95,6 @@
-
+
\ No newline at end of file
diff --git a/windows/Project3-Pathtracer/Project3-Pathtracer/test.0.bmp b/windows/Project3-Pathtracer/Project3-Pathtracer/test.0.bmp
new file mode 100644
index 0000000..0244094
Binary files /dev/null and b/windows/Project3-Pathtracer/Project3-Pathtracer/test.0.bmp differ
diff --git a/~$rformance.docx b/~$rformance.docx
new file mode 100644
index 0000000..6e773ce
Binary files /dev/null and b/~$rformance.docx differ
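The commented-out `thrust::remove_if` call in `cudaRaytraceCore` is the stream-compaction step: after each bounce, rays marked inactive (absorbed or escaped) are partitioned out by the `isDead` predicate and `numberOfRays` shrinks, so later bounce kernels launch fewer threads. The same predicate logic can be sketched on the host with `std::remove_if` (the `Ray` struct below is a minimal stand-in with only the fields the predicate needs, not the project's real `ray` struct):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Minimal stand-in for the project's ray struct; the real one lives
// in sceneStructs.h and carries origin/direction/color as well.
struct Ray {
    int index;
    bool active;
};

// Same shape as the isDead functor passed to thrust::remove_if.
struct IsDead {
    bool operator()(const Ray& r) const { return !r.active; }
};

// Host-side analogue of the compaction step: move dead rays past the
// end and return the new live count, as the device code recomputes
// numberOfRays from (rayPoolEnd - rayPoolStart).
int compactRays(std::vector<Ray>& pool) {
    auto newEnd = std::remove_if(pool.begin(), pool.end(), IsDead());
    return static_cast<int>(newEnd - pool.begin());
}
```

Like `thrust::remove_if`, `std::remove_if` preserves the relative order of the surviving elements, which is why each ray carries its own `index` back to its pixel rather than relying on thread position.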