CIS5650-Fall-2025 · Pabloo0610 · Oct 19, 2025 · Oct 19, 2025 · Oct 20, 2025 · Oct 20, 2025
diff --git a/.github/workflows/npm-grunt.yml b/.github/workflows/npm-grunt.yml
@@ -0,0 +1,28 @@
+name: NodeJS with Grunt
+
+on:
+  push:
+    branches: [ "main" ]
+  pull_request:
+    branches: [ "main" ]
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+
+    strategy:
+      matrix:
+        node-version: [18.x, 20.x, 22.x]
+
+    steps:
+    - uses: actions/checkout@v4
+
+    - name: Use Node.js ${{ matrix.node-version }}
+      uses: actions/setup-node@v4
+      with:
+        node-version: ${{ matrix.node-version }}
+
+    - name: Build
+      run: |
+        npm install
+        npx grunt
diff --git a/README.md b/README.md
@@ -3,25 +3,95 @@ WebGL Forward+ and Clustered Deferred Shading
 
 **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4**
 
-* (TODO) YOUR NAME HERE
-* Tested on: (TODO) **Google Chrome 222.2** on
-  Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
+* Ruichi Zhang
+* Tested on: **Google Chrome 141.0.7390.123** on
+  Windows 10, AMD Ryzen 9 7950X3D @ 4201 MHz, 16 Core(s), NVIDIA GeForce RTX 4080 SUPER
 
 ### Live Demo
 
-[![](img/thumb.png)](http://TODO.github.io/Project4-WebGPU-Forward-Plus-and-Clustered-Deferred)
+[![](img/proj4teaser.png)](https://pabloo0610.github.io/Project4-WebGPU-Forward-Plus-and-Clustered-Deferred/)
 
 ### Demo Video/GIF
 
-[![](img/video.mp4)](TODO)
+![](img/sponzawebgpu.gif)
 
-### (TODO: Your README)
+## Overview
 
-*DO NOT* leave the README to the last minute! It is a crucial part of the
-project, and we will not be able to grade you without a good README.
+We aim to analyze how different rendering pipelines perform under increasing light counts and varying cluster capacities.  
+This helps identify the practical tradeoffs between **per-fragment lighting**, **Forward+ clustering**, and **deferred lighting**.
 
-This assignment has a considerable amount of performance analysis compared
-to implementation work. Complete the implementation early to leave time!
+---
+
+## Pipeline Overview
+
+### Forward+
+- Light clustering is performed once per frame in a compute pass.
+- Each fragment shades only with the lights affecting its cluster.
+
+### Clustered Deferred
+- G-buffer pass stores position, normal, and albedo.
+- Lighting pass accumulates contributions from clustered lights.
+
+---
+
+## Implementation
+
+- Light clustering in view space with a uniform `(X, Y, Z)` grid.
+- Linear Z-slicing for cluster depth partition.
+- Each cluster stores the number of lights and their indices.
+- Sphere-cluster intersection determines light assignment.
+- Cluster indices are computed in the fragment stage from screen position and view-space depth.
+- Each fragment loops only over the lights in its cluster's list rather than over every light in the scene, which keeps per-fragment cost manageable with thousands of lights.
+
+---
+
+## Performance Analysis
+
+### Effect of Light Count
+
+![](img/performance_vs_light_count.png)
+
+*Figure 1: Frame time (ms) vs number of lights for Naive Forward, Forward+, and Clustered Deferred rendering.*
+
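+- Naive Forward frame time grows roughly linearly with light count: 27.7 ms at 250 lights to 500 ms at 5000.
+- Forward+ and Clustered Deferred scale sublinearly due to clustering: a 20x increase in lights (250 to 5000) raises frame time about 3.5x for Forward+ and about 1.6x for Clustered Deferred.
+- Clustered Deferred matches Forward+ up to 500 lights and pulls ahead beyond that, reaching 9.52 ms vs 20.83 ms at 5000 lights.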
+
+---
+
+### Effect of Lights per Cluster
+
+![](img/performance_vs_lights_per_cluster.png)
+
+*Figure 2: Frame time (ms) vs lights per cluster for Forward+ and Clustered Deferred rendering.*
+
+Increasing the maximum number of lights per cluster affects both performance and rendering quality.
+At lower capacities (e.g., 64 or 128), rendering is fast because fewer light contributions are accumulated per fragment. However, any lights beyond a cluster's capacity are dropped, which leads to incomplete lighting and visual artifacts (e.g., dark areas where lights should contribute).
+At higher capacities (e.g., 512), more lights are processed per cluster, resulting in visually correct lighting but generally higher shading cost. The trend is not strictly monotonic in our measurements: Forward+ took 20 ms at a capacity of 256 but 14.28 ms at 512. In our experiments, 512 provides a good balance between performance and image quality.
+
+---
+
+## Feature Analysis
+
+### Light Clustering
+- Implemented in a compute shader.
+- Complexity grows with `#lights × #clusters`.
+- Linear slicing keeps indexing simple.
+
+### Deferred G-buffer
+- Increases bandwidth usage but reduces fragment shading cost.
+- Frame time stays nearly flat at high light counts (8.3 ms at 1000 lights, 9.52 ms at 5000).
+
+---
+
+## Conclusion
+
+- Naive Forward is simple but scales poorly with many lights.  
+- Forward+ reduces per-fragment cost through clustering.  
+- Clustered Deferred provides better scalability at high light counts.  
+- The number of lights per cluster significantly affects performance and should be tuned per scene.
+
+---
 
 ### Credits
 

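The clustering steps listed under Implementation in the README above (uniform grid, linear Z-slicing, sphere-cluster intersection, per-cluster light lists with a fixed capacity) can be sketched on the CPU in Python. This is only an illustration under stated assumptions, not the project's shader code: the grid size, depth range, capacity, the use of axis-aligned boxes as cluster bounds, and every function name are hypothetical.

```python
# Hypothetical constants; the README does not state the values the project uses.
GRID = (16, 9, 24)          # assumed cluster counts along X, Y, Z
Z_NEAR, Z_FAR = 0.1, 100.0  # assumed view-space depth range (positive distances)
MAX_LIGHTS_PER_CLUSTER = 4  # tiny cap so that overflow is easy to observe


def z_slice(view_depth):
    """Linear Z-slicing: map a view-space depth to a slice index."""
    t = (view_depth - Z_NEAR) / (Z_FAR - Z_NEAR)
    return min(GRID[2] - 1, max(0, int(t * GRID[2])))


def cluster_index(px, py, view_depth, width, height):
    """Flattened cluster index from pixel position and view-space depth."""
    cx = min(GRID[0] - 1, max(0, int(px / width * GRID[0])))
    cy = min(GRID[1] - 1, max(0, int(py / height * GRID[1])))
    cz = z_slice(view_depth)
    return cx + cy * GRID[0] + cz * GRID[0] * GRID[1]


def sphere_intersects_aabb(center, radius, box_min, box_max):
    """Compare the squared distance from the sphere center to the nearest
    point of the box against the squared radius."""
    d2 = 0.0
    for c, lo, hi in zip(center, box_min, box_max):
        nearest = min(max(c, lo), hi)
        d2 += (c - nearest) ** 2
    return d2 <= radius * radius


def assign_lights(cluster_boxes, lights):
    """For each cluster box, collect indices of intersecting lights.

    Cost is proportional to len(cluster_boxes) * len(lights). Lights found
    after the cap is reached are dropped for that cluster.
    """
    result = []
    for box_min, box_max in cluster_boxes:
        indices = []
        for i, (center, radius) in enumerate(lights):
            if len(indices) == MAX_LIGHTS_PER_CLUSTER:
                break  # overflow: remaining lights are ignored
            if sphere_intersects_aabb(center, radius, box_min, box_max):
                indices.append(i)
        result.append(indices)  # len(indices) is the stored light count
    return result
```

With the cap set to 4, a cluster touched by six lights keeps only the first four indices. That is the overflow behavior the README associates with dark areas at low per-cluster capacities.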
diff --git a/figure.py b/figure.py
@@ -0,0 +1,38 @@
+import matplotlib.pyplot as plt
+import numpy as np
+
+light_counts = np.array([250, 500, 1000, 2500, 5000])
+# Frame time (ms) at each entry of light_counts (Figure 1)
+naive_times = np.array([27.7, 52.63, 111, 250, 500])
+forward_times = np.array([5.9, 6.9, 8.69, 12.5, 20.83])
+deferred_times = np.array([5.9, 6.9, 8.3, 8.47, 9.52])
+# Frame time (ms) at each per-cluster light capacity (Figure 2)
+lights_per_cluster = np.array([64, 128, 256, 512])
+forward_cluster_times = np.array([8.84, 12.98, 20, 14.28])
+deferred_cluster_times = np.array([3.8, 6.9, 9.25, 9.61])
+
+plt.figure(figsize=(6,4))
+plt.plot(light_counts, naive_times, marker='o', label='Naive Forward')
+plt.plot(light_counts, forward_times, marker='o', label='Forward+')
+plt.plot(light_counts, deferred_times, marker='o', label='Clustered Deferred')
+
+plt.xlabel('Number of Lights')
+plt.ylabel('Frame Time (ms)')
+plt.title('Frame Time vs Light Count')
+plt.legend()
+plt.grid(True, linestyle='--', alpha=0.5)
+plt.tight_layout()
+plt.savefig('img/performance_vs_light_count.png', dpi=200)  # path the README embeds; run from the repo root
+
+plt.figure(figsize=(6,4))
+plt.plot(lights_per_cluster, forward_cluster_times, marker='o', label='Forward+')
+plt.plot(lights_per_cluster, deferred_cluster_times, marker='o', label='Clustered Deferred')
+
+plt.xlabel('Lights per Cluster')
+plt.ylabel('Frame Time (ms)')
+plt.xticks(lights_per_cluster)
+plt.title('Frame Time vs Lights per Cluster')
+plt.legend()
+plt.grid(True, linestyle='--', alpha=0.5)
+plt.tight_layout()
+plt.savefig('img/performance_vs_lights_per_cluster.png', dpi=200)  # path the README embeds; run from the repo root
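The arrays in `figure.py` are enough to check the README's scaling claims numerically. The sketch below reuses the measured values verbatim; the helper `growth_factor` and the printed format are illustrative additions, not part of the script.

```python
# Measured frame times (ms), copied verbatim from figure.py.
light_counts = [250, 500, 1000, 2500, 5000]
timings = {
    "Naive Forward": [27.7, 52.63, 111, 250, 500],
    "Forward+": [5.9, 6.9, 8.69, 12.5, 20.83],
    "Clustered Deferred": [5.9, 6.9, 8.3, 8.47, 9.52],
}


def growth_factor(values):
    """Ratio of the last value to the first (hypothetical helper)."""
    return values[-1] / values[0]


light_growth = growth_factor(light_counts)  # 5000 / 250
naive_at_max = timings["Naive Forward"][-1]

for name, times in timings.items():
    print(
        f"{name}: frame time grows {growth_factor(times):.2f}x "
        f"for {light_growth:.0f}x lights; "
        f"{naive_at_max / times[-1]:.1f}x the speed of naive "
        f"at {light_counts[-1]} lights"
    )
```

A growth factor close to the growth in light count means frame time tracks the number of lights (roughly linear). A factor well below it indicates sublinear scaling over the measured range.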
diff --git a/img/performance_vs_light_count.png b/img/performance_vs_light_count.png
diff --git a/img/performance_vs_lights_per_cluster.png b/img/performance_vs_lights_per_cluster.png
diff --git a/img/proj4teaser.png b/img/proj4teaser.png
diff --git a/img/sponzawebgpu.gif b/img/sponzawebgpu.gif