diff --git a/README.md b/README.md index edffdaf..f22c518 100644 --- a/README.md +++ b/README.md @@ -1,26 +1,62 @@ # Project5-WebGPU-Gaussian-Splat-Viewer -**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5** - -* (TODO) YOUR NAME HERE -* Tested on: (TODO) **Google Chrome 222.2** on - Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Hongyi Ding + * [LinkedIn](https://www.linkedin.com/in/hongyi-ding/), [personal website](https://johnnyding.com/) +* Tested on: Windows 11, Intel(R) Core(TM) Ultra 7 155H @ 3800MHz 16GB, (SEAS Virtual Lab), Intel(R) Arc(TM) Graphics ### Live Demo -[![](img/thumb.png)](http://TODO.github.io/Project4-WebGPU-Forward-Plus-and-Clustered-Deferred) +[WebGPU 3D Gaussian Splat Viewer](https://johnnysist.github.io/Project5-WebGPU-Gaussian-Splat-Viewer/) ### Demo Video/GIF -[![](img/video.mp4)](TODO) +[Demo for WebGPU Gaussian Splat Viewer](https://www.youtube.com/watch?v=dZ3qP298KiE) + +### Overview + +This project implements a WebGPU Gaussian Splat Viewer that can process and render user-uploaded 3D point cloud data files in PLY format. The user can also interactively adjust the camera parameters and the splatting multiplier to see different rendering effects. The original paper is [3D Gaussian Splatting for Real-Time Radiance Field Rendering](https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/), and the following notes were also helpful during implementation: [kwea123/gaussian_splatting_notes](https://github.com/kwea123/gaussian_splatting_notes). + +### Preprocessing & Sorting + +The preprocessing shader `preprocess.wgsl` computes the splat parameters from the input gaussian data, including position, color, conic, radius coefficients, and depth for sorting. It computes the positions in world space, view space and NDC space to calculate the attributes and pass them to the vertex shader.
It uses the rotation and scaling information to compute the covariance matrices in 3D and 2D space, as well as the conic coefficients, which are used in the fragment shader. It also computes the color from the spherical harmonics data and the opacity. Finally, it stores each depth value as an unsigned integer for sorting. + +### Rendering + +The vertex shader renders a quad for each splat. The size of the quad is determined by the radius attribute computed in the preprocessing step. It also passes the necessary attributes to the fragment shader, including position, color, and conic coefficients. + +The fragment shader computes the Gaussian function based on the conic coefficients and the distance from the splat center. Since the color is already computed in the preprocessing step, it only needs to compute the alpha value, which decreases as the distance from the center increases. Alpha blending is enabled. + +### Performance Analysis + +- Compare your results from point-cloud and gaussian renderer, what are the differences? + + - As we can see from the screenshots below, the gaussian renderer produces much more realistic results than the point-cloud renderer. The point-cloud renderer renders each gaussian as a single point, all with the same color, while the gaussian renderer renders each gaussian as a quad with color and opacity, bringing much more detail to the scene. However, because the gaussian renderer involves more preprocessing and computation, it is also much slower than the point-cloud renderer. + + ![bicycle_points](images/bicycle_points.png) + + ![bicycle_gaussian](images/bicycle_gaussian.png) + +- For gaussian renderer, how does changing the workgroup-size affect performance? Why do you think this is? + + - An appropriate workgroup size is approximately in the range of 64 to 512. The performance depends on the GPU device and the shaders.
A larger workgroup size can increase parallelism and reduce the overhead caused by scheduling; however, it also requires more resources per workgroup, which may become a limiting factor and cause a slowdown if the memory it uses falls back to shared physical memory, which is much slower than the GPU's local memory. + +- Does view-frustum culling give performance improvement? Why do you think this is? + + - Yes, view-frustum culling improves performance. By culling the gaussians that lie outside the camera's view frustum, we can save a lot of computation and rendering time. If we process all the gaussians, including those that are not visible, it still takes considerable time to calculate their positions, colors and radii, to sort them, and to render them as quads and fragments. + +- Does number of gaussians affect performance? Why do you think this is? + + - Yes, the number of gaussians affects performance. The more gaussians there are, the more computation and rendering time is needed. Comparing bonsai.ply (300k gaussians) with bicycle.ply (1M gaussians), the rendering time for bicycle.ply is much longer than for bonsai.ply. It takes more time to initialize the buffers, to preprocess and sort the gaussians, and to render them. + +### Screenshots + +![bonsai_points](images/bonsai_points.png) -### (TODO: Your README) +![bonsai_quads](images/bonsai_quads.png) -*DO NOT* leave the README to the last minute! It is a crucial part of the -project, and we will not be able to grade you without a good README. +![bonsai_3](images/bonsai_3.png) -This assignment has a considerable amount of performance analysis compared -to implementation work. Complete the implementation early to leave time!
+![bonsai_4](images/bonsai_4.png) ### Credits diff --git a/images/bicycle_gaussian.png b/images/bicycle_gaussian.png new file mode 100644 index 0000000..0c771af Binary files /dev/null and b/images/bicycle_gaussian.png differ diff --git a/images/bicycle_points.png b/images/bicycle_points.png new file mode 100644 index 0000000..fdb6c1c Binary files /dev/null and b/images/bicycle_points.png differ diff --git a/images/bicycle_white.png b/images/bicycle_white.png new file mode 100644 index 0000000..aab1f11 Binary files /dev/null and b/images/bicycle_white.png differ diff --git a/images/bonsai_1.png b/images/bonsai_1.png new file mode 100644 index 0000000..b2da89c Binary files /dev/null and b/images/bonsai_1.png differ diff --git a/images/bonsai_2.png b/images/bonsai_2.png new file mode 100644 index 0000000..8f92873 Binary files /dev/null and b/images/bonsai_2.png differ diff --git a/images/bonsai_3.png b/images/bonsai_3.png new file mode 100644 index 0000000..0d6da69 Binary files /dev/null and b/images/bonsai_3.png differ diff --git a/images/bonsai_4.png b/images/bonsai_4.png new file mode 100644 index 0000000..2608f4c Binary files /dev/null and b/images/bonsai_4.png differ diff --git a/images/bonsai_points.png b/images/bonsai_points.png new file mode 100644 index 0000000..fde79a4 Binary files /dev/null and b/images/bonsai_points.png differ diff --git a/images/bonsai_quads.png b/images/bonsai_quads.png new file mode 100644 index 0000000..138e68c Binary files /dev/null and b/images/bonsai_quads.png differ diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml new file mode 100644 index 0000000..c730d65 --- /dev/null +++ b/pnpm-lock.yaml @@ -0,0 +1,478 @@ +lockfileVersion: '9.0' + +settings: + autoInstallPeers: true + excludeLinksFromLockfile: false + +importers: + + .: + dependencies: + '@loaders.gl/core': + specifier: ^4.2.2 + version: 4.3.4 + '@loaders.gl/ply': + specifier: ^4.2.2 + version: 4.3.4(@loaders.gl/core@4.3.4) + '@petamoriken/float16': + specifier: ^3.8.7 + version: 
3.9.3 + tweakpane: + specifier: ^3.1.8 + version: 3.1.10 + tweakpane-plugin-file-import: + specifier: ^0.2.0 + version: 0.2.1(tweakpane@3.1.10) + wgpu-matrix: + specifier: ^3.2.0 + version: 3.4.0 + devDependencies: + '@tweakpane/core': + specifier: ^1.1.7 + version: 1.1.9 + '@webgpu/types': + specifier: ^0.1.31 + version: 0.1.66 + typescript: + specifier: ^5.0.4 + version: 5.9.3 + vite: + specifier: ^4.3.1 + version: 4.5.14 + vite-raw-plugin: + specifier: ^1.0.1 + version: 1.0.2 + +packages: + + '@esbuild/android-arm64@0.18.20': + resolution: {integrity: sha512-Nz4rJcchGDtENV0eMKUNa6L12zz2zBDXuhj/Vjh18zGqB44Bi7MBMSXjgunJgjRhCmKOjnPuZp4Mb6OKqtMHLQ==} + engines: {node: '>=12'} + cpu: [arm64] + os: [android] + + '@esbuild/android-arm@0.18.20': + resolution: {integrity: sha512-fyi7TDI/ijKKNZTUJAQqiG5T7YjJXgnzkURqmGj13C6dCqckZBLdl4h7bkhHt/t0WP+zO9/zwroDvANaOqO5Sw==} + engines: {node: '>=12'} + cpu: [arm] + os: [android] + + '@esbuild/android-x64@0.18.20': + resolution: {integrity: sha512-8GDdlePJA8D6zlZYJV/jnrRAi6rOiNaCC/JclcXpB+KIuvfBN4owLtgzY2bsxnx666XjJx2kDPUmnTtR8qKQUg==} + engines: {node: '>=12'} + cpu: [x64] + os: [android] + + '@esbuild/darwin-arm64@0.18.20': + resolution: {integrity: sha512-bxRHW5kHU38zS2lPTPOyuyTm+S+eobPUnTNkdJEfAddYgEcll4xkT8DB9d2008DtTbl7uJag2HuE5NZAZgnNEA==} + engines: {node: '>=12'} + cpu: [arm64] + os: [darwin] + + '@esbuild/darwin-x64@0.18.20': + resolution: {integrity: sha512-pc5gxlMDxzm513qPGbCbDukOdsGtKhfxD1zJKXjCCcU7ju50O7MeAZ8c4krSJcOIJGFR+qx21yMMVYwiQvyTyQ==} + engines: {node: '>=12'} + cpu: [x64] + os: [darwin] + + '@esbuild/freebsd-arm64@0.18.20': + resolution: {integrity: sha512-yqDQHy4QHevpMAaxhhIwYPMv1NECwOvIpGCZkECn8w2WFHXjEwrBn3CeNIYsibZ/iZEUemj++M26W3cNR5h+Tw==} + engines: {node: '>=12'} + cpu: [arm64] + os: [freebsd] + + '@esbuild/freebsd-x64@0.18.20': + resolution: {integrity: sha512-tgWRPPuQsd3RmBZwarGVHZQvtzfEBOreNuxEMKFcd5DaDn2PbBxfwLcj4+aenoh7ctXcbXmOQIn8HI6mCSw5MQ==} + engines: {node: '>=12'} + cpu: [x64] + os: 
[freebsd] + + '@esbuild/linux-arm64@0.18.20': + resolution: {integrity: sha512-2YbscF+UL7SQAVIpnWvYwM+3LskyDmPhe31pE7/aoTMFKKzIc9lLbyGUpmmb8a8AixOL61sQ/mFh3jEjHYFvdA==} + engines: {node: '>=12'} + cpu: [arm64] + os: [linux] + + '@esbuild/linux-arm@0.18.20': + resolution: {integrity: sha512-/5bHkMWnq1EgKr1V+Ybz3s1hWXok7mDFUMQ4cG10AfW3wL02PSZi5kFpYKrptDsgb2WAJIvRcDm+qIvXf/apvg==} + engines: {node: '>=12'} + cpu: [arm] + os: [linux] + + '@esbuild/linux-ia32@0.18.20': + resolution: {integrity: sha512-P4etWwq6IsReT0E1KHU40bOnzMHoH73aXp96Fs8TIT6z9Hu8G6+0SHSw9i2isWrD2nbx2qo5yUqACgdfVGx7TA==} + engines: {node: '>=12'} + cpu: [ia32] + os: [linux] + + '@esbuild/linux-loong64@0.18.20': + resolution: {integrity: sha512-nXW8nqBTrOpDLPgPY9uV+/1DjxoQ7DoB2N8eocyq8I9XuqJ7BiAMDMf9n1xZM9TgW0J8zrquIb/A7s3BJv7rjg==} + engines: {node: '>=12'} + cpu: [loong64] + os: [linux] + + '@esbuild/linux-mips64el@0.18.20': + resolution: {integrity: sha512-d5NeaXZcHp8PzYy5VnXV3VSd2D328Zb+9dEq5HE6bw6+N86JVPExrA6O68OPwobntbNJ0pzCpUFZTo3w0GyetQ==} + engines: {node: '>=12'} + cpu: [mips64el] + os: [linux] + + '@esbuild/linux-ppc64@0.18.20': + resolution: {integrity: sha512-WHPyeScRNcmANnLQkq6AfyXRFr5D6N2sKgkFo2FqguP44Nw2eyDlbTdZwd9GYk98DZG9QItIiTlFLHJHjxP3FA==} + engines: {node: '>=12'} + cpu: [ppc64] + os: [linux] + + '@esbuild/linux-riscv64@0.18.20': + resolution: {integrity: sha512-WSxo6h5ecI5XH34KC7w5veNnKkju3zBRLEQNY7mv5mtBmrP/MjNBCAlsM2u5hDBlS3NGcTQpoBvRzqBcRtpq1A==} + engines: {node: '>=12'} + cpu: [riscv64] + os: [linux] + + '@esbuild/linux-s390x@0.18.20': + resolution: {integrity: sha512-+8231GMs3mAEth6Ja1iK0a1sQ3ohfcpzpRLH8uuc5/KVDFneH6jtAJLFGafpzpMRO6DzJ6AvXKze9LfFMrIHVQ==} + engines: {node: '>=12'} + cpu: [s390x] + os: [linux] + + '@esbuild/linux-x64@0.18.20': + resolution: {integrity: sha512-UYqiqemphJcNsFEskc73jQ7B9jgwjWrSayxawS6UVFZGWrAAtkzjxSqnoclCXxWtfwLdzU+vTpcNYhpn43uP1w==} + engines: {node: '>=12'} + cpu: [x64] + os: [linux] + + '@esbuild/netbsd-x64@0.18.20': + resolution: 
{integrity: sha512-iO1c++VP6xUBUmltHZoMtCUdPlnPGdBom6IrO4gyKPFFVBKioIImVooR5I83nTew5UOYrk3gIJhbZh8X44y06A==} + engines: {node: '>=12'} + cpu: [x64] + os: [netbsd] + + '@esbuild/openbsd-x64@0.18.20': + resolution: {integrity: sha512-e5e4YSsuQfX4cxcygw/UCPIEP6wbIL+se3sxPdCiMbFLBWu0eiZOJ7WoD+ptCLrmjZBK1Wk7I6D/I3NglUGOxg==} + engines: {node: '>=12'} + cpu: [x64] + os: [openbsd] + + '@esbuild/sunos-x64@0.18.20': + resolution: {integrity: sha512-kDbFRFp0YpTQVVrqUd5FTYmWo45zGaXe0X8E1G/LKFC0v8x0vWrhOWSLITcCn63lmZIxfOMXtCfti/RxN/0wnQ==} + engines: {node: '>=12'} + cpu: [x64] + os: [sunos] + + '@esbuild/win32-arm64@0.18.20': + resolution: {integrity: sha512-ddYFR6ItYgoaq4v4JmQQaAI5s7npztfV4Ag6NrhiaW0RrnOXqBkgwZLofVTlq1daVTQNhtI5oieTvkRPfZrePg==} + engines: {node: '>=12'} + cpu: [arm64] + os: [win32] + + '@esbuild/win32-ia32@0.18.20': + resolution: {integrity: sha512-Wv7QBi3ID/rROT08SABTS7eV4hX26sVduqDOTe1MvGMjNd3EjOz4b7zeexIR62GTIEKrfJXKL9LFxTYgkyeu7g==} + engines: {node: '>=12'} + cpu: [ia32] + os: [win32] + + '@esbuild/win32-x64@0.18.20': + resolution: {integrity: sha512-kTdfRcSiDfQca/y9QIkng02avJ+NCaQvrMejlsB3RRv5sE9rRoeBPISaZpKxHELzRxZyLvNts1P27W3wV+8geQ==} + engines: {node: '>=12'} + cpu: [x64] + os: [win32] + + '@loaders.gl/core@4.3.4': + resolution: {integrity: sha512-cG0C5fMZ1jyW6WCsf4LoHGvaIAJCEVA/ioqKoYRwoSfXkOf+17KupK1OUQyUCw5XoRn+oWA1FulJQOYlXnb9Gw==} + + '@loaders.gl/loader-utils@4.3.4': + resolution: {integrity: sha512-tjMZvlKQSaMl2qmYTAxg+ySR6zd6hQn5n3XaU8+Ehp90TD3WzxvDKOMNDqOa72fFmIV+KgPhcmIJTpq4lAdC4Q==} + peerDependencies: + '@loaders.gl/core': ^4.3.0 + + '@loaders.gl/ply@4.3.4': + resolution: {integrity: sha512-9/ijcIK2xP0cgSM3BmoS5JXfRXe6PKuIGHNteqJHhrQ5nwx1UrupmsYPXj58FykYVZx6PTBshMs9OmIXvPHduw==} + peerDependencies: + '@loaders.gl/core': ^4.3.0 + + '@loaders.gl/schema@4.3.4': + resolution: {integrity: sha512-1YTYoatgzr/6JTxqBLwDiD3AVGwQZheYiQwAimWdRBVB0JAzych7s1yBuE0CVEzj4JDPKOzVAz8KnU1TiBvJGw==} + peerDependencies: + '@loaders.gl/core': ^4.3.0 + + 
'@loaders.gl/worker-utils@4.3.4': + resolution: {integrity: sha512-EbsszrASgT85GH3B7jkx7YXfQyIYo/rlobwMx6V3ewETapPUwdSAInv+89flnk5n2eu2Lpdeh+2zS6PvqbL2RA==} + peerDependencies: + '@loaders.gl/core': ^4.3.0 + + '@petamoriken/float16@3.9.3': + resolution: {integrity: sha512-8awtpHXCx/bNpFt4mt2xdkgtgVvKqty8VbjHI/WWWQuEw+KLzFot3f4+LkQY9YmOtq7A5GdOnqoIC8Pdygjk2g==} + + '@probe.gl/env@4.1.0': + resolution: {integrity: sha512-5ac2Jm2K72VCs4eSMsM7ykVRrV47w32xOGMvcgqn8vQdEMF9PRXyBGYEV9YbqRKWNKpNKmQJVi4AHM/fkCxs9w==} + + '@probe.gl/log@4.1.0': + resolution: {integrity: sha512-r4gRReNY6f+OZEMgfWEXrAE2qJEt8rX0HsDJQXUBMoc+5H47bdB7f/5HBHAmapK8UydwPKL9wCDoS22rJ0yq7Q==} + + '@probe.gl/stats@4.1.0': + resolution: {integrity: sha512-EI413MkWKBDVNIfLdqbeNSJTs7ToBz/KVGkwi3D+dQrSIkRI2IYbWGAU3xX+D6+CI4ls8ehxMhNpUVMaZggDvQ==} + + '@tweakpane/core@1.1.9': + resolution: {integrity: sha512-9tq+KAhaqPiOgsFyLPAz1IMXkVfhRqxGzAgy1ps3As6o3W7XjnU7sev6OlD/Z+Pzw8uZVMukkSHf2e0uCU6u0A==} + + '@types/geojson@7946.0.16': + resolution: {integrity: sha512-6C8nqWur3j98U6+lXDfTUWIfgvZU+EumvpHKcYjujKH7woYyLj2sUmff0tRhrqM7BohUw7Pz3ZB1jj2gW9Fvmg==} + + '@webgpu/types@0.1.66': + resolution: {integrity: sha512-YA2hLrwLpDsRueNDXIMqN9NTzD6bCDkuXbOSe0heS+f8YE8usA6Gbv1prj81pzVHrbaAma7zObnIC+I6/sXJgA==} + + esbuild@0.18.20: + resolution: {integrity: sha512-ceqxoedUrcayh7Y7ZX6NdbbDzGROiyVBgC4PriJThBKSVPWnnFHZAkfI1lJT8QFkOwH4qOS2SJkS4wvpGl8BpA==} + engines: {node: '>=12'} + hasBin: true + + fsevents@2.3.3: + resolution: {integrity: sha512-5xoDfX+fL7faATnagmWPpbFtwh/R77WmMMqqHGS65C3vvB0YHrgF+B1YmZ3441tMj5n63k0212XNoJwzlhffQw==} + engines: {node: ^8.16.0 || ^10.6.0 || >=11.0.0} + os: [darwin] + + nanoid@3.3.11: + resolution: {integrity: sha512-N8SpfPUnUp1bK+PMYW8qSWdl9U+wwNWI4QKxOYDy9JAro3WMX7p2OeVRF9v+347pnakNevPmiHhNmZ2HbFA76w==} + engines: {node: ^10 || ^12 || ^13.7 || ^14 || >=15.0.1} + hasBin: true + + picocolors@1.1.1: + resolution: {integrity: 
sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA==} + + postcss@8.5.6: + resolution: {integrity: sha512-3Ybi1tAuwAP9s0r1UQ2J4n5Y0G05bJkpUIO0/bI9MhwmD70S5aTWbXGBwxHrelT+XM1k6dM0pk+SwNkpTRN7Pg==} + engines: {node: ^10 || ^12 || >=14} + + rollup@3.29.5: + resolution: {integrity: sha512-GVsDdsbJzzy4S/v3dqWPJ7EfvZJfCHiDqe80IyrF59LYuP+e6U1LJoUqeuqRbwAWoMNoXivMNeNAOf5E22VA1w==} + engines: {node: '>=14.18.0', npm: '>=8.0.0'} + hasBin: true + + source-map-js@1.2.1: + resolution: {integrity: sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA==} + engines: {node: '>=0.10.0'} + + tweakpane-plugin-file-import@0.2.1: + resolution: {integrity: sha512-8v9EFMKkyVXf5pu5xMdR8TAEBZ4qizMyd+1NN5fbdPN51KD28PFhDCbZcIvlDPPz05Y75MVhv0nJ7Cbk4Cbb5Q==} + peerDependencies: + tweakpane: ^3.1.4 + + tweakpane@3.1.10: + resolution: {integrity: sha512-rqwnl/pUa7+inhI2E9ayGTqqP0EPOOn/wVvSWjZsRbZUItzNShny7pzwL3hVlaN4m9t/aZhsP0aFQ9U5VVR2VQ==} + + typescript@5.9.3: + resolution: {integrity: sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==} + engines: {node: '>=14.17'} + hasBin: true + + vite-raw-plugin@1.0.2: + resolution: {integrity: sha512-gdp/OFVXBiVq1UwPujVb7+4mmgYHTGrzslMbQvxmgzTN4/HC+3j4GNrumsIKSWfA/y3hktII7XqY38muRaGjhw==} + + vite@4.5.14: + resolution: {integrity: sha512-+v57oAaoYNnO3hIu5Z/tJRZjq5aHM2zDve9YZ8HngVHbhk66RStobhb1sqPMIPEleV6cNKYK4eGrAbE9Ulbl2g==} + engines: {node: ^14.18.0 || >=16.0.0} + hasBin: true + peerDependencies: + '@types/node': '>= 14' + less: '*' + lightningcss: ^1.21.0 + sass: '*' + stylus: '*' + sugarss: '*' + terser: ^5.4.0 + peerDependenciesMeta: + '@types/node': + optional: true + less: + optional: true + lightningcss: + optional: true + sass: + optional: true + stylus: + optional: true + sugarss: + optional: true + terser: + optional: true + + wgpu-matrix@3.4.0: + resolution: {integrity: 
sha512-kXHrbAPKEn9A32Wf4wVldyx9MmnzwhuB5p8GCqoJP3ItU5+iDT4J3aTQwPZWkfb153hwGtqZtUwR2M+ipJKadg==} + +snapshots: + + '@esbuild/android-arm64@0.18.20': + optional: true + + '@esbuild/android-arm@0.18.20': + optional: true + + '@esbuild/android-x64@0.18.20': + optional: true + + '@esbuild/darwin-arm64@0.18.20': + optional: true + + '@esbuild/darwin-x64@0.18.20': + optional: true + + '@esbuild/freebsd-arm64@0.18.20': + optional: true + + '@esbuild/freebsd-x64@0.18.20': + optional: true + + '@esbuild/linux-arm64@0.18.20': + optional: true + + '@esbuild/linux-arm@0.18.20': + optional: true + + '@esbuild/linux-ia32@0.18.20': + optional: true + + '@esbuild/linux-loong64@0.18.20': + optional: true + + '@esbuild/linux-mips64el@0.18.20': + optional: true + + '@esbuild/linux-ppc64@0.18.20': + optional: true + + '@esbuild/linux-riscv64@0.18.20': + optional: true + + '@esbuild/linux-s390x@0.18.20': + optional: true + + '@esbuild/linux-x64@0.18.20': + optional: true + + '@esbuild/netbsd-x64@0.18.20': + optional: true + + '@esbuild/openbsd-x64@0.18.20': + optional: true + + '@esbuild/sunos-x64@0.18.20': + optional: true + + '@esbuild/win32-arm64@0.18.20': + optional: true + + '@esbuild/win32-ia32@0.18.20': + optional: true + + '@esbuild/win32-x64@0.18.20': + optional: true + + '@loaders.gl/core@4.3.4': + dependencies: + '@loaders.gl/loader-utils': 4.3.4(@loaders.gl/core@4.3.4) + '@loaders.gl/schema': 4.3.4(@loaders.gl/core@4.3.4) + '@loaders.gl/worker-utils': 4.3.4(@loaders.gl/core@4.3.4) + '@probe.gl/log': 4.1.0 + + '@loaders.gl/loader-utils@4.3.4(@loaders.gl/core@4.3.4)': + dependencies: + '@loaders.gl/core': 4.3.4 + '@loaders.gl/schema': 4.3.4(@loaders.gl/core@4.3.4) + '@loaders.gl/worker-utils': 4.3.4(@loaders.gl/core@4.3.4) + '@probe.gl/log': 4.1.0 + '@probe.gl/stats': 4.1.0 + + '@loaders.gl/ply@4.3.4(@loaders.gl/core@4.3.4)': + dependencies: + '@loaders.gl/core': 4.3.4 + '@loaders.gl/loader-utils': 4.3.4(@loaders.gl/core@4.3.4) + '@loaders.gl/schema': 
4.3.4(@loaders.gl/core@4.3.4) + + '@loaders.gl/schema@4.3.4(@loaders.gl/core@4.3.4)': + dependencies: + '@loaders.gl/core': 4.3.4 + '@types/geojson': 7946.0.16 + + '@loaders.gl/worker-utils@4.3.4(@loaders.gl/core@4.3.4)': + dependencies: + '@loaders.gl/core': 4.3.4 + + '@petamoriken/float16@3.9.3': {} + + '@probe.gl/env@4.1.0': {} + + '@probe.gl/log@4.1.0': + dependencies: + '@probe.gl/env': 4.1.0 + + '@probe.gl/stats@4.1.0': {} + + '@tweakpane/core@1.1.9': {} + + '@types/geojson@7946.0.16': {} + + '@webgpu/types@0.1.66': {} + + esbuild@0.18.20: + optionalDependencies: + '@esbuild/android-arm': 0.18.20 + '@esbuild/android-arm64': 0.18.20 + '@esbuild/android-x64': 0.18.20 + '@esbuild/darwin-arm64': 0.18.20 + '@esbuild/darwin-x64': 0.18.20 + '@esbuild/freebsd-arm64': 0.18.20 + '@esbuild/freebsd-x64': 0.18.20 + '@esbuild/linux-arm': 0.18.20 + '@esbuild/linux-arm64': 0.18.20 + '@esbuild/linux-ia32': 0.18.20 + '@esbuild/linux-loong64': 0.18.20 + '@esbuild/linux-mips64el': 0.18.20 + '@esbuild/linux-ppc64': 0.18.20 + '@esbuild/linux-riscv64': 0.18.20 + '@esbuild/linux-s390x': 0.18.20 + '@esbuild/linux-x64': 0.18.20 + '@esbuild/netbsd-x64': 0.18.20 + '@esbuild/openbsd-x64': 0.18.20 + '@esbuild/sunos-x64': 0.18.20 + '@esbuild/win32-arm64': 0.18.20 + '@esbuild/win32-ia32': 0.18.20 + '@esbuild/win32-x64': 0.18.20 + + fsevents@2.3.3: + optional: true + + nanoid@3.3.11: {} + + picocolors@1.1.1: {} + + postcss@8.5.6: + dependencies: + nanoid: 3.3.11 + picocolors: 1.1.1 + source-map-js: 1.2.1 + + rollup@3.29.5: + optionalDependencies: + fsevents: 2.3.3 + + source-map-js@1.2.1: {} + + tweakpane-plugin-file-import@0.2.1(tweakpane@3.1.10): + dependencies: + tweakpane: 3.1.10 + + tweakpane@3.1.10: {} + + typescript@5.9.3: {} + + vite-raw-plugin@1.0.2: {} + + vite@4.5.14: + dependencies: + esbuild: 0.18.20 + postcss: 8.5.6 + rollup: 3.29.5 + optionalDependencies: + fsevents: 2.3.3 + + wgpu-matrix@3.4.0: {} diff --git a/pnpm-workspace.yaml b/pnpm-workspace.yaml new file mode 100644 
index 0000000..efc037a --- /dev/null +++ b/pnpm-workspace.yaml @@ -0,0 +1,2 @@ +onlyBuiltDependencies: + - esbuild diff --git a/src/camera/camera.ts b/src/camera/camera.ts index 47ea1dc..93b6d2c 100644 --- a/src/camera/camera.ts +++ b/src/camera/camera.ts @@ -95,7 +95,7 @@ export async function load_camera_presets(file: string): Promise const c_size_vec2 = 4 * 2; const c_size_mat4 = 4 * 16; // byte size of mat4 (i.e. Float32Array(16)) -const c_size_camera_uniform = 4 * c_size_mat4 + 2 * c_size_vec2; +const c_size_camera_uniform = 4 * c_size_mat4 + 4 * c_size_vec2; interface CameraUniform { view_matrix: Mat4, view_inv_matrix: Mat4, @@ -178,6 +178,8 @@ export class Camera { offset += 2; intermediate_float_32_array.set(this.focal, offset); offset += 2; + intermediate_float_32_array.set(vec2.fromValues(this.fovX, this.fovY), offset); + offset += 2; this.device.queue.writeBuffer(this.uniform_buffer, 0, intermediate_float_32_array); } diff --git a/src/main.ts b/src/main.ts index 25efcb5..c7049b1 100644 --- a/src/main.ts +++ b/src/main.ts @@ -21,8 +21,9 @@ import { assert } from './utils/util'; const device = await adapter.requestDevice({ requiredLimits: { maxComputeWorkgroupStorageSize: adapter.limits.maxComputeWorkgroupStorageSize, - maxStorageBufferBindingSize: adapter.limits. 
maxStorageBufferBindingSize - }, + maxStorageBufferBindingSize: adapter.limits.maxStorageBufferBindingSize, + maxBufferSize: 2147483648 + }, }); const canvas = document.querySelector('#webgpu-canvas'); diff --git a/src/renderers/gaussian-renderer.ts b/src/renderers/gaussian-renderer.ts index 1684523..05c35f6 100644 --- a/src/renderers/gaussian-renderer.ts +++ b/src/renderers/gaussian-renderer.ts @@ -5,43 +5,129 @@ import { get_sorter,c_histogram_block_rows,C } from '../sort/sort'; import { Renderer } from './renderer'; export interface GaussianRenderer extends Renderer { - + setGaussianMultiplier: (multiplier: number) => void, } -// Utility to create GPU buffers -const createBuffer = ( - device: GPUDevice, - label: string, - size: number, - usage: GPUBufferUsageFlags, - data?: ArrayBuffer | ArrayBufferView -) => { - const buffer = device.createBuffer({ label, size, usage }); - if (data) device.queue.writeBuffer(buffer, 0, data); - return buffer; -}; - export default function get_renderer( pc: PointCloud, device: GPUDevice, presentation_format: GPUTextureFormat, camera_buffer: GPUBuffer, + sh_buffer: GPUBuffer, ): GaussianRenderer { - const sorter = get_sorter(pc.num_points, device); - + const sorter = get_sorter(pc.num_points, device); + // =============================================== // Initialize GPU Buffers // =============================================== - const nulling_data = new Uint32Array([0]); + const nullData = new Uint32Array([0]); + const nullDataBuffer = device.createBuffer({ + label: 'null data buffer', + size: nullData.byteLength, + usage: GPUBufferUsage.COPY_SRC | GPUBufferUsage.COPY_DST, + mappedAtCreation: true, + }); + new Uint32Array(nullDataBuffer.getMappedRange()).set(nullData); + nullDataBuffer.unmap(); + + const indirectData = new Uint32Array([6, pc.num_points, 0, 0]); + const indirectDrawBuffer = device.createBuffer({ + label: 'indirect draw buffer', + size: indirectData.byteLength, + usage: GPUBufferUsage.INDIRECT | 
GPUBufferUsage.COPY_DST, + mappedAtCreation: true, + }); + new Uint32Array(indirectDrawBuffer.getMappedRange()).set(indirectData); + indirectDrawBuffer.unmap(); + + const renderSettingsBuffer = device.createBuffer({ + label: 'render settings', + size: 8, + usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST, + }); + + const splatBuffer = device.createBuffer({ + label: 'splat buffer', + size: pc.num_points * 32 * 10, + usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST, + }); + + // =============================================== + // Create Bind Group Layouts + // =============================================== + + const camera_bind_group_layout = device.createBindGroupLayout({ + entries: [ + { binding: 0, visibility: GPUShaderStage.VERTEX | GPUShaderStage.COMPUTE, buffer: { type: "uniform" } } + ] + }); + + const gaussian_bind_group_layout = device.createBindGroupLayout({ + entries: [ + { binding: 0, visibility: GPUShaderStage.VERTEX | GPUShaderStage.COMPUTE, buffer: { type: "read-only-storage" } } + ] + }); + + const sort_bind_group_layout = device.createBindGroupLayout({ + entries: [ + { binding: 0, visibility: GPUShaderStage.COMPUTE, buffer: { type: "storage" } }, + { binding: 1, visibility: GPUShaderStage.COMPUTE, buffer: { type: "storage" } }, + { binding: 2, visibility: GPUShaderStage.COMPUTE, buffer: { type: "storage" } }, + { binding: 3, visibility: GPUShaderStage.COMPUTE, buffer: { type: "storage" } }, + ] + }); + + const sort_bind_group_layout_vertex = device.createBindGroupLayout({ + entries: [ + { binding: 0, visibility: GPUShaderStage.VERTEX, buffer: { type: "read-only-storage" } }, + { binding: 1, visibility: GPUShaderStage.VERTEX, buffer: { type: "read-only-storage" } }, + { binding: 2, visibility: GPUShaderStage.VERTEX, buffer: { type: "read-only-storage" } }, + { binding: 3, visibility: GPUShaderStage.VERTEX, buffer: { type: "read-only-storage" } }, + ] + }); + + const render_settings_bind_group_layout = device.createBindGroupLayout({ + 
entries: [ + { binding: 0, visibility: GPUShaderStage.VERTEX | GPUShaderStage.FRAGMENT, buffer: { type: "uniform" } } + ] + }); + + const camera_settings_bind_group_layout = device.createBindGroupLayout({ + entries: [ + { binding: 0, visibility: GPUShaderStage.COMPUTE | GPUShaderStage.VERTEX | GPUShaderStage.FRAGMENT, buffer: { type: "uniform" } }, + { binding: 1, visibility: GPUShaderStage.COMPUTE | GPUShaderStage.VERTEX, buffer: { type: "uniform" } } + ] + }); + + const splat_bind_group_layout_compute = device.createBindGroupLayout({ + entries: [ + { binding: 0, visibility: GPUShaderStage.COMPUTE, buffer: { type: "storage" } }, + { binding: 1, visibility: GPUShaderStage.COMPUTE, buffer: { type: "read-only-storage" } } + ] + }); + + const splat_bind_group_layout_vertex = device.createBindGroupLayout({ + entries: [ + { binding: 0, visibility: GPUShaderStage.VERTEX, buffer: { type: "read-only-storage" } } + ] + }); // =============================================== // Create Compute Pipeline and Bind Groups // =============================================== + const preprocess_pipeline = device.createComputePipeline({ label: 'preprocess', - layout: 'auto', + layout: device.createPipelineLayout({ + bindGroupLayouts: [ + camera_settings_bind_group_layout, + gaussian_bind_group_layout, + sort_bind_group_layout, + splat_bind_group_layout_compute, + ] + }), compute: { module: device.createShaderModule({ code: preprocessWGSL }), entryPoint: 'preprocess', @@ -54,7 +140,7 @@ export default function get_renderer( const sort_bind_group = device.createBindGroup({ label: 'sort', - layout: preprocess_pipeline.getBindGroupLayout(2), + layout: sort_bind_group_layout, entries: [ { binding: 0, resource: { buffer: sorter.sort_info_buffer } }, { binding: 1, resource: { buffer: sorter.ping_pong[0].sort_depths_buffer } }, @@ -63,24 +149,160 @@ export default function get_renderer( ], }); + const sort_bind_group_vertex = device.createBindGroup({ + label: 'sort vertex', + layout: 
sort_bind_group_layout_vertex, + entries: [ + { binding: 0, resource: { buffer: sorter.sort_info_buffer } }, + { binding: 1, resource: { buffer: sorter.ping_pong[0].sort_depths_buffer } }, + { binding: 2, resource: { buffer: sorter.ping_pong[0].sort_indices_buffer } }, + { binding: 3, resource: { buffer: sorter.sort_dispatch_indirect_buffer } }, + ], + }); // =============================================== // Create Render Pipeline and Bind Groups // =============================================== + const render_shader = device.createShaderModule({code: renderWGSL}); + const render_pipeline = device.createRenderPipeline({ + label: 'render', + layout: device.createPipelineLayout({ + bindGroupLayouts: [ + camera_settings_bind_group_layout, + gaussian_bind_group_layout, + sort_bind_group_layout_vertex, + splat_bind_group_layout_vertex, + ] + }), + vertex: { + module: render_shader, + entryPoint: 'vs_main', + }, + fragment: { + module: render_shader, + entryPoint: 'fs_main', + targets: [{ + format: presentation_format, + blend: { + color: { + srcFactor: 'src-alpha', + dstFactor: 'one-minus-src-alpha', + operation: 'add', + }, + alpha: { + srcFactor: 'src-alpha', + dstFactor: 'one-minus-src-alpha', + operation: 'add', + }, + }, + }], + }, + primitive: { + topology: 'triangle-list', + }, + }); - // =============================================== - // Command Encoder Functions - // =============================================== - + const camera_bind_group = device.createBindGroup({ + label: 'gaussian camera', + layout: camera_bind_group_layout, + entries: [{ binding: 0, resource: { buffer: camera_buffer } }], + }); + + const gaussian_bind_group = device.createBindGroup({ + label: 'gaussian data', + layout: gaussian_bind_group_layout, + entries: [{ binding: 0, resource: { buffer: pc.gaussian_3d_buffer } }], + }); + + const settings_bind_group = device.createBindGroup({ + label: 'render settings', + layout: render_settings_bind_group_layout, + entries: [{ binding: 0, 
resource: { buffer: renderSettingsBuffer } }], + }); + + const camera_settings_bind_group = device.createBindGroup({ + label: 'camera & settings', + layout: camera_settings_bind_group_layout, + entries: [{ binding: 0, resource: { buffer: camera_buffer } }, + { binding: 1, resource: { buffer: renderSettingsBuffer } }], + }); + + const splat_bind_group_compute = device.createBindGroup({ + label: 'splat data compute', + layout: splat_bind_group_layout_compute, + entries: [{ binding: 0, resource: { buffer: splatBuffer },}, + { binding: 1, resource: { buffer: sh_buffer }}], + }); + + const splat_bind_group_render = device.createBindGroup({ + label: 'splat data vertex', + layout: splat_bind_group_layout_vertex, + entries: [{ binding: 0, resource: { buffer: splatBuffer } }], + }); + + const render = (encoder: GPUCommandEncoder, texture_view: GPUTextureView) => { + encoder.copyBufferToBuffer(nullDataBuffer, 0, sorter.sort_info_buffer, 0, 4); + encoder.copyBufferToBuffer(nullDataBuffer, 0, sorter.sort_dispatch_indirect_buffer, 0, 4); + + const computePass = encoder.beginComputePass( + { label: 'preprocess pass' } + ); + computePass.setPipeline(preprocess_pipeline); + // maximum number of bind groups is 4, so the camera and settings uniforms share one group + computePass.setBindGroup(0, camera_settings_bind_group); + computePass.setBindGroup(1, gaussian_bind_group); + computePass.setBindGroup(2, sort_bind_group); + computePass.setBindGroup(3, splat_bind_group_compute); + computePass.dispatchWorkgroups(Math.ceil(pc.num_points / C.histogram_wg_size)); + computePass.end(); + + // debug: check sort_infos.keys_size + const keysizeCheckBuffer = device.createBuffer({ + size: 4, + usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ, + }); + encoder.copyBufferToBuffer(sorter.sort_info_buffer, 0, keysizeCheckBuffer, 0, 4); + device.queue.onSubmittedWorkDone().then(async () => { + await keysizeCheckBuffer.mapAsync(GPUMapMode.READ); + const array = new Uint32Array(keysizeCheckBuffer.getMappedRange());
console.log('keysize:', array[0]); + keysizeCheckBuffer.unmap(); + }); + + // sorting + sorter.sort(encoder); + + encoder.copyBufferToBuffer(sorter.sort_info_buffer, 0, indirectDrawBuffer, 4, 4); + const pass = encoder.beginRenderPass({ + label: 'gaussian render', + colorAttachments: [ + { + view: texture_view, + loadOp: 'clear', + storeOp: 'store', + }, + ], + }); + pass.setPipeline(render_pipeline); + pass.setBindGroup(0, camera_settings_bind_group); + pass.setBindGroup(1, gaussian_bind_group); + pass.setBindGroup(2, sort_bind_group_vertex); + pass.setBindGroup(3, splat_bind_group_render); + pass.drawIndirect(indirectDrawBuffer, 0); + pass.end(); + }; // =============================================== // Return Render Object // =============================================== return { frame: (encoder: GPUCommandEncoder, texture_view: GPUTextureView) => { - sorter.sort(encoder); + render(encoder, texture_view); }, camera_buffer, + setGaussianMultiplier: (multiplier: number) => { + device.queue.writeBuffer(renderSettingsBuffer, 0, new Float32Array([multiplier, pc.sh_deg]) ); + }, }; } diff --git a/src/renderers/renderer.ts b/src/renderers/renderer.ts index ffdf9ba..92298fc 100644 --- a/src/renderers/renderer.ts +++ b/src/renderers/renderer.ts @@ -85,7 +85,8 @@ export default async function init( if (uploadedFile) { const pc = await load(uploadedFile, device); pointcloud_renderer = get_renderer_pointcloud(pc, device, presentation_format, camera.uniform_buffer); - gaussian_renderer = get_renderer_gaussian(pc, device, presentation_format, camera.uniform_buffer); + gaussian_renderer = get_renderer_gaussian(pc, device, presentation_format, camera.uniform_buffer, pc.sh_buffer); + gaussian_renderer.setGaussianMultiplier(params.gaussian_multiplier); renderers = { pointcloud: pointcloud_renderer, gaussian: gaussian_renderer, @@ -122,6 +123,9 @@ export default async function init( {min: 0, max: 1.5} ).on('change', (e) => { //TODO: Bind constants to the gaussian renderer. 
+      if (gaussian_renderer) {
+        gaussian_renderer.setGaussianMultiplier(e.value);
+      }
     });
   }

@@ -147,6 +151,7 @@ export default async function init(
   function frame() {
     if (ply_file_loaded && cam_file_loaded) {
       params.fps=1.0/timeReturn()*1000.0;
+      console.log(`Time: ${timeReturn().toFixed(2)} ms`);
       time();
       const encoder = device.createCommandEncoder();
       const texture_view = context.getCurrentTexture().createView();
diff --git a/src/shaders/gaussian.wgsl b/src/shaders/gaussian.wgsl
index 759226d..3787cc1 100644
--- a/src/shaders/gaussian.wgsl
+++ b/src/shaders/gaussian.wgsl
@@ -1,22 +1,89 @@
+struct CameraUniforms {
+    view: mat4x4<f32>,
+    view_inv: mat4x4<f32>,
+    proj: mat4x4<f32>,
+    proj_inv: mat4x4<f32>,
+    viewport: vec2<f32>,
+    focal: vec2<f32>,
+    fov: vec2<f32>,
+};
+
+struct Gaussian {
+    pos_opacity: array<u32,2>,
+    rot: array<u32,2>,
+    scale: array<u32,2>
+};
+
+struct RenderSettings {
+    gaussian_multiplier: f32,
+    sh_deg: f32,
+}
+
+struct Splat {
+    //TODO: store information for 2D splat rendering
+    position: vec2<f32>,
+    radius: f32,
+    color: vec4<f32>,
+    conic: vec3<f32>,
+};
+
+@group(0) @binding(0)
+var<uniform> camera: CameraUniforms;
+@group(0) @binding(1)
+var<uniform> render_settings : RenderSettings;
+@group(1) @binding(0)
+var<storage, read> gaussians : array<Gaussian>;
+@group(2) @binding(2)
+var<storage, read> sort_indices : array<u32>;
+@group(3) @binding(0)
+var<storage, read> splats : array<Splat>;
+
 struct VertexOutput {
     @builtin(position) position: vec4<f32>,
     //TODO: information passed from vertex shader to fragment shader
+    @location(0) color: vec4<f32>,
+    @location(1) center: vec2<f32>,
+    @location(2) conic: vec3<f32>,
 };

-struct Splat {
-    //TODO: information defined in preprocess compute shader
-};
+// Vertex stage: expands each depth-sorted splat into a screen-space quad (two triangles).

 @vertex
 fn vs_main(
+    @builtin(vertex_index) vertex_index: u32,
+    @builtin(instance_index) instance_index: u32
 ) -> VertexOutput {
     //TODO: reconstruct 2D quad based on information from splat, pass
+    let splat_idx = sort_indices[instance_index];
+
     var out: VertexOutput;
-    out.position = vec4(1. ,1. , 0., 1.);
+    let quads = array<vec2<f32>,6>(
+        vec2(-1.0, -1.0),
+        vec2( 1.0, -1.0),
+        vec2( 1.0, 1.0),
+        vec2(-1.0, -1.0),
+        vec2( 1.0, 1.0),
+        vec2(-1.0, 1.0)
+    );
+
+    let quad_corner = quads[vertex_index];
+    let splat = splats[splat_idx];
+    let offset = vec2(quad_corner.x / camera.viewport.x, quad_corner.y / camera.viewport.y) * splat.radius;
+    out.position = vec4(splat.position + offset, 0.0, 1.0);
+    out.center = splat.position;
+    out.color = splat.color;
+    out.conic = splat.conic;
     return out;
 }

 @fragment
 fn fs_main(in: VertexOutput) -> @location(0) vec4<f32> {
-    return vec4(1.);
+    let splat_center = (vec2(in.center.x, -in.center.y) + vec2(1.0, 1.0)) / 2.0 * camera.viewport;
+    let dis = in.position.xy - splat_center;
+    let value = in.conic.x * dis.x * dis.x + 2.0 * in.conic.y * dis.x * dis.y + in.conic.z * dis.y * dis.y; // d^T * inverse(cov2D) * d; the off-diagonal term appears twice
+    if (value < 0.0 || value > 10.0) {
+        discard;
+    }
+    let alpha = in.color.a * exp(-0.5 * value);
+    return vec4(in.color.rgb, alpha);
 }
\ No newline at end of file
diff --git a/src/shaders/point_cloud.wgsl b/src/shaders/point_cloud.wgsl
index 01dded1..e6f9a46 100644
--- a/src/shaders/point_cloud.wgsl
+++ b/src/shaders/point_cloud.wgsl
@@ -4,7 +4,8 @@ struct CameraUniforms {
     proj: mat4x4<f32>,
     proj_inv: mat4x4<f32>,
     viewport: vec2<f32>,
-    focal: vec2<f32>
+    focal: vec2<f32>,
+    fov: vec2<f32>,
 };

 struct Gaussian {
@@ -35,7 +36,7 @@ fn vs_main(
     let pos = vec4(a.x, a.y, b.x, 1.);

     // TODO: MVP calculations
-    out.position = pos;
+    out.position = camera.proj * camera.view * pos;

     return out;
 }
diff --git a/src/shaders/preprocess.wgsl b/src/shaders/preprocess.wgsl
index bbc63f5..4dccb3d 100644
--- a/src/shaders/preprocess.wgsl
+++ b/src/shaders/preprocess.wgsl
@@ -41,11 +41,12 @@ struct CameraUniforms {
     proj: mat4x4<f32>,
     proj_inv: mat4x4<f32>,
     viewport: vec2<f32>,
-    focal: vec2<f32>
+    focal: vec2<f32>,
+    fov: vec2<f32>,
 };

 struct RenderSettings {
-    gaussian_scaling: f32,
+    gaussian_multiplier: f32,
     sh_deg: f32,
 }

@@ -57,9 +58,19 @@ struct Gaussian {

 struct Splat {
     //TODO: store information for 2D splat rendering
+    position: vec2<f32>,
+    radius: f32,
+    color: vec4<f32>,
+    conic: vec3<f32>,
 };

 //TODO: bind your data here
+@group(0) @binding(0)
+var<uniform> camera: CameraUniforms;
+@group(0) @binding(1)
+var<uniform> render_settings : RenderSettings;
+@group(1) @binding(0)
+var<storage, read> gaussians : array<Gaussian>;
 @group(2) @binding(0)
 var<storage, read_write> sort_infos: SortInfos;
 @group(2) @binding(1)
@@ -68,11 +79,23 @@ var<storage, read_write> sort_depths : array<u32>;
 var<storage, read_write> sort_indices : array<u32>;
 @group(2) @binding(3)
 var<storage, read_write> sort_dispatch: DispatchIndirect;
+@group(3) @binding(0)
+var<storage, read_write> splats : array<Splat>;
+@group(3) @binding(1)
+var<storage, read> sh_coeffs : array<u32>;

 /// reads the ith sh coef from the storage buffer
 fn sh_coef(splat_idx: u32, c_idx: u32) -> vec3<f32> {
     //TODO: access your binded sh_coeff, see load.ts for how it is stored
-    return vec3(0.0);
+    if (c_idx % 2u == 0u) {
+        let a = unpack2x16float(sh_coeffs[splat_idx * 24u + c_idx / 2u * 3u]);
+        let b = unpack2x16float(sh_coeffs[splat_idx * 24u + c_idx / 2u * 3u + 1u]);
+        return vec3(a.x, a.y, b.x);
+    } else {
+        let a = unpack2x16float(sh_coeffs[splat_idx * 24u + c_idx / 2u * 3u + 1u]);
+        let b = unpack2x16float(sh_coeffs[splat_idx * 24u + c_idx / 2u * 3u + 2u]);
+        return vec3(a.y, b.x, b.y);
+    }
 }

 // spherical harmonics evaluation with Condon–Shortley phase
@@ -108,11 +131,122 @@ fn computeColorFromSH(dir: vec3<f32>, v_idx: u32, sh_deg: u32) -> vec3<f32> {
     return max(vec3(0.), result);
 }

+// from https://github.com/graphdeco-inria/diff-gaussian-rasterization/blob/main/cuda_rasterizer/forward.cu#L216
+fn computeCov2D(mean: vec3<f32>, focal_x: f32, focal_y: f32, tan_fovx: f32, tan_fovy: f32, cov3D: mat3x3<f32>) -> mat2x2<f32> {
+    // The following models the steps outlined by equations 29
+    // and 31 in "EWA Splatting" (Zwicker et al., 2002).
+    // Additionally considers aspect / scaling of viewport.
+    // Transposes used to account for row-/column-major conventions.
+    var t = camera.view * vec4(mean, 1.0);
+
+    let limx = 1.3f * tan_fovx;
+    let limy = 1.3f * tan_fovy;
+    let txtz = t.x / t.z;
+    let tytz = t.y / t.z;
+    t.x = min(limx, max(-limx, txtz)) * t.z;
+    t.y = min(limy, max(-limy, tytz)) * t.z;
+
+    let J = mat3x3(
+        focal_x / t.z, 0.0f, -(focal_x * t.x) / (t.z * t.z),
+        0.0f, focal_y / t.z, -(focal_y * t.y) / (t.z * t.z),
+        0, 0, 0);
+
+    let W = mat3x3(
+        camera.view[0][0], camera.view[0][1], camera.view[0][2],
+        camera.view[1][0], camera.view[1][1], camera.view[1][2],
+        camera.view[2][0], camera.view[2][1], camera.view[2][2]
+    );
+    let T = W * J;
+
+    var cov = transpose(T) * transpose(cov3D) * T;
+
+    // Apply low-pass filter: every Gaussian should be at least
+    // one pixel wide/high. Discard 3rd row and column.
+    cov[0][0] += 0.3f;
+    cov[1][1] += 0.3f;
+    return mat2x2(cov[0][0], cov[0][1], cov[1][0], cov[1][1]);
+}
+
 @compute @workgroup_size(workgroupSize,1,1)
 fn preprocess(@builtin(global_invocation_id) gid: vec3<u32>, @builtin(num_workgroups) wgs: vec3<u32>) {
     let idx = gid.x;
     //TODO: set up pipeline as described in instruction

-    let keys_per_dispatch = workgroupSize * sortKeyPerThread;
-    // increment DispatchIndirect.dispatchx each time you reach limit for one dispatch of keys
+    if (idx >= arrayLength(&gaussians)) {
+        return;
+    }
+
+    let vertex = gaussians[idx];
+    let a = unpack2x16float(vertex.pos_opacity[0]);
+    let b = unpack2x16float(vertex.pos_opacity[1]);
+    let c = unpack2x16float(vertex.rot[0]);
+    let d = unpack2x16float(vertex.rot[1]);
+    let e = unpack2x16float(vertex.scale[0]);
+    let f = unpack2x16float(vertex.scale[1]);
+    let world_pos = vec3(a.x, a.y, b.x);
+    let opacity = 1.0/(1.0+exp(-b.y)); // decode opacity from sigmoid space
+    let rot = vec4(c.x, c.y, d.x, d.y);
+    let scale = exp(vec3(e.x, e.y, f.x)); // scale is stored in log space
+    var ndc_pos = camera.proj * camera.view * vec4(world_pos, 1.0);
+    ndc_pos /= ndc_pos.w;
+
+    if (ndc_pos.x > -1.2 && ndc_pos.x < 1.2 && ndc_pos.y > -1.2 && ndc_pos.y < 1.2 && ndc_pos.z > 0.0 && ndc_pos.z < 1.0) {
+        let qr = rot.x;
+        let qx = rot.y;
+        let qy = rot.z;
+        let qz = rot.w;
+        let R = mat3x3(
+            1.0 - 2.0 * (qy * qy + qz * qz), 2.0 * (qx * qy - qr * qz), 2.0 * (qx * qz + qr * qy),
+            2.0 * (qx * qy + qr * qz), 1.0 - 2.0 * (qx * qx + qz * qz), 2.0 * (qy * qz - qr * qx),
+            2.0 * (qx * qz - qr * qy), 2.0 * (qy * qz + qr * qx), 1.0 - 2.0 * (qx * qx + qy * qy)
+        );
+        let S = mat3x3(
+            vec3(render_settings.gaussian_multiplier * scale.x, 0.0, 0.0),
+            vec3(0.0, render_settings.gaussian_multiplier * scale.y, 0.0),
+            vec3(0.0, 0.0, render_settings.gaussian_multiplier * scale.z)
+        );
+        let cov3D = transpose(S * R) * S * R;
+
+        let cov2D = computeCov2D(world_pos, camera.focal.x, camera.focal.y, tan(camera.fov.x), tan(camera.fov.y), cov3D);
+
+        // from https://github.com/graphdeco-inria/diff-gaussian-rasterization/blob/main/cuda_rasterizer/forward.cu#L216
+        // Invert covariance (EWA algorithm)
+        let det = (cov2D[0][0] * cov2D[1][1] - cov2D[0][1] * cov2D[1][0]);
+        if (det == 0.0f) {
+            return;
+        }
+        let det_inv = 1.f / det;
+        let conic = mat2x2(
+            cov2D[1][1] * det_inv, -cov2D[0][1] * det_inv,
+            -cov2D[1][0] * det_inv, cov2D[0][0] * det_inv
+        );
+
+        // Compute extent in screen space (by finding eigenvalues of
+        // 2D covariance matrix). Use extent to compute a bounding rectangle
+        // of screen-space tiles that this Gaussian overlaps with. Quit if
+        // rectangle covers 0 tiles.
+        let mid = 0.5f * (cov2D[0][0] + cov2D[1][1]);
+        let lambda1 = mid + sqrt(max(0.1f, mid * mid - det));
+        let lambda2 = mid - sqrt(max(0.1f, mid * mid - det));
+        let radius = ceil(3.f * sqrt(max(lambda1, lambda2)));
+
+        // compute color from SH
+        let cam_pos = camera.view_inv[3].xyz; // camera position in world space (translation column of the inverse view matrix)
+        let dir = normalize(world_pos - cam_pos);
+        let color = computeColorFromSH(dir, idx, u32(render_settings.sh_deg));
+
+        let splat_idx = atomicAdd(&sort_infos.keys_size, 1u);
+        splats[splat_idx].position = ndc_pos.xy;
+        splats[splat_idx].radius = radius;
+        splats[splat_idx].color = vec4(color, opacity);
+        splats[splat_idx].conic = vec3(conic[0][0], conic[0][1], conic[1][1]);
+
+        sort_depths[splat_idx] = u32(clamp(1.0-ndc_pos.z, 0.0, 1.0) * f32(0xFFFFFFFFu));
+        sort_indices[splat_idx] = splat_idx;
+        let keys_per_dispatch = workgroupSize * sortKeyPerThread;
+        // increment DispatchIndirect.dispatchx each time you reach limit for one dispatch of keys
+        if (splat_idx % keys_per_dispatch == 0u) {
+            atomicAdd(&sort_dispatch.dispatch_x, 1u);
+        }
+    }
 }
\ No newline at end of file