Reconstructing an entire scene from a set of views is a fascinating idea. In this project we explore two ways of doing so: classical Structure from Motion (SfM) and learning-based Neural Radiance Fields (NeRF).
Structure from Motion reconstructs a scene from two or more different views using epipolar geometry. In 2009, Agarwal et al. published "Building Rome in a Day", in which they reconstructed the city from a large collection of photos from the internet. In this phase, we work step by step from feature matching, through recovering the camera locations, to reconstructing the scene.
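To make the role of epipolar geometry concrete, here is a minimal sketch of the epipolar constraint that two-view SfM relies on: for a correct correspondence, `x2^T E x1 = 0`, where `E = [t]x R`. The camera pose, the 3D point, and all function names below are hypothetical and chosen only for illustration. Identity intrinsics are assumed, and this is not the code used in this project.

```python
import math

def matvec(M, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def matmul(A, B):
    """Multiply two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ v == t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]

def essential_matrix(R, t):
    """E = [t]x R for the convention X2 = R @ X1 + t."""
    return matmul(skew(t), R)

def project(X):
    """Normalized homogeneous image coordinates (identity intrinsics assumed)."""
    return [X[0] / X[2], X[1] / X[2], 1.0]

def epipolar_residual(E, x1, x2):
    """x2^T E x1, which is zero for a geometrically consistent match."""
    Ex1 = matvec(E, x1)
    return sum(x2[i] * Ex1[i] for i in range(3))

# Hypothetical second camera: rotated 10 degrees about the y-axis, then shifted.
theta = math.radians(10.0)
R = [[math.cos(theta), 0.0, math.sin(theta)],
     [0.0, 1.0, 0.0],
     [-math.sin(theta), 0.0, math.cos(theta)]]
t = [-0.5, 0.0, 0.1]

X1 = [0.3, -0.2, 4.0]                              # 3D point in camera-1 frame
X2 = [a + b for a, b in zip(matvec(R, X1), t)]     # same point in camera-2 frame

E = essential_matrix(R, t)
x1, x2 = project(X1), project(X2)
residual_good = epipolar_residual(E, x1, x2)

# Shifting the second observation off its epipolar line breaks the constraint.
x2_bad = [x2[0], x2[1] + 0.1, 1.0]
residual_bad = epipolar_residual(E, x1, x2_bad)
```

In a real pipeline the direction is reversed: `E` (or the fundamental matrix) is estimated from many matched features, and `R` and `t` are then recovered from it.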
Neural Radiance Fields (NeRF) is an innovative approach to view synthesis that pushed the boundaries of computer vision and graphics. NeRF's underlying neural network models the radiance and geometry of a scene from a sparse set of images and their camera poses, enabling it to generate novel views. In this phase, we build a vanilla NeRF model (with modifications to work on a lighter GPU) following the original paper by Mildenhall et al.
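As an illustration of how predicted radiance and density become a pixel, the following is a minimal sketch of the discrete volume rendering rule from the NeRF paper, written in plain Python for a single ray. The function name `composite_ray` and the sample values are hypothetical. A real implementation would do this on batched tensors, and this is not necessarily how the project's code is organized.

```python
import math

def composite_ray(sigmas, colors, t_vals):
    """Alpha-composite samples along one ray.

    sigmas: volume density at each sample
    colors: RGB triple at each sample
    t_vals: distance of each sample from the ray origin, in increasing order
    Returns (pixel_rgb, weights).
    """
    n = len(sigmas)
    pixel = [0.0, 0.0, 0.0]
    weights = []
    transmittance = 1.0  # fraction of light that survives up to the current sample
    for i in range(n):
        # Spacing to the next sample; the last interval is treated as very
        # large (an assumption borrowed from common reference implementations).
        delta = t_vals[i + 1] - t_vals[i] if i + 1 < n else 1e10
        alpha = 1.0 - math.exp(-sigmas[i] * delta)  # opacity of this segment
        w = transmittance * alpha
        weights.append(w)
        for c in range(3):
            pixel[c] += w * colors[i][c]
        transmittance *= 1.0 - alpha
    return pixel, weights

# Empty space contributes nothing to the pixel.
empty_pixel, _ = composite_ray([0.0, 0.0], [[1, 0, 0], [0, 0, 1]], [2.0, 3.0])

# A fully opaque first sample hides everything behind it.
opaque_pixel, opaque_weights = composite_ray(
    [1e6, 1.0], [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [2.0, 3.0])
```

The weights sum to at most 1, so any remaining transmittance can be read as the contribution of the background.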
Download the data required for the package (Lego dataset) from here. Once the package has been downloaded, run the following commands from the package's root directory for training and inference, respectively:
python phase2/code/train.py
python phase2/code/test.py
Use the `--help` argument for further explanation of the arguments.
This is a brief summary of the results obtained. Details of the training regime, along with the network architecture, are given below:
- Dataset: Lego Dataset (resized to 100x100)
- Optimizer: Adam with a learning rate of 5e-4
- Number of samples per ray: 64
- Mini-batch size: 5000 (number of query points to be evaluated in one pass)
- Near and far planes: 2.0 - 6.0
- Number of encoding frequencies: 10 for position, 4 for direction
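To show how several of these settings fit together, here is a minimal plain-Python sketch of stratified sampling between the near and far planes, the sinusoidal positional encoding, and splitting query points into mini-batches. All function names are hypothetical. The encoding follows the formula in the paper, and implementations differ on details such as whether the raw input is appended, so treat this as an illustration under those assumptions rather than the project's actual code.

```python
import math
import random

def stratified_samples(near, far, n_samples, rng):
    """Draw one uniformly random depth from each of n_samples equal bins."""
    bin_width = (far - near) / n_samples
    return [near + (i + rng.random()) * bin_width for i in range(n_samples)]

def positional_encoding(vec, n_freqs):
    """Map each coordinate x to sin(2^k * pi * x), cos(2^k * pi * x) for k < n_freqs."""
    out = []
    for x in vec:
        for k in range(n_freqs):
            freq = (2.0 ** k) * math.pi
            out.append(math.sin(freq * x))
            out.append(math.cos(freq * x))
    return out

def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

rng = random.Random(0)                      # fixed seed, for reproducibility only
t_vals = stratified_samples(2.0, 6.0, 64, rng)

pos_code = positional_encoding([0.1, -0.4, 0.7], 10)   # 3 coords * 2 * 10 = 60 values
dir_code = positional_encoding([0.0, 0.0, 1.0], 4)     # 3 coords * 2 * 4  = 24 values

# If every pixel of a 100x100 image were rendered with 64 samples per ray,
# that would be 640,000 query points, or 128 mini-batches of 5000.
n_batches = sum(1 for _ in chunks(range(100 * 100 * 64), 5000))
```

Evaluating the query points in fixed-size chunks bounds peak memory use, which is presumably what makes the model practical on a lighter GPU.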
| Train-Val Loss |
|---|
| Training and validation loss over epochs |

| PSNR | Result |
|---|---|
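For reference, PSNR is derived directly from the mean squared error between a rendered image and its ground truth. The sketch below assumes pixel values in [0, 1], so the peak value is 1. The function names are hypothetical and the numbers are made up for illustration; they are not results from this project.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-length sequences of pixel values."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)

def psnr(mse_value, max_value=1.0):
    """Peak signal-to-noise ratio in dB; undefined when mse_value is 0."""
    return 10.0 * math.log10(max_value ** 2 / mse_value)

# Made-up pixel values: every pixel is off by 0.1, so the MSE is about 0.01.
example_psnr = psnr(mse([0.5, 0.5, 0.5], [0.4, 0.6, 0.4]))
```

Because the relationship is logarithmic, each tenfold reduction in MSE adds 10 dB of PSNR.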
- Mildenhall, B., et al. (2020). NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. In Proceedings of the European Conference on Computer Vision (ECCV).
- https://www.scratchapixel.com/lessons/3d-basic-rendering/ray-tracing-generating-camera-rays/generating-camera-rays.html
- https://github.com/yenchenlin/nerf-pytorch
- https://colab.research.google.com/drive/1TppdSsLz8uKoNwqJqDGg8se8BHQcvg_K?usp=sharing