WAI-NERF-VIZ

A real-time, interactive web-based 3D visualization tool for WAI/Nerfstudio datasets.

This tool allows researchers to inspect preprocessing results, specifically RGB images coupled with depth maps. It projects 2D RGB-D data into an interactive 3D point cloud environment, complete with camera frustums, accessible directly from a web browser.

Primary Use Case: Verifying the geometric consistency of WAI/Nerfstudio-structured datasets that include depth maps.



Overview

This tool was created to visualize and analyze the results of preprocessing datasets in the WAI format, particularly those with depth maps generated by the mvsanywhere model. It provides an interactive 3D visualization using Viser, allowing users to inspect point clouds generated from RGB-D data and examine camera frustums.

The tool has been primarily developed and tested on the DL3DV-10K dataset. Other datasets may require adjustments to parameters or data loading logic depending on their specific structure and characteristics.

Much of the processing logic and approach was inspired by the map-anything model from Facebook Research.

Features

  • Multi-format Depth Map Support: Handles depth maps in .exr, .png, and .npy formats with automatic format detection and fallback mechanisms (see the loading sketch after this list)
  • Interactive 3D Visualization: Real-time web-based visualization using Viser, accessible through any modern web browser
  • Point Cloud Generation: Automatic unprojection of RGB-D data into colored 3D point clouds using a pinhole camera model
  • Camera Frustum Visualization: Visual representation of camera poses and viewing frustums for each frame
  • Real-time Parameter Adjustment: Interactive GUI controls for adjusting depth scale, point size, and maximum depth filtering
  • Frame Sampling: Configurable frame skipping for processing large datasets efficiently
  • Modular Architecture: Clean separation of concerns with dedicated modules for data loading, geometry processing, and visualization
  • Remote Server Support: Designed for remote server access with port forwarding capabilities
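
For illustration, here is a minimal loading sketch under assumed conventions; the helper name load_depth, the EXR "R" channel, and the millimetre PNG encoding are assumptions, not the tool's actual code:

import os
os.environ.setdefault("OPENCV_IO_ENABLE_OPENEXR", "1")  # must be set before cv2 is first imported

from pathlib import Path
import numpy as np

def load_depth(path: Path) -> np.ndarray:
    """Load a depth map as a float32 (H, W) array from .npy, .png, or .exr."""
    suffix = path.suffix.lower()
    if suffix == ".npy":
        return np.load(path).astype(np.float32)
    if suffix == ".png":
        import cv2
        raw = cv2.imread(str(path), cv2.IMREAD_UNCHANGED)  # keep 16-bit values
        return raw.astype(np.float32) / 1000.0  # assumes millimetre encoding
    if suffix == ".exr":
        try:
            import OpenEXR, Imath  # optional dependency (pip install OpenEXR)
            exr = OpenEXR.InputFile(str(path))
            dw = exr.header()["dataWindow"]
            h, w = dw.max.y - dw.min.y + 1, dw.max.x - dw.min.x + 1
            # Some depth EXRs store the channel as "Y" or "Z" instead of "R".
            buf = exr.channel("R", Imath.PixelType(Imath.PixelType.FLOAT))
            return np.frombuffer(buf, dtype=np.float32).reshape(h, w)
        except ImportError:
            import cv2  # fallback: OpenCV builds with EXR support can decode it
            return cv2.imread(str(path), cv2.IMREAD_UNCHANGED).astype(np.float32)
    raise ValueError(f"Unsupported depth format: {suffix}")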

Dataset Structure (WAI)

This tool expects datasets in the WAI format with the following structure:

.
├── scene_meta.json                    # Contains intrinsics and extrinsics
├── images/                            # Contains .jpg images
├── mvsanywhere/
│   └── v0/
│       ├── depth/                      # Contains depth maps (e.g., frame_00001.exr)
│       └── depth_confidence/          # Contains depth confidence maps (e.g., frame_00001.exr)
└── scene_meta_distorted.json          # Optional: distorted scene metadata

The tool reads camera intrinsics and extrinsics from scene_meta.json, RGB images from the images/ directory, and depth maps from mvsanywhere/v0/depth/. Depth maps can be in .exr, .png, or .npy formats.
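
As a rough sketch of how such metadata might be parsed (the key names below follow common Nerfstudio conventions and are assumptions, not a confirmed WAI schema):

import json
import numpy as np

with open("scene_meta.json") as f:
    meta = json.load(f)

# Hypothetical per-frame entry; "frames", "fl_x", "cx", and
# "transform_matrix" are assumed keys, not the actual WAI spec.
frame = meta["frames"][0]
fx, fy = frame["fl_x"], frame["fl_y"]
cx, cy = frame["cx"], frame["cy"]
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])
T_world_cam = np.array(frame["transform_matrix"])  # 4x4 camera-to-world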

Input

The tool takes as input:

  • RGB Images: Color images (.jpg format) from the dataset's images/ directory
  • Depth Maps: Per-frame depth maps (.exr, .png, or .npy format) from the mvsanywhere/v0/depth/ directory
  • Camera Parameters: Intrinsic parameters (focal length, principal point) and extrinsic parameters (camera-to-world transformation matrices) from scene_meta.json

Processing

The tool performs the following operations:

  1. Data Loading: Reads RGB images, depth maps, and camera metadata from the dataset structure
  2. Point Cloud Generation: Unprojects depth maps to 3D points using a pinhole camera model, combining depth values with RGB colors (sketched after this list)
  3. Coordinate Transformation: Transforms points from camera coordinates to world coordinates using the provided camera extrinsics
  4. Scene Assembly: Aggregates point clouds from multiple frames into a unified 3D scene
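
Steps 2 and 3 boil down to standard pinhole back-projection followed by a rigid camera-to-world transform. A minimal numpy sketch (the function name and argument conventions are illustrative, not the tool's actual code):

import numpy as np

def unproject(depth, rgb, K, T_world_cam, max_depth=100.0):
    """Lift an (H, W) depth map into world-space points with per-point colors."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    valid = (z > 0) & (z < max_depth)  # drop invalid and far points
    u, v, z = u.ravel()[valid], v.ravel()[valid], z[valid]
    # Pinhole back-projection: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z], axis=-1)
    # Camera -> world: rotate and translate with the 4x4 extrinsics.
    pts_world = pts_cam @ T_world_cam[:3, :3].T + T_world_cam[:3, 3]
    colors = rgb.reshape(-1, 3)[valid]
    return pts_world, colors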

Output

The tool produces:

  • Interactive 3D Visualization: A web-based visualization server (via Viser) accessible through any modern browser
  • Point Cloud Visualization: A colored 3D point cloud representing the scene geometry, generated from the RGB-D data
  • Camera Frustum Visualization: Visual representations of camera poses and viewing frustums for each frame (a minimal Viser sketch follows this list)
  • Interactive Controls: Real-time GUI controls for adjusting visualization parameters:
    • Depth scale
    • Point size
    • Maximum depth cutoff
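
For orientation, this is roughly how such a scene is assembled with Viser (method names follow recent Viser releases and may differ slightly across versions; pts_world, colors, K, h, and w come from the sketches above):

import numpy as np
import viser

server = viser.ViserServer(host="0.0.0.0", port=8080)

# Points and colors as produced by the unprojection step.
cloud = server.scene.add_point_cloud(
    "/scene/points",
    points=pts_world.astype(np.float32),
    colors=colors,
    point_size=0.01,
)

# One frustum per frame; FoV and aspect derived from the intrinsics
# (pose arguments omitted here for brevity).
server.scene.add_camera_frustum(
    "/scene/frame_00001/frustum",
    fov=2 * np.arctan2(h / 2, K[1, 1]),  # vertical FoV from fy
    aspect=w / h,
    scale=0.1,
)

server.sleep_forever()  # keep serving until interrupted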

Installation

Install the package in editable mode:

pip install -e .

Or install dependencies directly:

pip install -r requirements.txt

For EXR depth map support, you may also need to install OpenEXR:

pip install OpenEXR

Usage

Run the CLI tool pointing to your dataset root:

python -m wai_nerf_viz.cli --dataset_root /path/to/dataset [options]

Or use the console script (if installed):

wai-nerf-viz --dataset_root /path/to/dataset [options]

Remote Access

If you are running this on a remote server (e.g., via SSH), forward the port to your local machine:

  1. On Server:
    wai-nerf-viz --dataset_root ./data/my_scene --port 8080
  2. On Local Machine:
    ssh -L 8080:localhost:8080 user@remote-server

  3. In Your Browser:
    Open http://localhost:8080

CLI Arguments

The following command-line arguments are available:

  • --dataset_root: Path to dataset root directory (required)
  • --port: Port for Viser server (default: 8080)
  • --host: Host for Viser server (default: 0.0.0.0)
  • --frame_skip: Process every Nth frame (default: 1, process all frames)
  • --downsample: Downsample factor for point cloud (default: 4)
  • --default_depth_scale: Default depth scale (default: 1.0)
  • --default_max_depth: Default max depth cutoff (default: 100.0)
  • --default_point_size: Default point size (default: 0.01)
  • --log_filepath: Optional path to a log file (default: None)
  • --debug: Enable debug mode (default: False)

Example

python -m wai_nerf_viz.cli --dataset_root ./my_dataset --port 8080 --downsample 4 --frame_skip 5

This will start a Viser server on port 8080, processing every 5th frame and downsampling the point cloud by a factor of 4.

Interactive Controls

Once the visualization is running, you can use the following GUI controls:

  • Depth Scale: Adjust the scaling factor applied to depth values
  • Point Size: Control the size of points in the point cloud
  • Max Depth: Set the maximum depth cutoff for filtering points

These controls update the visualization in real-time, allowing you to explore different parameter settings.
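
For reference, live controls of this kind are typically wired up through Viser's GUI API roughly as follows (cloud is the point-cloud handle from the earlier sketch; rebuild_point_cloud is a hypothetical helper, not the tool's actual code):

point_size = server.gui.add_slider(
    "Point Size", min=0.001, max=0.1, step=0.001, initial_value=0.01
)
max_depth = server.gui.add_slider(
    "Max Depth", min=1.0, max=200.0, step=1.0, initial_value=100.0
)

@point_size.on_update
def _(_event) -> None:
    cloud.point_size = point_size.value  # updates the live handle in place

@max_depth.on_update
def _(_event) -> None:
    rebuild_point_cloud(max_depth.value)  # hypothetical re-filter + re-add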

Examples

Example visualizations and results can be found in the examples/ directory.

DL3DV-10K Dataset Results

Results from visualizing the DL3DV-10K dataset preprocessing (example images in examples/).

Notes

  • This tool has been primarily tested on the DL3DV-10K dataset. Other datasets may require parameter adjustments or modifications to the data loading logic.
  • The depth map loading supports multiple formats (.exr, .png, .npy) with fallback mechanisms for EXR files.
  • For remote server access, use port forwarding in your SSH configuration to access the Viser server from your local machine.

Acknowledgments

This project was inspired by and builds upon work from the map-anything model by Facebook Research, particularly in the context of processing the DL3DV-10K dataset.
