WAI-NERF-VIZ

A real-time, interactive web-based 3D visualization tool for WAI/Nerfstudio datasets.

This tool allows researchers to inspect preprocessing results, specifically RGB images coupled with depth maps. It projects 2D RGB-D data into an interactive 3D point cloud environment, complete with camera frustums, accessible directly from a web browser.

Primary Use Case: Verifying the geometric consistency of WAI/Nerfstudio-structured datasets that include depth maps.



Overview

This tool was created to visualize and analyze the results of preprocessing datasets in the WAI format, particularly those with depth maps generated by the mvsanywhere model. It provides an interactive 3D visualization using Viser, allowing users to inspect point clouds generated from RGB-D data and examine camera frustums.

The tool has been primarily developed and tested on the DL3DV-10K dataset. Other datasets may require adjustments to parameters or data loading logic depending on their specific structure and characteristics.

Much of the processing logic and approach was inspired by the map-anything model from Facebook Research.

Features

  • Multi-format Depth Map Support: Handles depth maps in .exr, .png, and .npy formats with automatic format detection and fallback mechanisms (see the loading sketch after this list)
  • Interactive 3D Visualization: Real-time web-based visualization using Viser, accessible through any modern web browser
  • Point Cloud Generation: Automatic unprojection of RGB-D data into colored 3D point clouds using a pinhole camera model
  • Camera Frustum Visualization: Visual representation of camera poses and viewing frustums for each frame
  • Real-time Parameter Adjustment: Interactive GUI controls for adjusting depth scale, point size, and maximum depth filtering
  • Frame Sampling: Configurable frame skipping for processing large datasets efficiently
  • Modular Architecture: Clean separation of concerns with dedicated modules for data loading, geometry processing, and visualization
  • Remote Server Support: Designed for remote server access with port forwarding capabilities
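
For illustration, here is a minimal loading sketch under assumed conventions; the helper name load_depth, the EXR "R" channel, and the millimetre PNG encoding are assumptions, not the tool's actual code:

import os
os.environ.setdefault("OPENCV_IO_ENABLE_OPENEXR", "1")  # must be set before cv2 is first imported

from pathlib import Path
import numpy as np

def load_depth(path: Path) -> np.ndarray:
    """Load a depth map as a float32 (H, W) array from .npy, .png, or .exr."""
    suffix = path.suffix.lower()
    if suffix == ".npy":
        return np.load(path).astype(np.float32)
    if suffix == ".png":
        import cv2
        raw = cv2.imread(str(path), cv2.IMREAD_UNCHANGED)  # keep 16-bit values
        return raw.astype(np.float32) / 1000.0  # assumes millimetre encoding
    if suffix == ".exr":
        try:
            import OpenEXR, Imath  # optional dependency (pip install OpenEXR)
            exr = OpenEXR.InputFile(str(path))
            dw = exr.header()["dataWindow"]
            h, w = dw.max.y - dw.min.y + 1, dw.max.x - dw.min.x + 1
            # Some depth EXRs store the channel as "Y" or "Z" instead of "R".
            buf = exr.channel("R", Imath.PixelType(Imath.PixelType.FLOAT))
            return np.frombuffer(buf, dtype=np.float32).reshape(h, w)
        except ImportError:
            import cv2  # fallback: OpenCV builds with EXR support can decode it
            return cv2.imread(str(path), cv2.IMREAD_UNCHANGED).astype(np.float32)
    raise ValueError(f"Unsupported depth format: {suffix}")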

Dataset Structure (WAI)

This tool expects datasets in the WAI format with the following structure:

.
├── scene_meta.json                    # Contains intrinsics and extrinsics
├── images/                            # Contains .jpg images
├── mvsanywhere/
│   └── v0/
│       ├── depth/                      # Contains depth maps (e.g., frame_00001.exr)
│       └── depth_confidence/          # Contains depth confidence maps (e.g., frame_00001.exr)
└── scene_meta_distorted.json          # Optional: distorted scene metadata

The tool reads camera intrinsics and extrinsics from scene_meta.json, RGB images from the images/ directory, and depth maps from mvsanywhere/v0/depth/. Depth maps can be in .exr, .png, or .npy formats.
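
As a rough sketch of how such metadata might be parsed (the key names below follow common Nerfstudio conventions and are assumptions, not a confirmed WAI schema):

import json
import numpy as np

with open("scene_meta.json") as f:
    meta = json.load(f)

# Hypothetical per-frame entry; "frames", "fl_x", "cx", and
# "transform_matrix" are assumed keys, not the actual WAI spec.
frame = meta["frames"][0]
fx, fy = frame["fl_x"], frame["fl_y"]
cx, cy = frame["cx"], frame["cy"]
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])
T_world_cam = np.array(frame["transform_matrix"])  # 4x4 camera-to-world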

Input

The tool takes as input:

  • RGB Images: Color images (.jpg format) from the dataset's images/ directory
  • Depth Maps: Per-frame depth maps (.exr, .png, or .npy format) from the mvsanywhere/v0/depth/ directory
  • Camera Parameters: Intrinsic parameters (focal length, principal point) and extrinsic parameters (camera-to-world transformation matrices) from scene_meta.json

Processing

The tool performs the following operations:

  1. Data Loading: Reads RGB images, depth maps, and camera metadata from the dataset structure
  2. Point Cloud Generation: Unprojects depth maps to 3D points using a pinhole camera model, combining depth values with RGB colors (sketched after this list)
  3. Coordinate Transformation: Transforms points from camera coordinates to world coordinates using the provided camera extrinsics
  4. Scene Assembly: Aggregates point clouds from multiple frames into a unified 3D scene
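
Steps 2 and 3 boil down to standard pinhole back-projection followed by a rigid camera-to-world transform. A minimal numpy sketch (the function name and argument conventions are illustrative, not the tool's actual code):

import numpy as np

def unproject(depth, rgb, K, T_world_cam, max_depth=100.0):
    """Lift an (H, W) depth map into world-space points with per-point colors."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    valid = (z > 0) & (z < max_depth)  # drop invalid and far points
    u, v, z = u.ravel()[valid], v.ravel()[valid], z[valid]
    # Pinhole back-projection: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z], axis=-1)
    # Camera -> world: rotate and translate with the 4x4 extrinsics.
    pts_world = pts_cam @ T_world_cam[:3, :3].T + T_world_cam[:3, 3]
    colors = rgb.reshape(-1, 3)[valid]
    return pts_world, colors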

Output

The tool produces:

  • Interactive 3D Visualization: A web-based visualization server (via Viser) accessible through any modern browser
  • Point Cloud Visualization: A colored 3D point cloud representing the scene geometry, generated from the RGB-D data
  • Camera Frustum Visualization: Visual representations of camera poses and viewing frustums for each frame (a minimal Viser sketch follows this list)
  • Interactive Controls: Real-time GUI controls for adjusting visualization parameters:
    • Depth scale
    • Point size
    • Maximum depth cutoff
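
For orientation, this is roughly how such a scene is assembled with Viser (method names follow recent Viser releases and may differ slightly across versions; pts_world, colors, K, h, and w come from the sketches above):

import numpy as np
import viser

server = viser.ViserServer(host="0.0.0.0", port=8080)

# Points and colors as produced by the unprojection step.
cloud = server.scene.add_point_cloud(
    "/scene/points",
    points=pts_world.astype(np.float32),
    colors=colors,
    point_size=0.01,
)

# One frustum per frame; FoV and aspect derived from the intrinsics
# (pose arguments omitted here for brevity).
server.scene.add_camera_frustum(
    "/scene/frame_00001/frustum",
    fov=2 * np.arctan2(h / 2, K[1, 1]),  # vertical FoV from fy
    aspect=w / h,
    scale=0.1,
)

server.sleep_forever()  # keep serving until interrupted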

Installation

Install the package in editable mode:

pip install -e .

Or install dependencies directly:

pip install -r requirements.txt

For EXR depth map support, you may also need to install OpenEXR:

pip install OpenEXR

Usage

Run the CLI tool pointing to your dataset root:

python -m wai_nerf_viz.cli --dataset_root /path/to/dataset [options]

Or use the console script (if installed):

wai-nerf-viz --dataset_root /path/to/dataset [options]

Remote Access

If you are running this on a remote server (e.g., via SSH), forward the port to your local machine:

  1. On Server:
    wai-nerf-viz --dataset_root ./data/my_scene --port 8080
  2. On Local Machine:
    ssh -L 8080:localhost:8080 user@remote-server

  3. In Your Browser:
    Open http://localhost:8080

CLI Arguments

The following command-line arguments are available:

  • --dataset_root: Path to dataset root directory (required)
  • --port: Port for Viser server (default: 8080)
  • --host: Host for Viser server (default: 0.0.0.0)
  • --frame_skip: Process every Nth frame (default: 1, process all frames)
  • --downsample: Downsample factor for point cloud (default: 4)
  • --default_depth_scale: Default depth scale (default: 1.0)
  • --default_max_depth: Default max depth cutoff (default: 100.0)
  • --default_point_size: Default point size (default: 0.01)
  • --log_filepath: Optional path to a log file (default: None)
  • --debug: Enable debug mode (default: False)

Example

python -m wai_nerf_viz.cli --dataset_root ./my_dataset --port 8080 --downsample 4 --frame_skip 5

This will start a Viser server on port 8080, processing every 5th frame and downsampling the point cloud by a factor of 4.

Interactive Controls

Once the visualization is running, you can use the following GUI controls:

  • Depth Scale: Adjust the scaling factor applied to depth values
  • Point Size: Control the size of points in the point cloud
  • Max Depth: Set the maximum depth cutoff for filtering points

These controls update the visualization in real-time, allowing you to explore different parameter settings.
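
For reference, live controls of this kind are typically wired up through Viser's GUI API roughly as follows (cloud is the point-cloud handle from the earlier sketch; rebuild_point_cloud is a hypothetical helper, not the tool's actual code):

point_size = server.gui.add_slider(
    "Point Size", min=0.001, max=0.1, step=0.001, initial_value=0.01
)
max_depth = server.gui.add_slider(
    "Max Depth", min=1.0, max=200.0, step=1.0, initial_value=100.0
)

@point_size.on_update
def _(_event) -> None:
    cloud.point_size = point_size.value  # updates the live handle in place

@max_depth.on_update
def _(_event) -> None:
    rebuild_point_cloud(max_depth.value)  # hypothetical re-filter + re-add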

Examples

Example visualizations and results can be found in the examples/ directory.

DL3DV-10K Dataset Results

Results from visualizing the DL3DV-10K dataset preprocessing (example images in examples/).

Notes

  • This tool has been primarily tested on the DL3DV-10K dataset. Other datasets may require parameter adjustments or modifications to the data loading logic.
  • The depth map loading supports multiple formats (.exr, .png, .npy) with fallback mechanisms for EXR files.
  • For remote server access, use port forwarding in your SSH configuration to access the Viser server from your local machine.

Acknowledgments

This project was inspired by and builds upon work from the map-anything model by Facebook Research, particularly in the context of processing the DL3DV-10K dataset.
