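Naked-Eye 3D Based on Occluded Video Instance Segmentation

Project Overview

This project uses the GenVIS model to produce a naked-eye 3D effect. The model weights used are those trained on the OVIS training set.

Building on the original project, some of the code was modified so that it runs under Python 3.10.

The naked-eye 3D code is split across the following files: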

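demo/autostereoscopy.py handles video input and output.
demo/predictor.py adds the adaptations needed for naked-eye 3D.
demo/visualizer.py implements the method draw_autostereoscopy, which draws the border bars that the naked-eye 3D effect requires.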
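As a rough illustration of the idea behind draw_autostereoscopy, the sketch below draws vertical bars across a frame and then repaints the pixels covered by an instance mask on top of them, so the segmented object appears to pass in front of the bars. This is a minimal NumPy sketch under assumptions, not the repository's implementation: the function name overlay_bars and the parameters num_bars, bar_width, and color are hypothetical.

```python
import numpy as np


def overlay_bars(frame, mask, num_bars=2, bar_width=8, color=(255, 255, 255)):
    """Draw evenly spaced vertical bars, keeping masked pixels in front of them.

    frame: H x W x 3 uint8 image.
    mask:  H x W boolean array, True where the foreground instance is.
    num_bars, bar_width, color: hypothetical layout parameters.
    """
    out = frame.copy()
    width = mask.shape[1]
    for i in range(num_bars):
        # Integer arithmetic keeps bar positions exact for any frame width.
        center = width * (i + 1) // (num_bars + 1)
        left = max(center - bar_width // 2, 0)
        right = min(left + bar_width, width)
        out[:, left:right] = color
    # Restore the original pixels wherever the instance is, so it occludes the bars.
    out[mask] = frame[mask]
    return out


if __name__ == "__main__":
    frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
    mask = np.zeros((1080, 1920), dtype=bool)
    mask[400:700, 600:700] = True  # a fake instance crossing the first bar
    result = overlay_bars(frame, mask)
    print(result[0, 640], result[500, 640])
```

In a real pipeline the mask would come from the per-frame instance predictions, and the bars would typically be drawn only when an object is judged to be in the foreground.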

A demonstration of the project's results is available here.

Environment Setup

For the original project's environment setup, see the installation instructions.

I used Docker, configured as follows:

docker pull pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel

docker run --name pytorch2 --gpus all --privileged -v $PWD/conda:/home/condashare -dt -e NVIDIA_DRIVER_CAPABILITIES=compute,utility -e NVIDIA_VISIBLE_DEVICES=all --shm-size 8G pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel

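Here I mount a directory on Windows into the container. The repository should later be cloned into this folder, which makes the output easy to inspect.

After that, follow the original project's setup steps, but mind the Python version difference: you will need to clone the detectron2 repository, and its setup.cfg file specifies version 3.7, which must be changed to 3.10.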
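The version change can also be scripted. The sketch below is a hedged example that assumes the version appears in setup.cfg on a line of the form python_version=3.7; check the file before relying on it. The helper names bump_python_version and patch_file are hypothetical.

```python
import re
from pathlib import Path


def bump_python_version(cfg_text, old="3.7", new="3.10"):
    """Rewrite `python_version=<old>` lines to `python_version=<new>`.

    Assumes the setting sits alone on its line; other text is left untouched.
    """
    pattern = re.compile(
        r"^([ \t]*python_version[ \t]*=[ \t]*)" + re.escape(old) + r"[ \t]*$",
        re.MULTILINE,
    )
    # A function replacement avoids ambiguity between group references and digits.
    return pattern.sub(lambda m: m.group(1) + new, cfg_text)


def patch_file(path):
    """Apply bump_python_version to a file in place."""
    path = Path(path)
    path.write_text(bump_python_version(path.read_text()))


if __name__ == "__main__":
    sample = "[mypy]\npython_version=3.7\nignore_missing_imports = True\n"
    print(bump_python_version(sample))
```

To apply it to a cloned checkout, call patch_file with the path to that checkout's setup.cfg.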

Usage

The video I used has a resolution of 1920x1080.

To use the model with an R50 backbone, run:

CUDA_VISIBLE_DEVICES=0 python demo/autostereoscopy.py --config-file configs/genvis/ovis/genvis_R50_bs8_online.yaml --video-input /path/to/video --output /path/to/output --opts MODEL.WEIGHTS /path/to/checkpoint_file

To use the model with a Swin-L backbone, run the following (only the config file and the checkpoint path need to change):

CUDA_VISIBLE_DEVICES=0 python demo/autostereoscopy.py --config-file configs/genvis/ovis/genvis_SWIN_bs8_online.yaml --video-input /path/to/video --output /path/to/output --save-frames true --opts MODEL.WEIGHTS /path/to/checkpoint_file

Three paths must be supplied: the video path, the output folder path, and the model checkpoint path.
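If the demo is launched from another script, the three paths can be assembled into an argument list rather than a single string, which avoids whitespace mistakes between flags. The sketch below is illustrative only: build_demo_command is a hypothetical helper, and it uses just the flags that appear in the commands above.

```python
import shlex


def build_demo_command(config_file, video_input, output_dir, weights, save_frames=False):
    """Return the argument list for demo/autostereoscopy.py."""
    cmd = [
        "python", "demo/autostereoscopy.py",
        "--config-file", config_file,
        "--video-input", video_input,
        "--output", output_dir,
    ]
    if save_frames:
        cmd += ["--save-frames", "true"]
    # Everything after --opts is treated as config overrides, so it goes last.
    cmd += ["--opts", "MODEL.WEIGHTS", weights]
    return cmd


if __name__ == "__main__":
    cmd = build_demo_command(
        "configs/genvis/ovis/genvis_R50_bs8_online.yaml",
        "/path/to/video",
        "/path/to/output",
        "/path/to/checkpoint_file",
    )
    print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True, env={**os.environ, "CUDA_VISIBLE_DEVICES": "0"}) to select the GPU in the same way as the shell commands.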

I tested both models; the Swin-L model gives better results.

The following is part of the original project's README.

Getting Started

We provide a script, train_net_genvis.py, that can train all the configs provided in GenVIS.

To train a model with train_net_genvis.py on VIS, first set up the corresponding datasets following Preparing Datasets.

Then run with the pretrained weights for the target VIS dataset from VITA's Model Zoo:

python train_net_genvis.py --num-gpus 4 \
  --config-file configs/genvis/ovis/genvis_R50_bs8_online.yaml \
  MODEL.WEIGHTS vita_r50_ovis.pth

To evaluate a model's performance, use:

python train_net_genvis.py --num-gpus 4 \
  --config-file configs/genvis/ovis/genvis_R50_bs8_online.yaml \
  --eval-only MODEL.WEIGHTS /path/to/checkpoint_file

Model Zoo

Additional weights will be updated soon!

YouTubeVIS-2019

Backbone	Method	AP	AP50	AP75	AR1	AR10	Download
R-50	online	50.0	71.5	54.6	49.5	59.7	model
R-50	semi-online	51.3	72.0	57.8	49.5	60.0	model
Swin-L	online	64.0	84.9	68.3	56.1	69.4	model
Swin-L	semi-online	63.8	85.7	68.5	56.3	68.4	model

YouTubeVIS-2021

Backbone	Method	AP	AP50	AP75	AR1	AR10	Download
R-50	online	47.1	67.5	51.5	41.6	54.7	model
R-50	semi-online	46.3	67.0	50.2	40.6	53.2	model
Swin-L	online	59.6	80.9	65.8	48.7	65.0	model
Swin-L	semi-online	60.1	80.9	66.5	49.1	64.7	model

OVIS

Backbone	Method	AP	AP50	AP75	AR1	AR10	Download
R-50	online	35.8	60.8	36.2	16.3	39.6	model
R-50	semi-online	34.5	59.4	35.0	16.6	38.3	model
Swin-L	online	45.2	69.1	48.4	19.1	48.6	model
Swin-L	semi-online	45.4	69.2	47.8	18.9	49.0	model

License

The majority of GenVIS is licensed under an Apache-2.0 License. However, portions of the project are available under separate license terms: Detectron2 (Apache-2.0 License), IFC (Apache-2.0 License), Mask2Former (MIT License), Deformable-DETR (Apache-2.0 License), and VITA (Apache-2.0 License).
