
Point-Set Alignment Using Weak Labels

With the advent of recent technologies, multi-view RGB-D recordings have become the prevalent way of data acquisition in the operating room (OR). The significant domain gap between standard data and OR data requires methods that are capable of effectively generalizing to this unique and challenging data domain. Therefore, previous works have established methods to leverage 3D information to detect faces in an OR multi-view RGB-D setting. These methods rely on point set registrations; however, real-world 3D point clouds are often noisy and incomplete, which may yield erroneous alignments using existing point set registration methods. In this project, we aim to address this issue by adapting a deep learning-based point-set registration method to achieve more robust rigid transformations on real-world data. We perform quantitative as well as qualitative evaluations of our proposed method and also give an outlook for future improvements.

For details see the full project proposal.

This project was conducted as part of the 2022/23 Machine Learning for 3D Geometry course (IN2392) at the Technical University of Munich.

Problem

The following shows the real-world point cloud data of an OR (see "Operating Room Data" below for details on this data). In the scene, a DL-based detector has identified a person and placed a SMPL mesh at the estimated position. As one can see, the estimate of the head position is not very accurate and needs refinement to be usable for face detection. In this project, different algorithms are compared in terms of how well they compute a rotation and translation that better aligns the head with the person in the OR.

Visualization of the task
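
For reference, the rigid refinement described above boils down to estimating a rotation R and a translation t between two point sets. When correspondences are available (DCP predicts soft correspondences and then solves this step in closed form), the least-squares rigid transform can be recovered with an SVD-based Kabsch/Procrustes step. The following is a minimal NumPy sketch of that closed-form step for already-matched points; it is an illustration, not the project's own code.

import numpy as np

def rigid_align(src, dst):
    """Closed-form least-squares rigid transform (Kabsch/Procrustes).

    src, dst: (N, 3) arrays of corresponding points.
    Returns R (3x3), t (3,) such that R @ src[i] + t ~= dst[i].
    """
    src_c = src - src.mean(axis=0)              # center both point sets
    dst_c = dst - dst.mean(axis=0)
    H = src_c.T @ dst_c                         # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Toy usage: recover a known rotation and translation.
rng = np.random.default_rng(0)
src = rng.normal(size=(500, 3))
R_gt, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(R_gt) < 0:                     # ensure a proper rotation
    R_gt[:, 0] *= -1
t_gt = np.array([0.1, -0.2, 0.3])
dst = src @ R_gt.T + t_gt
R, t = rigid_align(src, dst)
print(np.allclose(R, R_gt), np.allclose(t, t_gt))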

Synthetic Dataset

DCP was originally trained on the ModelNet40 dataset, which deviates too much from our medical setting. Therefore, to train the DCP architecture, we create a synthetic dataset that imitates the real OR data as closely as possible.

The parameters for the synthetically created SMPL poses can be found here (128 MB). You may place the file under data/smpl_training_poses.pkl.

It has over 90k different poses sampled from the HumanAct12 dataset.
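
As an illustration of how such a pose file might be consumed, here is a small loading sketch. It assumes the pickle stores an indexable sequence of SMPL pose parameter vectors; the exact layout of smpl_training_poses.pkl is an assumption, not something documented above.

import pickle
import numpy as np

# Hypothetical loader: assumes smpl_training_poses.pkl holds a sequence of
# SMPL pose parameter vectors (the exact layout may differ).
with open("data/smpl_training_poses.pkl", "rb") as f:
    poses = pickle.load(f)

print("number of poses:", len(poses))

# Draw a random subset of 100 poses, e.g. for a quick sanity visualization.
rng = np.random.default_rng(42)
indices = rng.choice(len(poses), size=100, replace=False)
subset = [poses[i] for i in indices]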

The following GIF visualizes 100 of these poses (code used to create it)

We augment the SMPL meshes with accessories to further mimic the real data. The mesh is cropped around the head, as this is the part we want to align. We sample points on the meshes, add some noise, and use the resulting point cloud as input to the model.
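
The sketch below illustrates this kind of preprocessing: cropping the point set around the head, subsampling a fixed number of points, and adding Gaussian noise. For brevity it samples mesh vertices instead of the mesh surface, and the crop radius, point count, and noise scale are illustrative assumptions rather than the values used in this project.

import numpy as np

def make_head_pointcloud(mesh_vertices, head_center, radius=0.25,
                         n_points=1024, noise_std=0.005, rng=None):
    """Crop a vertex cloud around the head, subsample it, and add noise.

    mesh_vertices: (V, 3) vertices of an (augmented) SMPL mesh.
    head_center:   (3,) approximate head location, e.g. the SMPL head joint.
    All default values here are illustrative, not the project's settings.
    """
    if rng is None:
        rng = np.random.default_rng()

    # Keep only points within `radius` of the head.
    mask = np.linalg.norm(mesh_vertices - head_center, axis=1) < radius
    head_points = mesh_vertices[mask]

    # Subsample to a fixed size (with replacement if the crop is small).
    replace = len(head_points) < n_points
    idx = rng.choice(len(head_points), size=n_points, replace=replace)
    sampled = head_points[idx]

    # Perturb with Gaussian noise to imitate sensor imperfections.
    return sampled + rng.normal(scale=noise_std, size=sampled.shape)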

Results

In the following, the red mesh is the point cloud we are trying to align to the recorded 3D scene. The head is rendered into the scene as predicted by coarse full-body detection.

After rigid alignment with DCP

For more quantitative as well as qualitative results see the full project report or the supplementary materials.

We evaluated the performance of DCP and FilterReg on both real-world and synthetic data; a sketch of the kind of error metrics used for such comparisons follows the two lists below.

Synthetic Data

  • DCP-v2 outperformed DCP-v1
  • DCP trained on our synthetic data outperformed DCP trained on ModelNet40
  • Synthetically trained DCP-v2 performed roughly on par with FilterReg

Real-world Data

  • FilterReg performs considerably better than DCP
  • DCP does not seem to cope well with real-world imperfections (other architectures, such as PRNet, might be promising for further research)
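
For reference, comparisons like the ones above are commonly quantified with a rotation error (the geodesic angle between the predicted and ground-truth rotation) and a translation error (Euclidean distance between the predicted and ground-truth translation). The following is an illustrative NumPy sketch of these metrics, not the exact evaluation code used for the report.

import numpy as np

def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle (in degrees) between two rotation matrices."""
    cos = (np.trace(R_pred.T @ R_gt) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def translation_error(t_pred, t_gt):
    """Euclidean distance between predicted and ground-truth translations."""
    return np.linalg.norm(t_pred - t_gt)

# Toy usage with an identity ground truth.
R_gt, t_gt = np.eye(3), np.zeros(3)
R_pred = np.eye(3)
t_pred = np.array([0.01, 0.0, -0.02])
print(rotation_error_deg(R_pred, R_gt), translation_error(t_pred, t_gt))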

Dependencies

pip install -r requirements.txt

You may want to use a virtual environment and dependency manager (e.g., Conda).

To install PyTorch3D see https://github.com/facebookresearch/pytorch3d/blob/main/INSTALL.md
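
As a quick sanity check that the environment is set up (this snippet is just an illustration, not one of the project's scripts):

import torch
import pytorch3d

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("pytorch3d:", pytorch3d.__version__)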

SMPL Models

The SMPL models can be downloaded from: https://smpl.is.tue.mpg.de

They are to be placed at ./data/smpl_models/

We did not create them, nor do we hold any intellectual property on them. We use them for non-commercial scientific purposes, as granted under the license provided by the SMPL authors.

Therefore, we adhere to their requirement not to publish or distribute their code/model, which is why ./lib/smplmodel/ is empty in this project. To run the SMPL model, the code provided in EasyMocap may be of interest.

Please review and comply with the license requirements of the SMPL authors as stated on their linked webpage before using any code.

Operating Room Data

As the data captured in the OR is proprietary to the Chair of Computer Aided Medical Procedures (CAMP) at TUM, we are not able to publish it here.

If you are in legal possession of the data, you may download it and save it under ./data/trial. The 2D face bounding box annotations are available at ./data/gt, as these are part of our contribution.

Without this data, you cannot test the synthetically trained model on real-world data. You can run python ./demo/visualize_pointcloud.py to see a single frame as a demo and get a feel for what the data looks like. The rest of the project works as is.

Contributors

Further Work

PRNet is a DL-based partial-to-partial registration approach. We deemed it potentially suitable for our application, as our real-world measurements are also only partial, but we were not able to reach reasonable results with it.

The code of PRNet is included in this project and may be the subject of further experiments.

As the detection algorithm that provides the initial alignment estimate is also DL-based, it should be possible to create an end-to-end trainable pipeline.

References

This project is based on the following works; we hereby acknowledge the intellectual property used in this project. We have tagged the parts of the code where their work was used, either directly or in refactored form. For further references, see the full project report.
