
Inference on custom synthetic renders failing #17

Open
maturk opened this issue Jan 8, 2023 · 1 comment

maturk commented Jan 8, 2023

Hi @zubair-irshad,

I am trying to run evaluations on my own dataset of renders of synthetic ShapeNet models (the same models you trained on), but the inference script fails. The object detection pipeline appears to break down (see the heatmap output below), and the resulting point cloud reconstruction and bounding boxes are wrong. Each render contains only a single object in the middle of the frame. Here are my input color and depth images:
[image: 0_color]
[image: 0_depth]

And here is the output from the inference script:

Peaks output:
[image: 2_peaks_output]

Bounding box output:
[image: box3d2]

Point cloud projection output:
[image: projection2]

Let me know if you have any ideas on how to get inference working on these kinds of synthetic renders. Many thanks!

Matias

zubair-irshad (Owner) commented Jan 13, 2023

Hi @maturk,

Thanks for your interest in our work. A few things would be worth looking into:

  1. What are the camera intrinsics of your ShapeNet renderings? Do they match, or at least come close to, those of the camera used to render the NOCS synthetic data? Please also see FAQ 1 here in case it is helpful. (A minimal intrinsics sanity check is sketched after this list.)

  2. In what form is the depth input to the network? I presume your depth is object-centric rather than scene-centric, so there may be a mismatch between how we trained the model and how you are performing inference. Please see the image below for the scene depth we use as input to the model; it can be found under camera_composed_depths here. This is what the original NOCS dataset provides, and we train and run inference this way to reduce the sim2real gap, since real depth is usually scene-centric rather than object-centric. (A depth-compositing sketch follows the image below.)

Note that you may train the model on your own data from scratch (highly recommended), but since you are interested in zero-shot inference, it would be good to test the model on data that matches the training distribution.

  3. Which checkpoint are you using to perform inference? Please note that the checkpoints we have released only work for real scenes and may be suboptimal for synthetic scenes (i.e., in the following notebook).
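
To expand on point 1, a quick way to check for an intrinsics mismatch is to compare your renderer's camera matrix against the one the model was trained with. This is only a minimal sketch: the NOCS CAMERA values below are an assumption based on commonly cited numbers for that dataset (please verify against the FAQ), and `MY_RENDER_K` is a hypothetical placeholder for your renderer's matrix:

```python
import numpy as np

# Intrinsics commonly cited for the NOCS CAMERA (synthetic) split --
# an assumption here; verify against the dataset/FAQ before relying on them.
NOCS_CAMERA_K = np.array([
    [577.5,   0.0, 319.5],   # fx,  0, cx
    [  0.0, 577.5, 239.5],   #  0, fy, cy
    [  0.0,   0.0,   1.0],
])

# Hypothetical placeholder: fill in the intrinsics your renderer actually used.
MY_RENDER_K = np.array([
    [600.0,   0.0, 320.0],
    [  0.0, 600.0, 240.0],
    [  0.0,   0.0,   1.0],
])

def compare_intrinsics(k_train, k_test, rel_tol=0.05):
    """Print focal/principal-point differences and warn past rel_tol."""
    fx_ratio = k_test[0, 0] / k_train[0, 0]
    fy_ratio = k_test[1, 1] / k_train[1, 1]
    print(f"focal ratios: fx {fx_ratio:.3f}, fy {fy_ratio:.3f}")
    print(f"principal point delta: cx {k_test[0, 2] - k_train[0, 2]:+.1f}px, "
          f"cy {k_test[1, 2] - k_train[1, 2]:+.1f}px")
    if abs(fx_ratio - 1.0) > rel_tol or abs(fy_ratio - 1.0) > rel_tol:
        print("WARNING: focal lengths differ noticeably; detected peaks and "
              "back-projected point clouds will be scaled/shifted.")

compare_intrinsics(NOCS_CAMERA_K, MY_RENDER_K)
```

A large focal-length ratio or principal-point offset would explain both the bad heatmap peaks and the misplaced point cloud back-projection you are seeing.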

[image: example of the scene-centric composed depth used as model input]
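
To illustrate the scene-centric depth point: if your renders store depth only for the object (with empty pixels at zero), one crude way to approximate the composed scene depth the model expects is to fill the empty pixels with a plausible background depth. A minimal sketch, assuming 16-bit PNG depth in millimetres and a flat background at a fixed distance; the real camera_composed_depths composites objects into full scenes, so this is only a rough stand-in:

```python
import numpy as np
import cv2

def compose_scene_depth(object_depth_path, background_mm=2000,
                        out_path="composed_depth.png"):
    """Fill empty pixels of an object-centric depth map with a constant
    background depth (a crude stand-in for a real composed scene).

    Assumes 16-bit depth in millimetres, with 0 marking pixels the
    object does not cover -- both assumptions about your renderer.
    """
    depth = cv2.imread(object_depth_path, cv2.IMREAD_UNCHANGED)
    composed = depth.astype(np.uint16).copy()
    composed[composed == 0] = background_mm  # fill background pixels
    cv2.imwrite(out_path, composed)
    return composed

# composed = compose_scene_depth("0_depth.png")  # hypothetical filename
```

Re-running inference on the composed depth should tell you quickly whether object-centric depth input was the blocker.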

Hope it helps!
