
Generate distance field based on depth image by using render_spherical function #72

yuchenrao opened this issue Apr 20, 2021 · 2 comments

@yuchenrao

Hi,

Thanks a lot for sharing the code!

I am planning to generate a Truncated Signed Distance Field (TDF) from your depth images. Based on my understanding, I can get the tdf from a depth image here:

tdf = depth_to_mesh_df(depth_im, th, False, 1.0, 2.2)

Is that correct?

Another question: do I need to change the camera distance in the above function based on the corresponding XML files, or is 2.2 good enough? Since this line:

t.camera.position = np.array([-cam_dist, 0, 0])
seems to need the corresponding camera pose for rendering.

I use 2.2 as the camera distance for now, and the tdf has very small values. I tried to visualize the tdf in voxel grids: the first image is an example of values smaller than 0.05, the second image is an example of values smaller than 0.005, and the third image is the depth image. The second image seems reasonable to me, but could you explain why the values are so small?
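To be concrete, this is roughly how I build the occupancy grids shown below (a minimal sketch; the 128 resolution and the random values are just placeholders for my actual tdf output):

import numpy as np

# Placeholder tdf volume; in practice this is the output of depth_to_mesh_df.
tdf = np.random.rand(128, 128, 128) * 0.1

# Keep only voxels whose distance falls below a threshold, as in the two
# screenshots below (0.05 for the first image, 0.005 for the second).
occupied = tdf < 0.005
print(occupied.sum(), "voxels below threshold")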

Thanks a lot for your help!

[Image 1: Screen Shot 2021-04-20 at 4 34 06 PM (tdf values < 0.05)]
[Image 2: Screen Shot 2021-04-20 at 4 33 29 PM (tdf values < 0.005)]
[Image 3: 03001627_b2c62e5b20b34fad5844a4d0ab925627_view001_depth (depth image)]

ztzhang (Collaborator) commented Apr 23, 2021

The camera distance should be consistent with the XML files, as we assume the distance functions are discretized in a unit cube in global space, since all the models are normalized.

The values are distances in metric space, and are therefore usually on the order of 1 / voxel_size.
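As a rough illustration of the scale involved (a sketch only; the unit cube assumption is from above, and 128 voxels per side is just an example resolution):

# Assumption: models are normalized into a unit cube and the distance field
# is sampled on a regular grid of `res` voxels per side.
res = 128
voxel_size = 1.0 / res            # edge length of one voxel in metric units

# Converting between metric units and voxel units is just this scale factor:
dist_metric = 0.005               # a typical value from the visualization above
dist_voxels = dist_metric / voxel_size
print(dist_voxels)                # 0.64 voxels for this example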

yuchenrao (Author) commented Apr 23, 2021

Thanks a lot for your quick reply!

Here is my understanding based on your answers; please correct me if I am wrong. So, in order to get the correct tdf:

  1. Get a depth image (depth_im) from the dataset and its corresponding camera pose in the world frame (camera_pose).
  2. Use util_sph.render_spherical(data, mask), pass camera_pose to tdf = depth_to_mesh_df(depth_im, th, False, 1.0, camera_pose), and replace the camera placement with t.camera.position = camera_pose (see the sketch after this list).
  3. final_tdf = tdf * voxel_size
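A sketch of what I mean (render_spherical and depth_to_mesh_df are the repo functions discussed above, so the import paths, argument meanings, and all placeholder values here are my assumptions):

import numpy as np
from util_sph import render_spherical          # assumed import path
# depth_to_mesh_df is assumed to be importable from this repo as well

# Step 1: inputs for one view (all values below are placeholders).
data = np.zeros((480, 480))                    # rendered data passed to render_spherical
mask = np.ones((480, 480), dtype=bool)         # object mask
depth_im = np.zeros((480, 480))                # depth image for this view
camera_pose = 2.2                              # camera distance read from the view's XML file
th = 0.05                                      # truncation threshold (example value)

# Step 2: render and compute the distance field, passing the per-view camera
# distance instead of the hardcoded 2.2.
spherical = render_spherical(data, mask)
tdf = depth_to_mesh_df(depth_im, th, False, 1.0, camera_pose)

# Step 3: rescale to get the final tdf (assuming a unit cube at 128^3 resolution).
voxel_size = 1.0 / 128
final_tdf = tdf * voxel_size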

Does this seem correct?
One more question: how can I get voxel_size? By using (depth_max - depth_min) / 128?
