
Different prediction results when using different data loader code for GenRe #73

Open
yuchenrao opened this issue Apr 23, 2021 · 3 comments


@yuchenrao

Hi, I am trying to reproduce your results from the GenRe paper, but I get much worse performance when evaluating with the Pix3D method you mentioned before. Could you give me some advice?

Before anything else, I would like to mention that the outputs for the test samples look pretty good to me.

  1. The data loader scripts for training and testing are different. They use different masks (RGB vs. grayscale) and different preprocessing methods. Here is an example of the different prediction results for the same input (RGB + silhouette): yellow: label, grey: prediction based on the training pipeline, orange: prediction based on the testing pipeline. Could you let me know which data loader and preprocessing I should use for evaluation?
    [screenshot: prediction comparison]
  2. From the issues, I notice that a lot of people run into problems with evaluation. Could you release the specific evaluation code here rather than pointing to the Pix3D evaluation method, since modifying the evaluation code may also introduce errors?

Thanks a lot!

@ztzhang
Collaborator

ztzhang commented Apr 23, 2021

re 1: The loaded mask during training is further preprocessed by this function (the `elif key == 'silhou':` branch), which converts it into a single-channel mask, so there shouldn't be any difference between training & testing, at least for the masking operation.
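For reference, a minimal sketch of what that single-channel conversion amounts to (illustrative only; not the repo's exact preprocessing code):

    import numpy as np

    def to_single_channel_silhouette(mask, thresh=0.5):
        # collapse an H x W x 3 mask to H x W, then binarize it
        if mask.ndim == 3:
            mask = mask.mean(axis=-1)
        return (mask > thresh * mask.max()).astype(np.float32)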
re 2: The original evaluation code was written with an even more outdated version of PyTorch than the one used in this repo, and it relied on custom CUDA kernels for Chamfer distance. What the code did was:

  1. Take the raw output of the network, pass it through a sigmoid for normalization, and treat it as predicted soft voxels.
  2. Specify a list of thresholds, usually in the range 0.3 to 0.5.
  3. For each threshold, run marching cubes on both the predicted soft voxels and the GT surface voxels, turning them into meshes.
  4. Sample 1024 points on each mesh using trimesh.
  5. Calculate the Chamfer distance between those point sets.

For this repo, we reported numbers on the Pix3D dataset using the eval code from Pix3D.
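
For anyone trying to reproduce this, below is a minimal sketch of those five steps. It is not the original evaluation code (which used custom CUDA Chamfer kernels): the library choices (skimage marching cubes, trimesh surface sampling, a KD-tree-based Chamfer distance), the function names, and the GT iso-level of 0.5 are all assumptions.

    import numpy as np
    import trimesh
    from scipy.special import expit          # sigmoid
    from scipy.spatial import cKDTree
    from skimage import measure

    def voxels_to_points(vox, level, n_points=1024):
        # steps 3-4: marching cubes on the voxel grid, then sample the mesh surface
        verts, faces, _, _ = measure.marching_cubes(vox, level=level)
        mesh = trimesh.Trimesh(vertices=verts, faces=faces)
        points, _ = trimesh.sample.sample_surface(mesh, n_points)
        return np.asarray(points)

    def chamfer_distance(pts_a, pts_b):
        # step 5: one common symmetric definition (mean nearest-neighbor
        # distance in both directions)
        d_ab, _ = cKDTree(pts_b).query(pts_a)
        d_ba, _ = cKDTree(pts_a).query(pts_b)
        return d_ab.mean() + d_ba.mean()

    def evaluate(raw_pred_vox, gt_surface_vox, thresholds=(0.3, 0.4, 0.5)):
        soft_pred = expit(raw_pred_vox)      # step 1: sigmoid -> soft voxels
        scores = {}
        for t in thresholds:                 # step 2: sweep the thresholds
            pred_pts = voxels_to_points(soft_pred, t)
            gt_pts = voxels_to_points(gt_surface_vox, 0.5)  # GT is binary; 0.5 is an assumed iso-level
            scores[t] = chamfer_distance(pred_pts, gt_pts)
        return scores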

@yuchenrao
Author

yuchenrao commented Apr 23, 2021

Thanks a lot for your quick reply!

re-re 1:
I think testing and training use different preprocessing methods; correct me if I am wrong:

  1. For the training process, it uses `def preprocess(cls, data, mode='train'):` directly.
  2. For the testing process, it uses the wrapper `def preprocess_wrapper(cls, in_dict):`, which crops the data. That would also explain why the two results look similar but have some dimension offsets. I attach another example which shows this more clearly:

    [screenshot: prediction comparison with dimension offset]

re-re 2:
Thanks a lot for your explanation! I will try to follow your steps and see how it works. Could you also explain more about:

  1. For step 2, I notice that the default threshold is 0.1 in Pix3D. Do you have any preference for which threshold I should choose, or do I need to try them and pick the best one?
  2. For step 3, you mentioned GT surface voxels. Do you mean the file named *_gt_rotvox_samescale_128.npz?

Thanks a lot!

@yuchenrao
Author

yuchenrao commented Apr 25, 2021

@ztzhang Hi~ Here are some updates from my side. Could you take a look when you have time? Thanks a lot!

  1. I use the shapenet_data_loader method for loading data (i.e., model.preprocess rather than model.preprocess_wrapper).
  2. I use the evaluation method from Pix3D; here is part of the code (in genre_full_model.Model_test):
    # imports assumed at module level: numpy as np, torch,
    # os.makedirs, os.path.join, and scipy.ndimage.binary_erosion
    def test_on_batch(self, batch_i, batch, use_trimesh=True):
        outdir = join(self.output_dir, 'batch%04d' % batch_i)
        makedirs(outdir, exist_ok=True)
        pred = self.predict(batch, load_gt=False, no_grad=True)  # does not use trimesh

        output = self.pack_output(pred, batch, add_gt=False)
        self.visualizer.visualize(output, batch_i, outdir)
        np.savez(outdir + '.npz', **output)

        # predicted soft voxels: raw network output passed through sigmoid
        pred_vox = output['pred_voxel'][0][0]
        pred_vox = self.sigmoid(pred_vox)

        # GT surface voxels: load the filled GT voxels, reorient them, and
        # keep only the surface shell via binary erosion
        file1 = batch['rgb_path'][0][:-7] + 'gt_rotvox_samescale_128.npz'
        with np.load(file1) as data:
            val = data['voxel']
            val = np.transpose(val, (0, 2, 1))
            val = np.flip(val, 2)
            voxel_surface = val - binary_erosion(
                val, structure=np.ones((3, 3, 3)), iterations=2).astype(float)
            voxel_surface = voxel_surface[None, ...]
            voxel_surface = np.clip(voxel_surface, 0, 1)
            gt_vox = voxel_surface[0]

        # marching cubes + surface sampling on both voxel grids
        pred_pts = self.get_pts(pred_vox, 0.4, 1024)  # tried 0.3, 0.4, 0.5; results are shown in item 3
        gt_pts = self.get_pts(gt_vox, 0.4, 1024)

        # Chamfer distance between the two point sets
        cd_d = nndistance_score(
            torch.from_numpy(pred_pts).cuda().unsqueeze(0).float(),
            torch.from_numpy(gt_pts).cuda().unsqueeze(0).float())  # nndistance in toolbox

    def get_pts(self, pred_vox, threshold, pts_size):
        empty_voxel = False
        if pred_vox.max() < threshold:
            # dummy isosurface when nothing crosses the threshold
            empty_voxel = True
            points = np.zeros((pts_size, 3))
        else:
            points = self.get_surface_points(pred_vox, threshold, pts_size)  # same function as in pix3d
        if not empty_voxel:
            # center the points and normalize them to unit scale
            bound_l = np.min(points, axis=0)
            bound_h = np.max(points, axis=0)
            points = points - (bound_l + bound_h) / 2
            points = points / (bound_h - bound_l).max()
        return points

Here is an example of pred_vox (red) vs. gt_vox (green); they are not aligned with each other. Do I need to use the camera pose to transform one of them?
[screenshot: misaligned pred_vox (red) vs. gt_vox (green)]
  3. I tested genre_full_model, and here are the results with different thresholds for the different classes. The "paper" values come from Table 1 of the GenRe paper, and the differences are very large.
    [screenshot: per-class Chamfer distances for different thresholds vs. paper values]
  4. I also tried to evaluate on the Pix3D data, and the results do not look correct to me:
     1) I changed the code to resize the image to 256 both vertically and horizontally.
     2) Here is the result for pred_vox (green) vs. gt_vox (blue) for 0019.png.
    [screenshot: pred_vox (green) vs. gt_vox (blue) for 0019.png]

Do you have any ideas about what's going wrong? Thank you very much!
