August 2019
tl;dr: Mono 3DOD based on 3D and 2D consistency, in particular landmark and shape reconstruction.
The paper is written with an overcomplicated math formulation. Overall not very impressive. The consistency part is quite similar to other papers such as deep3dbox.
The morphable wireframe model is fragile, and the authors did not do a thorough ablation study of its contribution. I am not sure shape reconstruction is a good idea, especially for handling corner cases. --> Nobody in the literature actually talks about how to handle corner cases. This needs to be acquired through engineering practice. Maybe a CV method is needed to handle the corner cases.
The paper seems to use an off-the-shelf 3D depth estimate, but this is not described in detail.
- Learn a morphable wireframe model from landmarks (takes 2.5 min, deterministic). --> similar to ROI 10D.
- Metrics: ALP (average localization precision). This metric only evaluates the predicted center location.
- w and h are encoded in exponential form because they need to be positive.
- Where does the label come from?
- The wireframe model is fragile and cannot model under-represented cases.
- von Mises distribution: the circular analogue of the Gaussian distribution.
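
A minimal sketch of two of the notes above: the exponential encoding that keeps w and h positive, and the von Mises density over angles. This is not the paper's implementation. The function names (`decode_size`, `encode_size`, `bessel_i0`, `von_mises_pdf`) and the raw outputs `t_w`, `t_h` are hypothetical, chosen only for illustration.

```python
import math


def decode_size(t_w, t_h):
    """Map unconstrained raw outputs (hypothetical t_w, t_h) to positive w, h."""
    # exp() is strictly positive for any real input, so no clipping is needed.
    return math.exp(t_w), math.exp(t_h)


def encode_size(w, h):
    """Inverse of decode_size: the regression targets for a positive w, h."""
    return math.log(w), math.log(h)


def bessel_i0(x, terms=50):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total = 0.0
    term = 1.0  # k = 0 term of sum_k ((x/2)^k / k!)^2
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total


def von_mises_pdf(theta, mu, kappa):
    """Density of the von Mises distribution with mean angle mu and concentration kappa."""
    # cos() makes the density periodic in theta, which is why it suits angles.
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


if __name__ == "__main__":
    print(decode_size(-3.0, 0.5))           # both values are positive
    print(von_mises_pdf(0.1, 0.0, 4.0))     # density near the mean angle
    print(von_mises_pdf(math.pi, 0.0, 4.0)) # much lower density opposite the mean
```

With kappa = 0 the density reduces to the uniform distribution on the circle; for large kappa it approaches a Gaussian with variance roughly 1/kappa around mu.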