This is a TensorFlow re-implementation of Luo, W., Schwing, A. G., & Urtasun, R. (2016). Efficient Deep Learning for Stereo Matching. CVPR 2016. (https://www.cs.toronto.edu/~urtasun/publications/luo_etal_cvpr16.pdf)
data
└── kitti_2015
    ├── training
    │   ├── image_2
    │   │   ├── 000000_10.png
    │   │   ├── 000001_10.png
    │   │   └── ...
    │   ├── image_3
    │   ├── disp_noc_0
    │   └── ...
    └── testing
        ├── image_2
        └── image_3
python main.py --dataset kitti_2015 --patch-size 37 --disparity-range 201
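To make the two numeric flags concrete, here is a small sketch of the patch geometry they imply under the training scheme described in the paper: a square left patch the size of the receptive field, and a wider right patch that covers every candidate disparity. The function names are hypothetical and the exact shapes used by this repository's data loader are an assumption.

```python
def training_patch_shapes(patch_size, disparity_range):
    """Return (height, width) of the left and right training patches.

    Assumption: the right patch is widened so that a network with a
    patch_size receptive field yields one feature per candidate disparity.
    """
    left = (patch_size, patch_size)
    right = (patch_size, patch_size + disparity_range - 1)
    return left, right


def valid_conv_width(input_width, receptive_field):
    """Output width of an unpadded convolution stack with this receptive field."""
    return input_width - receptive_field + 1


left, right = training_patch_shapes(37, 201)
print(left, right)                     # (37, 37) (37, 237)
print(valid_conv_width(right[1], 37))  # 201
```

With the values from the command above, the right branch would produce 201 feature vectors, one per disparity candidate, to be scored against the single left feature vector.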
- After training for 40k iterations.
- Qualitative results on validation set.
- 3-pixel error evaluation on validation set.
Example input images
Disparity Ground-truth
- Cost-aggregation
Without cost-aggregation
With cost-aggregation
Close-up crops showing the smoothing of predictions, without and with cost aggregation, respectively.
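The paper describes cost aggregation as average pooling over a small window (5x5) of the cost volume. The following is a minimal NumPy sketch of that idea, not the code used in this repository; the function name `aggregate_costs`, the `(H, W, D)` layout, and the edge padding at image borders are assumptions.

```python
import numpy as np


def aggregate_costs(cost_volume, window=5):
    """Average every disparity slice of an (H, W, D) volume over a
    window x window spatial neighbourhood (box filter)."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    pad = window // 2
    h, w, _ = cost_volume.shape
    # Assumption: replicate border values so the output keeps the input size.
    padded = np.pad(cost_volume, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    total = np.zeros(cost_volume.shape, dtype=np.float64)
    # Sum the window by shifting the padded volume over all offsets.
    for dy in range(window):
        for dx in range(window):
            total += padded[dy:dy + h, dx:dx + w, :]
    return total / (window * window)
```

Because each disparity slice is averaged independently, isolated outliers in the per-pixel scores are damped before the best disparity is picked, which is the smoothing visible in the comparison.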
- To compare with the results reported in the paper, see Table 5, column Ours(37). CA stands for cost aggregation.

| Method | 3-pixel error (%) |
|---|---|
| baseline (paper) | 7.13 |
| baseline (re-implementation) | 7.271 |
| baseline + CA (paper) | 6.58 |
| baseline + CA (re-implementation) | 6.527 |
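For reference, a 3-pixel error of this kind can be computed as the percentage of pixels with ground truth whose predicted disparity is off by more than 3 pixels. The sketch below is an illustration rather than the evaluation code behind the numbers above; the function name is hypothetical, and it assumes that a ground-truth value of 0 marks pixels without a measurement, as in the KITTI disparity maps.

```python
import numpy as np


def three_pixel_error(pred, gt, threshold=3.0):
    """Percentage of valid pixels with absolute disparity error > threshold."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > 0  # assumption: 0 means "no ground truth at this pixel"
    if not valid.any():
        raise ValueError("no valid ground-truth pixels")
    errors = np.abs(pred[valid] - gt[valid])
    return 100.0 * np.mean(errors > threshold)


gt = np.array([[10.0, 20.0], [0.0, 30.0]])
pred = np.array([[12.0, 25.0], [99.0, 30.0]])
# Three valid pixels with errors 2, 5 and 0; one of them exceeds 3 pixels.
print(three_pixel_error(pred, gt))
```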
- Implement post-processing to smooth the output.
- Look into error metrics and do quantitative analysis.
- Run inference on test video sequences.
- Instead of the batch matrix multiplication during inference, which constructs a B x H x W x W tensor, use a loop to compute the cost volume over the disparity range. The TensorFlow runtime might work out that it can parallelise the operations in the loop.
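The last item can be illustrated with a NumPy sketch (a single image, so the batch dimension B is dropped). It contrasts the all-pairs product, which yields an H x W x W tensor, with a loop over disparities that yields H x W x D. Function names, the `(H, W, C)` feature layout, and the convention that left pixel x matches right pixel x - d are assumptions; a TensorFlow version would follow the same indexing.

```python
import numpy as np


def cost_volume_matmul(left, right):
    """All-pairs matching scores along each row: (H, W, C) x (H, W, C) -> (H, W, W)."""
    return np.einsum("hxc,hyc->hxy", left, right)


def cost_volume_loop(left, right, disparity_range):
    """Scores for disparities 0 .. disparity_range - 1 only: (H, W, D).

    Entries whose match would fall outside the right image stay at -inf.
    """
    h, w, _ = left.shape
    volume = np.full((h, w, disparity_range), -np.inf)
    for d in range(min(disparity_range, w)):
        # Left pixel x is compared with right pixel x - d.
        volume[:, d:, d] = np.sum(left[:, d:, :] * right[:, :w - d, :], axis=-1)
    return volume
```

For a KITTI-sized image (W around 1242) and a disparity range of 201, the looped volume is roughly six times smaller than the all-pairs tensor, at the price of D sequential steps unless the framework parallelises them.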