Single image depth estimation works fail to separate foreground elements because they can easily be confounded with the background. To alleviate this problem, we propose the use of a semantic segmentation procedure that adds information to a depth estimator, in this case, a 3D Convolutional Neural Network (CNN)-segmentation is coded as one-hot planes representing categories of objects. We explore 2D and 3D models.
View Article and Find Full Text PDF