Depth completion, which aims to predict dense depth maps from sparse depth measurements, plays a crucial role in many computer graphics and computer vision applications. Previous supervised-learning-based approaches have been highly successful at this task, whereas high-precision unsupervised depth completion that does not rely on ground-truth data remains challenging. One main drawback of most previous unsupervised solutions is that they neglect 3D structural information, which often leads to inaccurate spatial propagation and mixed-depth problems. To address these challenges, this paper explores the use of 3D perceptual features and multi-view geometric consistency to devise a high-precision self-supervised depth completion method. Our key contribution is a 3D perceptual spatial propagation scheme, built on a point cloud representation and an attention weighting mechanism, that selects more reasonable and favorable neighbors during depth propagation. Building on this 3D perceptual spatial propagation, we also introduce multi-view geometric constraints between adjacent views to guide the optimization of the whole depth completion model, which yields geometry-consistent depth completion in a self-supervised manner. Extensive experiments on the NYU-Depth-v2, VOID, and KITTI Depth Completion benchmark datasets demonstrate that the proposed model achieves state-of-the-art depth completion performance among unsupervised methods, and performance that is competitive even with previous supervised methods.
Cai et al. (Thu,) studied this question.
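To make the idea of propagating depth along 3D neighbors more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a dense initial depth map and a pinhole camera, and it replaces the learned attention mechanism with a simple softmax over negative squared 3D distances. The function names `backproject` and `propagate_3d`, and the parameters `k` and `temperature`, are hypothetical and introduced only for illustration.

```python
import numpy as np


def backproject(depth, fx, fy, cx, cy):
    """Lift an (H, W) depth map to an (H*W, 3) point cloud (pinhole model)."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]          # v: row index, u: column index
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


def propagate_3d(depth, intrinsics, k=4, temperature=0.1):
    """One propagation step over the k nearest neighbors in 3D space.

    Assumption: the attention weights are a softmax over negative squared
    3D distances, standing in for a learned attention module.
    """
    points = backproject(depth, *intrinsics)

    # Dense pairwise squared distances, shape (N, N); only viable for tiny maps.
    diff = points[:, None, :] - points[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)

    # Indices of the k closest points (each point is its own nearest neighbor).
    idx = np.argsort(dist2, axis=1)[:, :k]
    nearest = np.take_along_axis(dist2, idx, axis=1)

    # Numerically stable softmax over the selected neighbors.
    logits = -nearest / temperature
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # Attention-weighted average of the neighbors' depths.
    flat = depth.reshape(-1)
    return (weights * flat[idx]).sum(axis=1).reshape(depth.shape)


# Toy scene: a near surface (1 m) next to a far surface (5 m).
depth = np.full((4, 4), 5.0)
depth[:, :2] = 1.0
refined = propagate_3d(depth, (2.0, 2.0, 1.5, 1.5), k=4)
```

In this toy scene, every pixel's four nearest 3D neighbors lie on its own surface, so `refined` keeps the two depth levels separate. A fixed 2D window centered on a pixel next to the boundary would instead average values from both surfaces, which is the kind of mixed-depth artifact described above.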