Raw depth images captured in indoor scenarios frequently exhibit extensive missing values due to inherent limitations of the sensors and environments. For example, transparent materials often elude detection by depth sensors, and surfaces may introduce measurement inaccuracies due to polished textures, extended distances, and oblique incidence angles from the sensor. Incomplete depth maps pose significant challenges for subsequent vision applications, prompting the development of numerous depth completion techniques to mitigate this problem. Many methods excel at reconstructing dense depth maps from sparse samples, but they often falter when faced with extensive contiguous regions of missing depth values, a prevalent and critical challenge in indoor environments. To overcome these challenges, we design a novel two-branch end-to-end fusion network named RDFC-GAN, which takes a pair of RGB and incomplete depth images as input to predict a dense, completed depth map. The first branch employs an encoder-decoder structure that, adhering to the Manhattan world assumption and using normal maps derived from RGB-D information as guidance, regresses local dense depth values from the raw depth map. The other branch applies an RGB-depth fusion CycleGAN, adept at translating RGB imagery into detailed, textured depth maps while ensuring high fidelity through cycle consistency. We fuse the two branches via adaptive fusion modules named W-AdaIN and train the model with the help of pseudo depth maps. Comprehensive evaluations on the NYU-Depth V2 and SUN RGB-D datasets show that our method significantly improves depth completion performance, particularly in realistic indoor settings.
DOI: http://dx.doi.org/10.1109/TPAMI.2024.3388004
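The W-AdaIN fusion modules are specific to RDFC-GAN, but they build on the well-known adaptive instance normalization (AdaIN) operation, which re-scales one feature map's per-channel statistics to match another's. A minimal NumPy sketch of plain AdaIN-style fusion between hypothetical depth-branch and RGB-branch feature maps (all variable names and shapes are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def adain(content: np.ndarray, style: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Adaptive instance normalization: re-scale the per-channel statistics of
    `content` features to match those of `style` features.
    Arrays are (channels, height, width)."""
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    return s_std * (content - c_mean) / (c_std + eps) + s_mean

rng = np.random.default_rng(0)
depth_feat = rng.normal(0.0, 1.0, (8, 16, 16))  # hypothetical depth-branch features
rgb_feat = rng.normal(2.0, 3.0, (8, 16, 16))    # hypothetical RGB-branch features
fused = adain(depth_feat, rgb_feat)
# `fused` keeps the depth branch's spatial structure but carries the
# RGB branch's per-channel mean and variance
```

The weighted variant in the paper presumably learns how strongly to apply this statistic transfer; the sketch above shows only the unweighted core operation.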
Sensors (Basel), December 2024
Key Laboratory of Optoelectronic Technology and Systems of the Education Ministry of China, Chongqing University, Chongqing 400044, China.
Six degrees of freedom (6-DoF) object pose estimation is essential for robotic grasping and autonomous driving. While estimating pose from a single RGB image is highly desirable for real-world applications, it presents significant challenges. Many approaches incorporate supplementary information, such as depth data, to derive valuable geometric characteristics.
Sensors (Basel), December 2024
Institute of Computer and Communication Engineering, Department of Electrical Engineering, National Cheng Kung University, Tainan 701, Taiwan.
Precision depth estimation plays a key role in many applications, including 3D scene reconstruction, virtual reality, autonomous driving and human-computer interaction. Through recent advancements in deep learning technologies, monocular depth estimation, with its simplicity, has surpassed traditional stereo camera systems, bringing new possibilities in 3D sensing. In this paper, using a single camera, we propose an end-to-end supervised monocular depth estimation autoencoder, which contains an encoder that mixes a convolutional neural network with vision transformers, and an effective adaptive fusion decoder to obtain high-precision depth maps.
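The abstract's encoder-decoder-with-fusion pipeline can be illustrated at the shape level: an encoder compresses the image, a decoder restores full resolution, and encoder features are fused back in along the way. The sketch below is a dependency-free stand-in (average pooling for encoder stages, nearest-neighbour upsampling for decoder stages, a fixed 50/50 blend for "adaptive" fusion), not the paper's CNN-plus-transformer network:

```python
import numpy as np

def avg_pool2x(x: np.ndarray) -> np.ndarray:
    """2x2 average pooling -- stands in for a learned encoder stage."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2x(x: np.ndarray) -> np.ndarray:
    """Nearest-neighbour 2x upsampling -- stands in for a learned decoder stage."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def toy_depth_autoencoder(gray: np.ndarray) -> np.ndarray:
    """Shape-level sketch of an encoder-decoder depth estimator with a
    skip-style fusion connection: encode to a coarse code, decode back to
    full resolution, fusing decoder features with same-scale encoder features."""
    e1 = avg_pool2x(gray)       # H/2 x W/2
    e2 = avg_pool2x(e1)         # H/4 x W/4 (bottleneck)
    d1 = upsample2x(e2)         # back to H/2 x W/2
    d1 = 0.5 * d1 + 0.5 * e1    # naive fusion; the paper learns this adaptively
    return upsample2x(d1)       # dense H x W "depth" map

img = np.linspace(0.0, 1.0, 64 * 64).reshape(64, 64)  # stand-in grayscale input
pred = toy_depth_autoencoder(img)
```

The key structural point the sketch preserves is that the output is a dense map of the same spatial size as the input, with coarse (bottleneck) and fine (skip) information combined.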
Sensors (Basel), December 2024
School of Physics and Astronomy, University of Glasgow, Glasgow G12 8QQ, UK.
Accurate depth estimation is crucial for many fields, including robotics, navigation, and medical imaging. However, conventional depth sensors often produce low-resolution (LR) depth maps, making detailed scene perception challenging. To address this, enhancing LR depth maps to high-resolution (HR) ones has become essential, guided by HR-structured inputs like RGB or grayscale images.
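A classic baseline for the guided LR-to-HR depth enhancement this abstract describes is joint bilateral upsampling: each high-resolution depth value is a weighted average of nearby low-resolution depth samples, with weights combining spatial distance and similarity in the high-resolution guide image. The sketch below is that classical baseline in plain NumPy, not the method of the paper; the sigma values and window size are illustrative assumptions:

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, guide_hr, scale, sigma_s=1.0, sigma_r=0.1):
    """Guided depth upsampling: weight LR depth samples by (a) spatial
    distance on the LR grid and (b) guide-intensity similarity."""
    H, W = guide_hr.shape
    h, w = depth_lr.shape
    guide_lr = guide_hr[::scale, ::scale][:h, :w]  # guide on the LR grid
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            ci, cj = i / scale, j / scale          # HR pixel in LR coordinates
            i0, i1 = max(0, int(ci) - 1), min(h, int(ci) + 2)
            j0, j1 = max(0, int(cj) - 1), min(w, int(cj) + 2)
            acc, ws = 0.0, 0.0
            for y in range(i0, i1):
                for x in range(j0, j1):
                    d2 = (y - ci) ** 2 + (x - cj) ** 2
                    r2 = (guide_hr[i, j] - guide_lr[y, x]) ** 2
                    wgt = np.exp(-d2 / (2 * sigma_s**2) - r2 / (2 * sigma_r**2))
                    acc += wgt * depth_lr[y, x]
                    ws += wgt
            out[i, j] = acc / ws
    return out

rng = np.random.default_rng(1)
guide = rng.random((32, 32))         # hypothetical HR grayscale guide
depth_lr = np.full((8, 8), 2.0)      # hypothetical LR depth map (constant here)
depth_hr = joint_bilateral_upsample(depth_lr, guide, scale=4)
```

The range term is what lets the guide's edges sharpen the upsampled depth; learned methods like the one in this abstract typically replace these hand-set Gaussian weights with a trained network.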
JMIR Form Res, January 2025
Big Data Convergence and Open Sharing System, Seoul National University, Seoul, Republic of Korea.
Background: The rapid proliferation of artificial intelligence (AI) requires new approaches for human-AI interfaces that differ from classic human-computer interfaces. In developing a system conducive to the analysis and use of health big data (HBD), the most crucial consideration is reflecting the empirical characteristics of users who have performed HBD analysis. Recently, human-centered design methodology, a field of user-centered design, has been expanded and is used to develop not only products but also technologies and services.
Sci Rep, January 2025
Department of Materials Science, Case Western Reserve University, Cleveland, 44106, USA.
Understanding subsurface temperature variations is crucial for assessing material degradation in underground structures. This study maps subsurface temperatures across the contiguous United States for depths from 50 to 3500 m, comparing linear interpolation, gradient boosting (LightGBM), neural networks, and a novel hybrid approach combining linear interpolation with LightGBM. Results reveal heterogeneous temperature patterns both horizontally and vertically.
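The hybrid approach in this abstract layers a learned residual model on top of a linear-interpolation baseline. A dependency-free sketch of that two-stage idea, with 1-D temperature-vs-depth data: the paper uses LightGBM for the residual stage, but here a nearest-measurement residual lookup stands in so the example runs without external libraries. All depths and temperatures are hypothetical illustration values, not data from the study:

```python
import numpy as np

# Hypothetical sparse measurements: (depth in m, temperature in deg C)
well_depths = np.array([50.0, 500.0, 1500.0, 3500.0])
well_temps = np.array([12.0, 25.0, 55.0, 110.0])

def baseline_linear(depth_q):
    """Stage 1: linear interpolation of temperature over depth."""
    return np.interp(depth_q, well_depths, well_temps)

# Stage 2: a residual model corrects the baseline where ground truth is known.
train_depths = np.array([200.0, 1000.0, 2500.0])
train_temps = np.array([18.0, 41.0, 80.0])  # hypothetical held-out measurements
train_resid = train_temps - baseline_linear(train_depths)

def hybrid_predict(depth_q):
    """Baseline plus residual correction from the nearest training depth
    (a stand-in for the paper's LightGBM residual model)."""
    depth_q = np.atleast_1d(depth_q)
    base = baseline_linear(depth_q)
    idx = np.abs(train_depths[:, None] - depth_q[None, :]).argmin(axis=0)
    return base + train_resid[idx]

pred = hybrid_predict(np.array([200.0, 3000.0]))
```

The design point the hybrid captures is that interpolation supplies a smooth physically plausible trend, while the residual learner absorbs systematic local deviations from it.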