IEEE J Biomed Health Inform
Published: October 2024
The self-supervised monocular depth estimation framework is well-suited for medical images that lack ground-truth depth, such as those from digestive endoscopes, facilitating navigation and 3D reconstruction in the gastrointestinal tract. However, this framework faces several limitations, including poor performance in low-texture environments, limited generalisation to real-world datasets, and unclear applicability in downstream tasks like visual servoing. To tackle these challenges, we propose MonoLoT, a self-supervised monocular depth estimation framework featuring two key innovations: point matching loss and batch image shuffle. Extensive ablation studies on two publicly available datasets, namely C3VD and SimCol, have shown that methods enabled by MonoLoT achieve substantial improvements, with accuracies of 0.944 on C3VD and 0.959 on SimCol, surpassing both depth-supervised and self-supervised baselines on C3VD. Qualitative evaluations on real-world endoscopic data underscore the generalisation capabilities of our methods, outperforming both depth-supervised and self-supervised baselines. To demonstrate the feasibility of using monocular depth estimation for visual servoing, we have successfully integrated our method into a proof-of-concept robotic platform, enabling real-time automatic intervention and control in digestive endoscopy. In summary, our method represents a significant advancement in monocular depth estimation for digestive endoscopy, overcoming key challenges and opening promising avenues for medical applications.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1109/JBHI.2024.3423791 | DOI Listing |
Sensors (Basel)
December 2024
Institute of Computer and Communication Engineering, Department of Electrical Engineering, National Cheng Kung University, Tainan 701, Taiwan.
Precision depth estimation plays a key role in many applications, including 3D scene reconstruction, virtual reality, autonomous driving and human-computer interaction. Through recent advancements in deep learning technologies, monocular depth estimation, with its simplicity, has surpassed the traditional stereo camera systems, bringing new possibilities in 3D sensing. In this paper, by using a single camera, we propose an end-to-end supervised monocular depth estimation autoencoder, which contains an encoder with a structure with a mixed convolution neural network and vision transformers and an effective adaptive fusion decoder to obtain high-precision depth maps.
View Article and Find Full Text PDFJ Neurosci
January 2025
Department of Psychology, 450 Jane Stanford Way, Stanford University, Stanford, CA, USA.
Immaturities exist at multiple levels of the developing human visual pathway, starting with immaturities in photon efficiency and spatial sampling in the retina and on through immaturities in early and later stages of cortical processing. Here we use Steady-State Visual Evoked Potentials (SSVEPs) and controlled visual stimuli to determine the degree to which sensitivity to horizontal retinal disparity is limited by the visibility of the monocular half-images, the ability to encode absolute disparity or the ability to encode relative disparity. Responses were recorded from male and female human participants at average ages of 5.
View Article and Find Full Text PDFJ Neurosci
January 2025
Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland.
Our visual system enables us to effortlessly navigate and recognize real-world visual environments. Functional magnetic resonance imaging (fMRI) studies suggest a network of scene-responsive cortical visual areas, but much less is known about the temporal order in which different scene properties are analysed by the human visual system. In this study, we selected a set of 36 full-colour natural scenes that varied in spatial structure and semantic content that our male and female human participants viewed both in 2D and 3D while we recorded magnetoencephalography (MEG) data.
View Article and Find Full Text PDFSensors (Basel)
December 2024
Electrical, Computer, and Biomedical Engineering, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada.
Autonomous technologies have revolutionized transportation, military operations, and space exploration, necessitating precise localization in environments where traditional GPS-based systems are unreliable or unavailable. While widespread for outdoor localization, GPS systems face limitations in obstructed environments such as dense urban areas, forests, and indoor spaces. Moreover, GPS reliance introduces vulnerabilities to signal disruptions, which can lead to significant operational failures.
View Article and Find Full Text PDFLight Sci Appl
January 2025
Innovation Center for Advanced Medical Imaging and Intelligent Medicine, Guangzhou Institute of Technology, Xidian University, Guangzhou, 510555, Guangdong, China.
A novel monocular depth-sensing camera based on meta-imaging sensor technology has been developed, offering more precise depth sensing with millimeter-level accuracy and enhanced robustness compared to conventional 2D and light-field cameras.
View Article and Find Full Text PDFEnter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!
© LitMetric 2025. All rights reserved.