Unsupervised monocular depth estimation plays a vital role in endoscopy-based minimally invasive surgery (MIS). However, it remains challenging because the distinctive imaging characteristics of endoscopy violate the photometric consistency assumption on which conventional methods rely. Unlike recent approaches that adopt an image pre-processing strategy, this paper introduces a solution based on intrinsic image decomposition (IID) theory. Specifically, we propose a novel end-to-end intrinsic-based unsupervised monocular depth learning framework that comprises an image intrinsic decomposition module and a synthesis reconstruction module. This framework integrates IID with unsupervised monocular depth estimation, and dedicated losses are designed to provide robust supervision for network training based on this integration. Notably, we rely on the favorable property of the albedo map produced by IID to circumvent the challenging image characteristics, instead of pre-processing the input frames. The proposed method is extensively validated on the SCARED and Hamlyn datasets, where it obtains better results than state-of-the-art techniques. In addition, its generalization ability and the effectiveness of the proposed components are validated. This method has the potential to improve the quality of 3D reconstruction in monocular endoscopy, thereby enhancing the accuracy and robustness of augmented reality navigation technology in MIS. Our code will be available at: https://github.com/bobo909/IID-SfmLearner.
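
As a rough illustration of why an albedo map can sidestep the photometric consistency problem, the sketch below uses the standard intrinsic image model, in which an image is the element-wise product of albedo and shading. It is a toy NumPy example under that assumption only, not the network or losses from the paper; the function names, array sizes, and shading patterns are all hypothetical.

```python
import numpy as np


def photometric_error(a, b):
    """Mean absolute intensity difference between two aligned images."""
    return float(np.mean(np.abs(a - b)))


def decomposition_error(image, albedo, shading):
    """How well albedo * shading reconstructs the image (intrinsic model)."""
    return float(np.mean(np.abs(image - albedo * shading)))


# Toy scene: one surface (fixed albedo) seen under two illumination fields,
# mimicking a light source that moves with the endoscope between frames.
rng = np.random.default_rng(0)
albedo = rng.uniform(0.2, 0.9, size=(8, 8))
shading_t = np.ones((8, 8))                           # evenly lit frame
shading_s = np.tile(np.linspace(0.4, 1.0, 8), (8, 1))  # strong falloff

frame_t = albedo * shading_t
frame_s = albedo * shading_s

# Raw intensities disagree even though the surface is identical,
# so a plain photometric loss would penalize a correct alignment.
raw_gap = photometric_error(frame_t, frame_s)

# With the (here, known) shading factored out, the albedo maps agree.
albedo_gap = photometric_error(frame_t / shading_t, frame_s / shading_s)

print(f"raw intensity gap: {raw_gap:.4f}")
print(f"albedo gap:        {albedo_gap:.2e}")
```

In this toy setting the shading is known exactly; in a learned framework the decomposition itself has to be estimated, which is where a reconstruction term such as `decomposition_error` would plausibly come in.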

Source
DOI: http://dx.doi.org/10.1109/JBHI.2024.3400804

Publication Analysis

Top Keywords

unsupervised monocular: 16
monocular depth: 16
depth estimation: 12
intrinsic-based unsupervised: 8
monocular: 5
image: 4
image intrinsic-based: 4
unsupervised: 4
depth: 4
estimation endoscopy: 4

Similar Publications

Which perceptual categories do observers experience during multistable perception?

bioRxiv

September 2024

Department of Psychology, Northeastern University, 360 Huntington Ave, Boston, Massachusetts, 02115, United States of America.

Multistable perceptual phenomena provide insights into the mind's dynamic states within a stable external environment, and the neural underpinnings of these changes in consciousness are often studied with binocular rivalry. Conventional methods for studying binocular rivalry suffer from biases and assumptions that limit their ability to describe the continuous nature of these perceptual transitions and to discover which percepts were experienced over time. In this study, we propose a novel way to avoid those shortcomings by combining a continuous psychophysical method that estimates introspection during binocular rivalry with machine learning clustering and transition probability analysis.

Image stitching is a traditional but challenging computer vision task. The goal is to stitch together multiple images with overlapping areas into a single, natural-looking, high-resolution image without ghosts or seams. This article aims to increase the field of view of gastroenteroscopy and reduce the missed detection rate.

Depth estimation from monocular endoscopy using simulation and image transfer approach.

Comput Biol Med

October 2024

Department of Health Sciences and Technology, GAIHST, Gachon University, Incheon, 21999, South Korea; Department of Biomedical Engineering, Gachon University, Seongnam, 13120, South Korea.

Obtaining accurate distance or depth information in endoscopy is crucial for the effective utilization of navigation systems. However, due to space constraints, incorporating depth cameras into endoscopic systems is often impractical. Our goal is to estimate depth images directly from endoscopic images using deep learning.

Article Synopsis
  • Existing panoramic layout estimation methods face challenges such as imprecise boundary recovery due to vertical compression and the need for extensive, time-consuming data annotations.
  • The proposed DOPNet introduces an orthogonal plane disentanglement approach to improve the clarity and accuracy of room layouts, consisting of three integrated modules that enhance the quality of output representations.
  • To address the data annotation issue, an unsupervised adaptation technique is presented, utilizing optimization strategies and a 1D cost volume method to leverage geometric consistency and enrich scene information from multiple perspectives, leading to superior performance in layout estimation tasks compared to current state-of-the-art models.
Article Synopsis
  • Recent research in unsupervised monocular depth estimation for intelligent vehicles has mostly focused on ground vehicles, overlooking its application to unmanned aerial vehicles (UAVs).
  • A new method is proposed to estimate absolute depth in flight scenes using a monocular camera, employing a geometry-based scale recovery algorithm for improved scale consistency.
  • Experiments show that this method, which is effective in various conditions and resistant to sensor noise, expands depth estimation capabilities for UAVs, providing valuable depth information for future applications.