Previous work has shown that adversarial learning can be used for unsupervised monocular depth and visual odometry (VO) estimation, in which the adversarial loss and the geometric image reconstruction loss serve as the main supervisory signals for training the whole unsupervised framework. However, the performance of the adversarial framework and of image reconstruction is usually limited by occlusions and by visual field changes between frames. This article proposes a masked generative adversarial network (GAN) for unsupervised monocular depth and ego-motion estimation. The MaskNet and the Boolean mask scheme are designed in this framework to eliminate the effects of occlusions and of visual field changes on the reconstruction loss and the adversarial loss, respectively. Furthermore, we enforce the scale consistency of our pose network with a new scale-consistency loss, so that the pose network can provide the full camera trajectory over a long monocular sequence. Extensive experiments on the KITTI data set show that each component proposed in this article contributes to the performance, and both our depth and trajectory predictions achieve competitive performance on the KITTI and Make3D data sets.
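
The abstract does not spell out how the Boolean mask enters the loss, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a per-pixel L1 photometric error and a mask that marks pixels whose projected coordinates stay inside the image; the function names `in_view_mask` and `masked_l1_loss` and the toy arrays are hypothetical.

```python
import numpy as np


def in_view_mask(proj_xy, height, width):
    """Boolean mask: True where projected (x, y) coordinates fall inside the image.

    proj_xy has shape (H, W, 2); this layout is an assumption of the sketch.
    """
    x, y = proj_xy[..., 0], proj_xy[..., 1]
    return (x >= 0) & (x <= width - 1) & (y >= 0) & (y <= height - 1)


def masked_l1_loss(target, reconstructed, mask):
    """Mean absolute error over valid pixels only; returns 0.0 if none are valid."""
    valid = mask.astype(bool)
    if not valid.any():
        return 0.0
    diff = np.abs(target - reconstructed)
    return float(diff[valid].mean())


# Toy 2x2 example: the bottom-right pixel projects outside the image (x = 5),
# so its large reconstruction error is excluded from the loss.
target = np.array([[1.0, 2.0], [3.0, 4.0]])
reconstructed = np.array([[1.5, 2.0], [3.0, 100.0]])
proj_xy = np.array([[[0.0, 0.0], [1.0, 0.0]],
                    [[0.0, 1.0], [5.0, 1.0]]])

mask = in_view_mask(proj_xy, height=2, width=2)
loss = masked_l1_loss(target, reconstructed, mask)
```

Averaging over valid pixels only, rather than over the whole image, keeps regions that leave the field of view from dominating the training signal. In this sketch that is the intended effect of the mask.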


Source: http://dx.doi.org/10.1109/TNNLS.2020.3044181


Similar Publications

Introduction to Artificial Intelligence (AI) and Machine Learning (ML) in Pathology & Medicine: Generative & Non-Generative AI Basics.

Mod Pathol

January 2025

Department of Pathology, University of Pittsburgh Medical Center, PA, USA; Computational Pathology and AI Center of Excellence (CPACE), University of Pittsburgh School of Medicine, Pittsburgh, PA, USA.

This manuscript serves as an introduction to a comprehensive seven-part review article series on artificial intelligence (AI) and machine learning (ML) and their current and future influence within pathology and medicine. This introductory review provides a comprehensive grasp of this fast-expanding realm and its potential to transform medical diagnosis, workflow, research, and education. Fundamental terminology employed in AI-ML is covered using an extensive dictionary.


Generative Adversarial Networks (GANs) have emerged as a powerful tool in artificial intelligence, particularly for unsupervised learning. This systematic review analyzes GAN applications in healthcare, focusing on image and signal-based studies across various clinical domains. Following Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines, we reviewed 72 relevant journal articles.


Hyperspectral remote sensing images obtained from cameras are characterized by high dimensionality and low quality, which makes them unfavorable for various analytics purposes. This is due to the presence of visible and invisible frequencies of the reflected light, which causes the image to reveal its spectral signatures poorly. Advances in visual communication have created the need for image super-resolution (SR), which recovers high-resolution images from low-resolution images.

Article Synopsis
  • Pneumonia is a significant health issue in children, often diagnosed with low-dose pediatric chest X-rays, which can miss cases due to bone interference in the images.
  • Existing deep learning methods for bone suppression in adult X-rays struggle with pediatric images due to a lack of labeled data, while dual-energy techniques are rarely used in pediatrics and traditional methods have limitations.
  • The study presents a novel method that automates the labeling of pediatric chest X-ray images to enhance bone suppression networks without needing specialized equipment or extensive training.

RECONSTRUCTING RETINAL VISUAL IMAGES FROM 3T FMRI DATA ENHANCED BY UNSUPERVISED LEARNING.

Proc IEEE Int Symp Biomed Imaging

May 2024

School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA.

Article Synopsis
  • The study focuses on using brain activity data from fMRI to reconstruct human visual inputs, aiming to better understand the visual system.
  • Despite advancements in deep learning for visual reconstruction, there is a need for high-quality, long-duration fMRI scans at 7-Tesla, which are currently scarce.
  • To address this, the authors propose a new framework that uses a Generative Adversarial Network (GAN) to create improved 3-Tesla fMRI data from unpaired datasets, successfully demonstrating enhanced image reconstruction capabilities.
