IEEE Trans Image Process
October 2024
Deep CNNs have achieved impressive improvements for night-time self-supervised depth estimation from a monocular image. However, the performance degrades considerably compared to day-time depth estimation due to significant domain gaps, low visibility, and varying illumination between day and night images. To address these challenges, we propose a novel night-time self-supervised monocular depth estimation framework with structure regularization.
IEEE Trans Neural Netw Learn Syst
August 2024
Dichotomous image segmentation (DIS) with rich fine-grained details within a single image is a challenging task. Despite the plausible results achieved by deep learning-based methods, most of them fail to segment generic objects when the boundary is cluttered with the background. In fact, the gradual decrease in feature map resolution during the encoding stage and misleading texture clues may be the main issues.
IEEE Trans Image Process
August 2024
Multi-view clustering aims to learn discriminative representations from multi-view data. Although existing methods show impressive performance by leveraging contrastive learning to tackle the representation gap between every two views, they share a common limitation: they do not perform semantic alignment from a global perspective, undermining the semantic patterns in multi-view data. This paper presents CSOT, namely Common Semantics via Optimal Transport, to boost contrastive multi-view clustering via semantic learning in a common space that integrates all views.
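The optimal-transport alignment at the heart of a method like CSOT is usually computed with Sinkhorn iterations. Below is a minimal numpy sketch of entropy-regularized optimal transport with uniform marginals; it is not the authors' implementation, and the function name, cost matrix, and parameters are illustrative assumptions.

```python
import numpy as np

def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport with uniform marginals.

    Returns a transport plan whose rows sum to 1/n and columns sum to 1/m,
    softly matching source items (e.g., view features) to target semantics.
    """
    n, m = cost.shape
    K = np.exp(-cost / eps)                 # Gibbs kernel from the cost
    r, c = np.ones(n) / n, np.ones(m) / m   # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):
        u = r / (K @ v)                     # scale rows toward marginal r
        v = c / (K.T @ u)                   # scale columns toward marginal c
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
plan = sinkhorn_plan(rng.random((5, 3)))
```

Because the last update rescales the columns, the column marginals are matched exactly and the rows converge to theirs; the regularization strength `eps` trades off sharpness of the plan against numerical stability.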
Nuclei detection is one of the most fundamental and challenging problems in histopathological image analysis, which can localize nuclei to provide effective computer-aided cancer diagnosis, treatment decision, and prognosis. The fully-supervised nuclei detector requires a large number of nuclei annotations on high-resolution digital images, which is time-consuming and needs human annotators with professional knowledge. In recent years, weakly-supervised learning has attracted significant attention in reducing the labeling burden.
Salient instance segmentation (SIS) is an emerging field that evolves from salient object detection (SOD), aiming at identifying individual salient instances using segmentation maps. Inspired by the success of dynamic convolutions in segmentation tasks, this article introduces a keypoints-based SIS network (KepSalinst). It employs multiple keypoints, that is, the center and several peripheral points of an instance, as effective geometrical guidance for dynamic convolutions.
IEEE Trans Image Process
August 2023
Due to the light absorption and scattering induced by the water medium, underwater images usually suffer from degradation problems such as low contrast, color distortion, and blurred details, which aggravate the difficulty of downstream underwater understanding tasks. Therefore, obtaining clear and visually pleasing images has become a widespread concern, giving rise to the task of underwater image enhancement (UIE). Among existing UIE methods, Generative Adversarial Network (GAN)-based methods perform well in visual aesthetics, while physical model-based methods have better scene adaptability.
IEEE Trans Image Process
May 2023
In recent years, various neural network architectures for computer vision have been devised, such as the visual transformer and multilayer perceptron (MLP). A transformer based on an attention mechanism can outperform a traditional convolutional neural network. Compared with the convolutional neural network and transformer, the MLP introduces less inductive bias and achieves stronger generalization.
IEEE Trans Neural Netw Learn Syst
July 2024
IEEE Trans Image Process
October 2022
Focusing on the issue of how to effectively capture and utilize cross-modality information in RGB-D salient object detection (SOD) task, we present a convolutional neural network (CNN) model, named CIR-Net, based on the novel cross-modality interaction and refinement. For the cross-modality interaction, 1) a progressive attention guided integration unit is proposed to sufficiently integrate RGB-D feature representations in the encoder stage, and 2) a convergence aggregation structure is proposed, which flows the RGB and depth decoding features into the corresponding RGB-D decoding streams via an importance gated fusion unit in the decoder stage. For the cross-modality refinement, we insert a refinement middleware structure between the encoder and the decoder, in which the RGB, depth, and RGB-D encoder features are further refined by successively using a self-modality attention refinement unit and a cross-modality weighting refinement unit.
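The importance-gated fusion idea described above can be illustrated with a minimal numpy sketch: a sigmoid gate computed from both modalities convexly blends the RGB and depth features per element. This is an assumption-laden toy, not CIR-Net's actual unit; the weights and shapes are made up for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(rgb_feat, depth_feat, w_rgb=1.0, w_depth=1.0, bias=0.0):
    """Importance-gated fusion sketch: the gate decides, per element, how much
    of the RGB feature versus the depth feature flows into the fused output."""
    gate = sigmoid(w_rgb * rgb_feat + w_depth * depth_feat + bias)
    return gate * rgb_feat + (1.0 - gate) * depth_feat

rng = np.random.default_rng(1)
rgb = rng.standard_normal((4, 8, 8))     # toy (channels, H, W) features
depth = rng.standard_normal((4, 8, 8))
fused = gated_fusion(rgb, depth)
```

Since the gate produces a convex combination, every fused value stays between the corresponding RGB and depth values, which is what makes the gate a soft importance selector rather than an additive mix.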
IEEE Trans Cybern
March 2023
The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images. Therefore, how to effectively extract interimage correspondence is crucial for the CoSOD task. In this article, we propose a global-and-local collaborative learning (GLNet) architecture, which includes a global correspondence modeling (GCM) and a local correspondence modeling (LCM) to capture the comprehensive interimage corresponding relationship among different images from the global and local perspectives.
IEEE J Biomed Health Inform
August 2022
Clinically, proper polyp localization in endoscopy images plays a vital role in the follow-up treatment (e.g., surgical planning).
Optical remote sensing images (RSIs) have been widely used in many applications, and one of the interesting issues about optical RSIs is salient object detection (SOD). However, due to diverse object types, various object scales, numerous object orientations, and cluttered backgrounds in optical RSIs, the performance of existing SOD models often degrades considerably. Meanwhile, cutting-edge SOD models targeting optical RSIs typically focus on suppressing cluttered backgrounds while neglecting edge information, which is crucial for obtaining precise saliency maps.
IEEE Trans Image Process
December 2021
RGB-D saliency detection has been receiving increasing attention in recent years. Many efforts have been devoted to this area, and most of them try to integrate the multi-modal information.
IEEE Trans Image Process
May 2021
Underwater images suffer from color casts and low contrast due to wavelength- and distance-dependent attenuation and scattering. To solve these two degradation issues, we present an underwater image enhancement network via medium transmission-guided multi-color space embedding, called Ucolor. Concretely, we first propose a multi-color space encoder network, which enriches the diversity of feature representations by incorporating the characteristics of different color spaces into a unified structure.
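The multi-color space idea can be sketched in a few lines: represent each pixel in more than one color space and concatenate the representations before encoding. The snippet below, using Python's standard `colorsys` module, is a toy stand-in for Ucolor's multi-color-space input, not its actual encoder; the function name is an assumption.

```python
import colorsys

def multi_color_embed(pixels):
    """Concatenate RGB and HSV per pixel (all values in [0, 1]), enriching the
    per-pixel representation with a second color space."""
    out = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        out.append((r, g, b, h, s, v))
    return out

emb = multi_color_embed([(0.2, 0.5, 0.8), (1.0, 0.0, 0.0)])
```

HSV decorrelates chromaticity from intensity, so a cast that shifts all RGB channels together perturbs V more than H, which is one intuition for why mixing color spaces helps with color-distorted inputs.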
IEEE Trans Image Process
December 2020
Despite the remarkable advances in visual saliency analysis for natural scene images (NSIs), salient object detection (SOD) for optical remote sensing images (RSIs) still remains an open and challenging problem. In this paper, we propose an end-to-end Dense Attention Fluid Network (DAFNet) for SOD in optical RSIs. A Global Context-aware Attention (GCA) module is proposed to adaptively capture long-range semantic context relationships, and is further embedded in a Dense Attention Fluid (DAF) structure that enables shallow attention cues to flow into deep layers to guide the generation of high-level feature attention maps.
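Long-range context modules of this kind typically follow the non-local attention pattern: every spatial position attends to every other one. Here is a minimal numpy sketch of that pattern over a (C, H, W) feature map; it is an illustrative stand-in, not the GCA module itself.

```python
import numpy as np

def global_context_attention(feat):
    """Non-local-style attention sketch: each spatial position aggregates
    features from all positions, weighted by pairwise affinity."""
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                    # (C, N) spatial tokens
    scores = x.T @ x / np.sqrt(c)                 # (N, N) pairwise affinities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax: rows sum to 1
    out = (x @ attn.T).reshape(c, h, w)           # context-aggregated features
    return out, attn

rng = np.random.default_rng(2)
out, attn = global_context_attention(rng.standard_normal((4, 5, 5)))
```

Because the affinity matrix is N x N in the number of spatial positions, this pattern is usually applied on downsampled feature maps in practice.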
IEEE Trans Image Process
August 2021
There are two main issues in RGB-D salient object detection: (1) how to effectively integrate the complementarity of the cross-modal RGB-D data; (2) how to prevent contamination from unreliable depth maps. In fact, these two problems are linked and intertwined, but previous methods tend to focus only on the first and ignore depth map quality, which may cause the model to fall into a sub-optimal state. In this paper, we address these two issues synergistically in a holistic model, and propose a novel network named DPANet to explicitly model the potentiality of the depth map and effectively integrate the cross-modal complementarity.
IEEE Trans Image Process
July 2020
The ability to synthesize multi-modality data is highly desirable for many computer-aided medical applications, e.g. clinical diagnosis and neuroscience research, since rich imaging cohorts offer diverse and complementary information unraveling human tissues.
IEEE Trans Cybern
January 2021
Salient object detection from RGB-D images is an important yet challenging vision task, which aims at detecting the most distinctive objects in a scene by combining color information and depth constraints. Unlike prior fusion manners, we propose an attention steered interweave fusion network (ASIF-Net) to detect salient objects, which progressively integrates cross-modal and cross-level complementarity from the RGB image and corresponding depth map via steering of an attention mechanism. Specifically, the complementary features from RGB-D images are jointly extracted and hierarchically fused in a dense and interweaved manner.
IEEE Trans Image Process
November 2019
Underwater image enhancement has been attracting much attention due to its significance in marine engineering and aquatic robotics. Numerous underwater image enhancement algorithms have been proposed in the last few years. However, these algorithms are mainly evaluated using either synthetic datasets or a few selected real-world images.
Depth information has been demonstrated to be useful for saliency detection. However, the existing methods for RGBD saliency detection mainly focus on designing straightforward and comprehensive models, while ignoring the transferable ability of the existing RGB saliency detection models. In this article, we propose a novel depth-guided transformation model (DTM) going from RGB saliency to RGBD saliency.
IEEE Trans Image Process
October 2019
Video saliency detection aims to continuously discover the motion-related salient objects from the video sequences. Since it needs to consider the spatial and temporal constraints jointly, video saliency detection is more challenging than image saliency detection. In this paper, we propose a new method to detect the salient objects in video based on sparse reconstruction and propagation.
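The reconstruction idea behind such methods can be illustrated simply: build a dictionary of background templates (e.g., boundary regions), reconstruct every feature from it, and treat large reconstruction errors as salient. The numpy sketch below uses plain least squares as a stand-in for the sparse coding in the paper; the names and toy data are assumptions.

```python
import numpy as np

def reconstruction_saliency(features, bg_dict):
    """Saliency-by-reconstruction sketch: features that the background
    dictionary reconstructs poorly receive high saliency scores."""
    coef, *_ = np.linalg.lstsq(bg_dict, features.T, rcond=None)
    residual = features.T - bg_dict @ coef
    return np.linalg.norm(residual, axis=0)   # one error per feature vector

# Background dictionary spanning a 2-D subspace of R^4.
bg = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [0.0, 0.0],
               [0.0, 0.0]])
feats = np.array([[0.5, 0.5, 0.0, 0.0],   # lies in the background subspace
                  [0.0, 0.0, 1.0, 1.0]])  # orthogonal to it -> salient
err = reconstruction_saliency(feats, bg)
```

The first feature is perfectly reconstructed (error near zero), while the second cannot be expressed by the background templates and scores high, which is the core signal that such methods then propagate spatially and temporally.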
IEEE Trans Image Process
December 2018
Rapid development of affordable and portable consumer depth cameras facilitates the use of depth information in many computer vision tasks such as intelligent vehicles and 3D reconstruction. However, depth maps captured by low-cost depth sensors are often of limited quality.
As a newly emerging and significant topic in the computer vision community, co-saliency detection aims at discovering the common salient objects in multiple related images. Existing methods often generate the co-saliency map through a direct forward pipeline based on designed cues or initialization, but lack a refinement-cycle scheme. Moreover, they mainly focus on RGB images and ignore the depth information available in RGBD images.
IEEE Trans Image Process
February 2018
Co-saliency detection aims at extracting the common salient regions from an image group containing two or more relevant images. It is a newly emerging topic in the computer vision community. Different from most existing co-saliency methods, which focus on RGB images, this paper proposes a novel co-saliency detection model for RGBD images, which utilizes the depth information to enhance the identification of co-saliency.
IEEE Trans Image Process
December 2016
Images captured under water are usually degraded due to the effects of absorption and scattering. Degraded underwater images show some limitations when they are used for display and analysis. For example, underwater images with low contrast and color cast decrease the accuracy rate of underwater object detection and marine biology recognition.
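A classical first step against the color cast described above is gray-world white balance: scale each channel so its mean matches the global mean. The numpy sketch below is a generic baseline for illustration, not the method of the paper; the synthetic cast is an assumption.

```python
import numpy as np

def gray_world_balance(img):
    """Gray-world white balance sketch for an (H, W, 3) image in [0, 1]:
    scale each channel so its mean matches the global mean, countering the
    blue-green cast typical of underwater scenes."""
    means = img.mean(axis=(0, 1))       # per-channel means
    gain = means.mean() / means         # push every channel toward gray
    return np.clip(img * gain, 0.0, 1.0)

rng = np.random.default_rng(3)
raw = rng.random((16, 16, 3)) * np.array([0.3, 0.8, 0.9])  # blue-green cast
balanced = gray_world_balance(raw)
```

After balancing, the three channel means coincide, removing the global cast; contrast stretching or dehazing-style corrections are then typically applied on top.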