RGB-Thermal Salient Object Detection (RGB-T SOD) aims to pinpoint prominent objects within aligned pairs of visible and thermal infrared images. A key challenge lies in bridging the inherent disparities between RGB and Thermal modalities for effective saliency map prediction. Traditional encoder-decoder architectures, while designed for cross-modality feature interactions, may not adequately account for robustness to noise from defective modalities, leading to suboptimal performance in complex scenarios. Inspired by hierarchical human visual systems, we propose ConTriNet, a robust Confluent Triple-Flow Network employing a "Divide-and-Conquer" strategy. This framework utilizes a unified encoder with specialized decoders, each addressing a different subtask of exploring modality-specific and modality-complementary information for RGB-T SOD, thereby enhancing the final saliency map prediction. Specifically, ConTriNet comprises three flows: two modality-specific flows explore cues from the RGB and Thermal modalities, and a third modality-complementary flow integrates cues from both. ConTriNet presents several notable advantages. It incorporates a Modality-induced Feature Modulator (MFM) in the modality-shared union encoder to minimize inter-modality discrepancies and mitigate the impact of defective samples. Additionally, a foundational Residual Atrous Spatial Pyramid Module (RASPM) in the separated flows enlarges the receptive field, allowing for the capture of multi-scale contextual information. Furthermore, a Modality-aware Dynamic Aggregation Module (MDAM) in the modality-complementary flow dynamically aggregates saliency-related cues from both modality-specific flows. Leveraging the proposed parallel triple-flow framework, we further refine saliency maps derived from different flows through a flow-cooperative fusion strategy, yielding a high-quality, full-resolution saliency map for the final prediction.
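The abstract does not give RASPM's exact configuration, but the receptive-field enlargement it attributes to atrous (dilated) convolutions can be illustrated with simple arithmetic. The kernel sizes and dilation rates below are hypothetical, not taken from the paper:

```python
def receptive_field(layers):
    """Effective receptive field of a stack of (kernel, dilation) conv
    layers, assuming stride 1 throughout."""
    rf = 1
    for kernel, dilation in layers:
        rf += (kernel - 1) * dilation  # each layer adds (k-1)*d pixels
    return rf

plain = [(3, 1)] * 4                       # four plain 3x3 convolutions
atrous = [(3, 1), (3, 2), (3, 4), (3, 8)]  # same depth, growing dilation

print(receptive_field(plain))   # 9
print(receptive_field(atrous))  # 31
```

At equal depth and parameter count, growing dilation rates more than triple the receptive field here, which is the intuition behind using an atrous pyramid to capture multi-scale context.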
To evaluate the robustness and stability of our approach, we collect a comprehensive RGB-T SOD benchmark, VT-IMAG, covering various real-world challenging scenarios. Extensive experiments on public benchmarks and our VT-IMAG dataset demonstrate that ConTriNet consistently outperforms state-of-the-art competitors in both common and challenging scenarios, even when dealing with incomplete modality data. The code and VT-IMAG will be available at: https://cser-tang-hao.github.io/contrinet.html.
DOI: http://dx.doi.org/10.1109/TPAMI.2024.3511621
IEEE Trans Pattern Anal Mach Intell
December 2024
Sensors (Basel)
December 2024
School of Mechanical Engineering & Automation, Northeastern University, Shenyang 110819, China.
RGB-T salient object detection (SOD) has received considerable attention in the field of computer vision. Although existing methods have achieved notable detection performance in certain scenarios, challenges remain. Many methods fail to fully exploit high-frequency and low-frequency features when exchanging information across features of different scales, which limits detection performance.
Front Neurorobot
December 2024
Department of Information Engineering, Shanghai Maritime University, Shanghai, China.
Introduction: RGB-T Salient Object Detection (SOD) aims to accurately segment salient regions in both visible light and thermal infrared images. However, many existing methods overlook the critical complementarity between these modalities, which can enhance detection accuracy.
Methods: We propose the Edge-Guided Feature Fusion Network (EGFF-Net), which consists of cross-modal feature extraction, edge-guided feature fusion, and salience map prediction.
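The abstract does not detail how EGFF-Net's edge guidance is implemented; as a minimal sketch of the general idea, an edge map can reweight a fused feature map so that responses near object boundaries are emphasized. The gradient-magnitude edge extractor and the `alpha` blending weight below are illustrative stand-ins, not the paper's method:

```python
import numpy as np

def edge_map(gray):
    """Simple gradient-magnitude edge map (a stand-in for a learned edge branch)."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    return mag / (mag.max() + 1e-8)  # normalize to [0, 1]

def edge_guided_fuse(rgb_feat, thermal_feat, edges, alpha=0.5):
    """Fuse two single-channel feature maps, boosting responses near edges."""
    fused = alpha * rgb_feat + (1 - alpha) * thermal_feat
    return fused * (1.0 + edges)  # emphasize edge regions
```

For example, on a synthetic image with a vertical step edge, `edge_map` peaks at the boundary columns, so the fused features there receive up to twice the weight of flat regions.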
Neural Netw
October 2024
School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China. Electronic address:
Low-light conditions pose significant challenges to vision tasks, such as salient object detection (SOD), due to insufficient photons. Light-insensitive RGB-T SOD models mitigate the above problems to some extent, but they are limited in performance as they only focus on spatial feature fusion while ignoring the frequency discrepancy. To this end, we propose an RGB-T SOD model by mining spatial-frequency cues, called SFMNet, for low-light scenes.
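SFMNet's internals are not specified in the abstract, but the "frequency discrepancy" it mines can be sketched with a standard FFT-based decomposition: a low-pass mask in the frequency domain separates an image into low- and high-frequency components that sum back to the original. The cutoff `radius` is an illustrative hyperparameter, not a value from the paper:

```python
import numpy as np

def split_frequency(img, radius=4):
    """Split a 2D image into low- and high-frequency components using a
    circular low-pass mask in the shifted FFT domain."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.fft.ifft2(np.fft.ifftshift(f * mask)).real
    high = img - low  # residual carries the high-frequency detail
    return low, high
```

By construction the two components reconstruct the input exactly, and a constant image has no high-frequency residual, which makes the split easy to sanity-check.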
Entropy (Basel)
January 2024
School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China.
RGB-T salient object detection (SOD) has made significant progress in recent years. However, most existing works are based on heavy models, which are not applicable to mobile devices. Additionally, there is still room for improvement in the design of cross-modal feature fusion and cross-level feature fusion.