AI Article Synopsis

  • Salient object detection (SOD) focuses on identifying important areas in images, and RGB-Thermal SOD enhances this by using both RGB and thermal data, but current methods often fall short because they ignore how isolated boundary pixels interact with other, more confident pixels.
  • The proposed Position-Aware Relation Learning Network (PRLNet) improves performance by learning pixel relationships through a signed distance map auxiliary module (SDMAM) and a feature refinement approach with direction fields (FRDF), which together enhance the separation and compactness of pixel features.
  • PRLNet has been tested on three public RGB-T SOD datasets, showing superior results compared to existing methods, and is designed to work with different backbone networks in a plug-and-play manner, with ablation studies and visualizations supporting its validity and interpretability.

Article Abstract

Salient object detection (SOD) is an important task in computer vision that aims to identify visually conspicuous regions in images. RGB-Thermal SOD combines two spectra to achieve better segmentation results. However, most existing methods for RGB-T SOD use boundary maps to learn sharp boundaries, which leads to sub-optimal performance because it ignores the interactions between isolated boundary pixels and other confident pixels. To address this issue, we propose a novel position-aware relation learning network (PRLNet) for RGB-T SOD. PRLNet explores the distance and direction relationships between pixels by designing an auxiliary task and optimizing the feature structure to strengthen intra-class compactness and inter-class separation. Our method consists of two main components: a signed distance map auxiliary module (SDMAM) and a feature refinement approach with direction fields (FRDF). SDMAM improves the encoder feature representation by considering the distance relationship between foreground-background pixels and boundaries, which increases the inter-class separation between foreground and background features. FRDF rectifies the features of boundary neighborhoods by exploiting the features inside salient objects. It utilizes the direction relationship of object pixels to enhance the intra-class compactness of salient features. In addition, we construct a transformer-based decoder to decode the multispectral feature representation. Experimental results on three public RGB-T SOD datasets demonstrate that our proposed method not only outperforms state-of-the-art methods but can also be integrated with different backbone networks in a plug-and-play manner. Ablation studies and visualizations further confirm the validity and interpretability of our method.
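The two pixel-relation signals the abstract describes can be made concrete. Below is a minimal sketch, not the authors' implementation: given a binary ground-truth mask, it derives a signed distance map (the kind of quantity SDMAM concerns) and a dense direction field (the kind of relation FRDF exploits). It assumes numpy and scipy; the function names and the sign convention (positive inside the object) are illustrative choices.

```python
import numpy as np
from scipy import ndimage

def signed_distance_map(mask: np.ndarray) -> np.ndarray:
    """Signed distance to the object boundary: positive inside the salient
    object, negative outside, zero on the boundary (one common convention)."""
    inside = ndimage.distance_transform_edt(mask)       # distance to background
    outside = ndimage.distance_transform_edt(1 - mask)  # distance to foreground
    return inside - outside

def direction_field(mask: np.ndarray) -> np.ndarray:
    """Per foreground pixel, the unit vector from its nearest background
    pixel toward it, i.e. pointing into the object interior."""
    # return_indices gives, for every pixel, the coordinates of its nearest
    # zero (background) pixel.
    _, idx = ndimage.distance_transform_edt(mask, return_indices=True)
    ys, xs = np.indices(mask.shape)
    vec = np.stack([ys - idx[0], xs - idx[1]]).astype(np.float32)
    norm = np.linalg.norm(vec, axis=0, keepdims=True)
    return np.divide(vec, norm, out=np.zeros_like(vec), where=norm > 0)

# Toy usage: a square "salient object" in a 64x64 frame.
mask = np.zeros((64, 64), dtype=np.uint8)
mask[20:44, 20:44] = 1
sdm = signed_distance_map(mask)   # negative far outside, positive deep inside
field = direction_field(mask)     # shape (2, 64, 64), zero vectors on background
```

In the paper these quantities shape the learned features (SDMAM as auxiliary supervision on encoder features, FRDF as a refinement of boundary-neighborhood features); the exact losses and modules follow the paper and are not reproduced here.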


Source
http://dx.doi.org/10.1109/TIP.2023.3270801

Publication Analysis

Top Keywords

  • rgb-t sod (12)
  • position-aware relation (8)
  • relation learning (8)
  • salient object (8)
  • object detection (8)
  • intra-class compactness (8)
  • inter-class separation (8)
  • feature representation (8)
  • sod (5)
  • pixels (5)

Similar Publications

Wavelet-Driven Multi-Band Feature Fusion for RGB-T Salient Object Detection.

Sensors (Basel)

December 2024

School of Mechanical Engineering & Automation, Northeastern University, Shenyang 110819, China.

RGB-T salient object detection (SOD) has received considerable attention in the field of computer vision. Although existing methods achieve notable detection performance in certain scenarios, challenges remain: many fail to fully exploit high-frequency and low-frequency features when information is exchanged across feature scales, which limits detection performance.
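As a concrete picture of the low/high-frequency split such wavelet-driven methods build on, here is a minimal sketch using the PyWavelets package; it is not this paper's fusion design, and the single-level 'haar' transform is an assumed simplification.

```python
import numpy as np
import pywt  # PyWavelets

img = np.random.rand(256, 256).astype(np.float32)  # stand-in for one modality
# Single-level 2-D DWT: cA is the low-frequency band; cH/cV/cD hold
# horizontal/vertical/diagonal high-frequency detail.
cA, (cH, cV, cD) = pywt.dwt2(img, 'haar')
# A multi-band fusion method could combine RGB and thermal coefficients
# per sub-band here, then invert the transform:
recon = pywt.idwt2((cA, (cH, cV, cD)), 'haar')
```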


Edge-guided feature fusion network for RGB-T salient object detection.

Front Neurorobot

December 2024

Department of Information Engineering, Shanghai Maritime University, Shanghai, China.

Introduction: RGB-T Salient Object Detection (SOD) aims to accurately segment salient regions in both visible light and thermal infrared images. However, many existing methods overlook the critical complementarity between these modalities, even though exploiting it can enhance detection accuracy.

Methods: We propose the Edge-Guided Feature Fusion Network (EGFF-Net), which consists of cross-modal feature extraction, edge-guided feature fusion, and salience map prediction.
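One plausible reading of "edge-guided fusion" is an edge-derived gate that decides how the two modalities are mixed; the sketch below uses a Sobel edge magnitude for this, which is an assumption on our part, not EGFF-Net's actual module.

```python
import numpy as np
from scipy import ndimage

def edge_guided_fuse(rgb_feat: np.ndarray, th_feat: np.ndarray) -> np.ndarray:
    """Blend two single-channel feature maps, trusting the thermal stream
    more near RGB edges (an illustrative gating choice)."""
    gx = ndimage.sobel(rgb_feat, axis=0)
    gy = ndimage.sobel(rgb_feat, axis=1)
    gate = np.hypot(gx, gy)
    gate = gate / (gate.max() + 1e-8)   # normalize edge magnitude to [0, 1]
    return (1 - gate) * rgb_feat + gate * th_feat
```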


Salient object detection in low-light RGB-T scene via spatial-frequency cues mining.

Neural Netw

October 2024

School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China.

Low-light conditions pose significant challenges to vision tasks, such as salient object detection (SOD), due to insufficient photons. Light-insensitive RGB-T SOD models mitigate the above problems to some extent, but they are limited in performance as they only focus on spatial feature fusion while ignoring the frequency discrepancy. To this end, we propose an RGB-T SOD model by mining spatial-frequency cues, called SFMNet, for low-light scenes.
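To illustrate what "frequency cues" can mean here, the following sketch splits an image into low- and high-frequency parts with a centered 2-D FFT mask; the radial cutoff and the split itself are illustrative assumptions, not SFMNet's design.

```python
import numpy as np

def frequency_split(img: np.ndarray, radius: int = 16):
    """Low-pass/high-pass split via a circular mask in the shifted spectrum."""
    f = np.fft.fftshift(np.fft.fft2(img))  # move DC to the spectrum center
    h, w = img.shape
    ys, xs = np.indices((h, w))
    low_mask = (ys - h // 2) ** 2 + (xs - w // 2) ** 2 <= radius ** 2
    low = np.fft.ifft2(np.fft.ifftshift(f * low_mask)).real
    return low, img - low  # (low-frequency part, high-frequency residual)
```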


RGB-T salient object detection (SOD) has made significant progress in recent years. However, most existing works are based on heavy models, which are not applicable to mobile devices. Additionally, there is still room for improvement in the design of cross-modal feature fusion and cross-level feature fusion.


RGB-D salient object detection (SOD) has gained tremendous attention in recent years. In particular, transformer has been employed and shown great potential. However, existing transformer models usually overlook the vital edge information, which is a major issue restricting the further improvement of SOD accuracy.

