Swin Transformer-Based Edge Guidance Network for RGB-D Salient Object Detection.

Sensors (Basel)

Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China.

Published: October 2023

Salient object detection (SOD), which identifies the most distinctive object in a given scene, plays an important role in computer vision tasks. Most existing RGB-D SOD methods employ a CNN-based network as the backbone to extract features from RGB and depth images; however, the inherent locality of CNNs limits their performance. To tackle this issue, we propose a novel Swin Transformer-based edge guidance network (SwinEGNet) for RGB-D SOD, in which the Swin Transformer serves as a powerful feature extractor to capture the global context, and an edge-guided cross-modal interaction module effectively enhances and fuses features. Specifically, we employ the Swin Transformer as the backbone to extract features from RGB images and depth maps. We then introduce an edge extraction module (EEM) to extract edge features and a depth enhancement module (DEM) to enhance depth features, and a cross-modal interaction module (CIM) integrates cross-modal features from global and local contexts. Finally, a cascaded decoder refines the prediction map in a coarse-to-fine manner. Extensive experiments demonstrate that SwinEGNet achieves the best performance on the LFSD, NLPR, DES, and NJU2K datasets and comparable performance on the STEREO dataset against 14 state-of-the-art methods. Compared to SwinNet, our model achieves better performance with only 88.4% of the parameters and 77.2% of the FLOPs. Our code will be publicly available.
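
The abstract describes the overall data flow but not the module internals. Below is a minimal PyTorch sketch of that flow (two Swin backbones, an edge branch, per-stage depth enhancement and cross-modal fusion, and a cascaded coarse-to-fine decoder). The EEM/DEM/CIM stand-ins, the timm model name, and the use of features_only are illustrative assumptions, not the authors' implementation; it assumes a recent timm release where Swin backbones support features_only and return NHWC feature maps.

# Minimal sketch of the two-stream pipeline described in the abstract (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F
import timm  # assumption: recent timm with features_only support for Swin

class SwinEGNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Two Swin Transformer backbones: one for RGB, one for the depth map.
        self.rgb_backbone = timm.create_model(
            "swin_base_patch4_window7_224", pretrained=False, features_only=True)
        self.depth_backbone = timm.create_model(
            "swin_base_patch4_window7_224", pretrained=False, features_only=True, in_chans=1)
        chs = self.rgb_backbone.feature_info.channels()  # per-stage channel widths
        # Placeholder stand-ins for EEM / DEM / CIM; 1x1 convs keep the sketch runnable.
        self.eem = nn.Conv2d(chs[0], 1, 1)                               # edge map from shallow RGB features
        self.dem = nn.ModuleList(nn.Conv2d(c, c, 1) for c in chs)        # depth enhancement per stage
        self.cim = nn.ModuleList(nn.Conv2d(2 * c, c, 1) for c in chs)    # cross-modal fusion per stage
        self.heads = nn.ModuleList(nn.Conv2d(c, 1, 1) for c in chs)      # coarse-to-fine predictions

    def forward(self, rgb, depth):
        rgb_feats = self.rgb_backbone(rgb)       # multi-scale RGB features
        dep_feats = self.depth_backbone(depth)   # multi-scale depth features
        # timm Swin feature maps are NHWC; convert to NCHW for the conv layers.
        rgb_feats = [f.permute(0, 3, 1, 2) for f in rgb_feats]
        dep_feats = [f.permute(0, 3, 1, 2) for f in dep_feats]
        edge = torch.sigmoid(self.eem(rgb_feats[0]))          # edge guidance from the shallowest stage
        preds, prev = [], None
        for i in reversed(range(len(rgb_feats))):              # cascaded decoding, coarse to fine
            d = self.dem[i](dep_feats[i])
            fused = self.cim[i](torch.cat([rgb_feats[i], d], dim=1))
            if i == 0:
                fused = fused * (1 + edge)                      # edge-guided enhancement at the finest stage
            if prev is not None:
                prev = F.interpolate(prev, size=fused.shape[-2:], mode="bilinear", align_corners=False)
                fused = fused + prev
            prev = fused
            preds.append(self.heads[i](fused))
        return preds[-1], edge  # finest saliency prediction plus the edge map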

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10650861
DOI: http://dx.doi.org/10.3390/s23218802

Publication Analysis

Top Keywords

swin transformer-based (8); transformer-based edge (8); edge guidance (8); guidance network (8); salient object (8); object detection (8); rgb-d sod (8); cnn-based network (8); backbone extract (8); extract features (8)

Similar Publications

Optimizing Transformer-Based Network via Advanced Decoder Design for Medical Image Segmentation.

Biomed Phys Eng Express

January 2025

Shandong University, No. 72 Binhai Road, Jimo, Qingdao 266200, Shandong Province, China.

U-Net is widely used in medical image segmentation due to its simple and flexible architecture design. To address the challenges of scale and complexity in medical tasks, several variants of U-Net have been proposed. In particular, methods based on Vision Transformer (ViT), represented by Swin UNETR, have gained widespread attention in recent years.

Background: Ultrasound imaging is pivotal for point-of-care, non-invasive diagnosis of musculoskeletal (MSK) injuries. Notably, MSK ultrasound demands a higher level of operator expertise than general ultrasound procedures, necessitating thorough checks on image quality and precise categorization of each image. This need for skilled assessment highlights the importance of developing supportive tools for quality control and categorization in clinical settings.

Abdominal synthetic CT generation for MR-only radiotherapy using structure-conserving loss and transformer-based cycle-GAN.

Front Oncol

January 2025

Department of Radiation Oncology, Yonsei Cancer Center, Heavy Ion Therapy Research Institute, Yonsei University College of Medicine, Seoul, Republic of Korea.

Purpose: Recent deep learning-based synthetic computed tomography (sCT) generation from magnetic resonance (MR) images has shown promising results. However, generating sCT for the abdominal region poses challenges due to patient motion, including respiration and peristalsis. To address these challenges, this study investigated an unsupervised learning approach using a transformer-based cycle-GAN with structure-preserving loss for abdominal cancer patients.
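
The snippet names the ingredients (a cycle-GAN plus a structure-conserving loss) but not the loss form. One common, illustrative choice is a gradient-consistency term between the input MR and the generated sCT; the sketch below shows such a term added to a standard cycle-GAN generator objective. The function names, weights, and loss form are assumptions for illustration, not the paper's method.

# Hypothetical structure-preserving term for a cycle-GAN generator objective.
import torch
import torch.nn.functional as F

def gradient_maps(x):
    # Finite-difference image gradients along width and height (NCHW tensors).
    dx = x[..., :, 1:] - x[..., :, :-1]
    dy = x[..., 1:, :] - x[..., :-1, :]
    return dx, dy

def structure_loss(mr, sct):
    # Penalize differences between the gradient structure of the MR input and
    # the synthetic CT, encouraging anatomical edges to be preserved.
    mr_dx, mr_dy = gradient_maps(mr)
    sct_dx, sct_dy = gradient_maps(sct)
    return F.l1_loss(sct_dx, mr_dx) + F.l1_loss(sct_dy, mr_dy)

def generator_loss(mr, sct, mr_recon, disc_score, lambda_cyc=10.0, lambda_struct=1.0):
    adv = F.mse_loss(disc_score, torch.ones_like(disc_score))  # LSGAN-style adversarial term
    cyc = F.l1_loss(mr_recon, mr)                              # cycle-consistency term
    struct = structure_loss(mr, sct)                           # structure-preserving term
    return adv + lambda_cyc * cyc + lambda_struct * struct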

Transformer-based neural speech decoding from surface and depth electrode signals.

J Neural Eng

January 2025

Electrical and Computer Engineering Department, New York University, 370 Jay Street, Brooklyn, NY 11201, United States of America.

This study investigates speech decoding from neural signals captured by intracranial electrodes. Most prior works can only work with electrodes on a 2D grid (i.e.

Megavoltage computed tomography (MVCT) plays a crucial role in patient positioning and dose reconstruction during tomotherapy. However, due to the limited scan field of view (sFOV), the entire cross-section of certain patients may not be fully covered, resulting in projection data truncation. Truncation artifacts in MVCT can compromise registration accuracy with the planned kilovoltage computed tomography (KVCT) and hinder subsequent MVCT-based adaptive planning.
