A dual-stream feature decomposition network with weight transformation for multi-modality image fusion.

Sci Rep

School of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou, 450001, China.

Published: March 2025

As an image enhancement technology, multi-modal image fusion aims to retain the salient information from multi-source image pairs in a single image that contains complementary features and facilitates downstream visual tasks. However, dual-stream methods built on convolutional neural network (CNN) backbones mostly have limited receptive fields, methods built on Transformers are time-consuming, and both neglect cross-domain information. This study proposes an image fusion model for multi-modal images, covering infrared-visible image pairs and multi-source medical images. The model combines the strengths of Transformers and CNNs to model different feature types, addressing short- and long-range learning as well as the extraction of low- and high-frequency features. First, the shared encoder is built on Transformers for long-range learning and comprises an intra-modal feature extraction block, an inter-modal feature extraction block, and a novel feature alignment block that handles slight misalignments. The private encoder, which extracts low- and high-frequency features, uses a CNN-based dual-stream architecture that includes a dual-domain selection mechanism and an invertible neural network. Second, we develop a cross-attention-based Swin Transformer block to explore cross-domain information; a weight transformation embedded in the Transformer block improves efficiency. Third, a unified loss function incorporating a dynamic weighting factor is formulated to capture the inherent commonalities of multi-modal images.
Qualitative and quantitative analysis of image fusion and object detection experiments shows that the proposed method preserves thermal targets and background texture details, and that it surpasses state-of-the-art alternatives in both fusion quality and performance on subsequent visual tasks.
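
The cross-attention idea named in the abstract can be illustrated with a minimal sketch. This is generic scaled dot-product cross-attention, not the authors' Swin Transformer block: it omits learned query/key/value projections, window partitioning, and the weight transformation, none of which are specified in the abstract. The function names and the toy token values are hypothetical. Tokens from one modality (here labelled infrared) act as queries, and tokens from the other (visible) supply keys and values, so each output token is a weighted mix of the other modality's features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention on plain Python lists.

    queries: tokens from modality A, shape (n_q, d)
    keys:    tokens from modality B, shape (n_k, d)
    values:  tokens from modality B, shape (n_k, d_v)
    Returns a list of n_q output tokens, each of length d_v.
    Assumption: no learned projections; inputs are used directly.
    """
    dim = len(keys[0])
    scale = math.sqrt(dim)
    outputs = []
    for q in queries:
        # Similarity of this query token to every key token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the other modality's value tokens.
        mixed = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


if __name__ == "__main__":
    # Hypothetical toy features: 2 infrared tokens, 3 visible tokens, d = 2.
    infrared_tokens = [[1.0, 0.0], [0.0, 1.0]]
    visible_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    fused = cross_attention(infrared_tokens, visible_tokens, visible_tokens)
    for token in fused:
        print(token)
```

In a fusion network the same operation would typically be applied in both directions (infrared attending to visible and vice versa) on learned projections of the features; the sketch shows only one direction.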

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11876600
DOI: http://dx.doi.org/10.1038/s41598-025-92054-0

Publication Analysis

Top Keywords

image fusion: 20
weight transformation: 8
image: 8
visual tasks: 8
multi-modal images: 8
long-range learning: 8
low- high-frequency: 8
high-frequency features: 8
feature extraction: 8
extraction block: 8

Similar Publications

Abnormality-aware multimodal learning for WSI classification.

Front Med (Lausanne)

February 2025

Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX, United States.

Whole slide images (WSIs) play a vital role in cancer diagnosis and prognosis. However, their gigapixel resolution, lack of pixel-level annotations, and reliance on unimodal visual data present challenges for accurate and efficient computational analysis. Existing methods typically divide WSIs into thousands of patches, which increases computational demands and makes it challenging to effectively focus on diagnostically relevant regions.

Exploring the effects of electron donor (D) and acceptor (A) functional groups in tuning the condensed state properties has been a challenging yet efficient approach to reveal promising materials for cutting-edge applications. Herein, a series of boron-nitrogen (BN) incorporated organic congeners (NBNMe2, NBOMe, NBF, NBCl, NBBr, NBCN, NBPy) appended with functional groups having various degrees of D/A characteristics was developed, and their potential in controlling supramolecular assembly and condensed state luminescence features (>90 nm redshift) was explored. Despite the minor structural engineering in BN-based small molecules, they effectively modulated conformational orientation and molecular packing, leading to the directed growth of distinct and highly ordered self-assembly patterns: nanosheets, nanospheres, nanowires, and nanorods.

Background: Ankle arthrodesis is the most frequently performed salvage procedure for pyogenic arthritis. However, its failed fusion rate of approximately 15% has been considered problematic. Herein, we present a case of pyogenic ankle arthritis successfully treated via a two-stage surgical procedure on the basis of the induced membrane technique.

Spontaneous psoas hematoma following posterior lumbar fusion surgery: a mini literature review.

BMC Musculoskelet Disord

March 2025

Department of Spine Surgery, Shanghai East Hospital, School of Medicine, Tongji University, Shanghai, 200092, China.

Background: Spontaneous psoas hematoma is a very rare clinical entity whose pathogenesis and pathologic mechanisms remain unclear, so it merits further exploration.

Case Presentation: We encountered a patient who developed femoral nerve paralysis due to psoas muscle hematoma following posterior lumbar fusion surgery. A 69-year-old female with lumbar spinal canal stenosis underwent posterior lumbar fusion at the L3-4 and L4-5 levels.

[18F]FDG is the most widely used PET radiopharmaceutical in oncology, but it is not free of diagnostic limitations. FAPI radiopharmaceuticals have emerged as a valuable tool in the management of several solid tumours for which [18F]FDG does not provide enough information. The aim of this work was to evaluate the available evidence on diagnostic and therapeutic applications of PET/CT with FAPI radiopharmaceuticals.
