Fusing features from different sources is central to many computer vision tasks. Existing approaches fall roughly into two categories: parameter-free operations and learnable operations. Parameter-free modules, however, are limited in their ability to benefit from offline learning, which leads to poor performance in some challenging situations. Learnable fusion methods are often costly in both memory and time, particularly when fusing features with different shapes. To address these shortcomings, we analyze the limitations of both kinds of fusion in depth. Based on our findings, we propose a generalized module named the Asymmetric Convolution Module (ACM). The module learns to encode effective priors during offline training and efficiently fuses feature maps with different shapes in specific tasks. Specifically, we propose a mathematically equivalent method for replacing costly convolutions on concatenated features, which can be widely applied to fuse feature maps of different shapes. Furthermore, unlike parameter-free operations, which can only fuse two features of the same type, ACM is general and flexible and can fuse multiple features of different types. To demonstrate the generality and efficiency of ACM, we integrate it into several state-of-the-art models on three representative vision tasks. Extensive experiments on these three tasks and several datasets show that the module brings significant improvements with noteworthy efficiency.
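The abstract's claim that a convolution on concatenated features can be replaced by a mathematically equivalent, cheaper operation can be illustrated with a small sketch. This is not the authors' implementation. It only demonstrates the linearity identity that such a replacement can rest on, under several assumptions: a 1x1 convolution without bias, one feature map `x`, and one per-channel vector `vec` (a feature with 1x1 spatial size) that the baseline tiles to the spatial size of `x` before concatenation. All function and variable names are hypothetical.

```python
import random


def conv1x1(weight, feat):
    """1x1 convolution: weight is [C_out][C_in], feat is [C_in][H][W]."""
    c_in, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [[[sum(weight[o][c] * feat[c][i][j] for c in range(c_in))
              for j in range(w)]
             for i in range(h)]
            for o in range(len(weight))]


def broadcast(vec, h, w):
    """Tile a per-channel vector [C] into a feature map [C][H][W]."""
    return [[[v] * w for _ in range(h)] for v in vec]


def fuse_by_concat(weight, x, vec):
    """Baseline: tile vec, concatenate along channels, then convolve."""
    h, w = len(x[0]), len(x[0][0])
    return conv1x1(weight, x + broadcast(vec, h, w))


def fuse_by_split(weight, x, vec):
    """Equivalent form: split the kernel by input channels, transform
    vec once, and add the result to every spatial position."""
    c_x = len(x)
    w_x = [row[:c_x] for row in weight]   # kernel slice acting on x
    w_v = [row[c_x:] for row in weight]   # kernel slice acting on vec
    out = conv1x1(w_x, x)
    offset = [sum(k * v for k, v in zip(row, vec)) for row in w_v]
    return [[[val + offset[o] for val in line] for line in out[o]]
            for o in range(len(out))]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two [C][H][W] maps."""
    return max(abs(p - q)
               for ch_a, ch_b in zip(a, b)
               for row_a, row_b in zip(ch_a, ch_b)
               for p, q in zip(row_a, row_b))


random.seed(0)
C_X, C_V, C_OUT, H, W = 3, 2, 4, 5, 6   # arbitrary illustrative sizes
x = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
     for _ in range(C_X)]
vec = [random.uniform(-1, 1) for _ in range(C_V)]
weight = [[random.uniform(-1, 1) for _ in range(C_X + C_V)]
          for _ in range(C_OUT)]

out_concat = fuse_by_concat(weight, x, vec)
out_split = fuse_by_split(weight, x, vec)
print("outputs agree:", max_abs_diff(out_concat, out_split) < 1e-9)
```

In this toy setting the baseline performs `C_OUT * (C_X + C_V) * H * W` multiplications, while the split form performs `C_OUT * C_X * H * W + C_OUT * C_V` and never materializes the tiled tensor. Whether ACM is built exactly this way is not stated in the abstract.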
Source (DOI): http://dx.doi.org/10.1109/TPAMI.2024.3400873
Sci Rep
January 2025
Shandong University of Science and Technology, Qingdao, 266590, China.
Explainability is a very active area of research in machine learning and image processing. This paper investigates the explainability of visible-light and infrared image fusion in order to make the model's behavior and its applications more credible. First, a multimodal image fusion model is proposed that combines the strength of convolutional neural networks (CNNs) in local context extraction with the global attention mechanism of the Transformer.
Sci Rep
January 2025
Hangzhou Xiangce Electronic Technology Co.Ltd, Hangzhou, 310018, China.
Accurately predicting the State of Health (SOH) of new energy vehicle batteries is critical for ensuring their reliable operation and extending their service life. To address the issue of low SOH prediction accuracy across different prediction lengths, this paper proposes a prediction method based on long-short-term battery degradation feature extraction and the FEA-TimeMixer model. First, a novel automatic SOH extraction algorithm for offline charging data is introduced to label the battery SOH degradation data.
Comput Med Imaging Graph
December 2024
Nantes Université, Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France.
Diffuse Large B-cell Lymphoma (DLBCL) is a lymphatic cancer of steadily growing incidence. Its diagnosis and follow-up rely on the analysis of clinical biomarkers and 18F-Fluorodeoxyglucose (FDG)-PET/CT images. In this context, we target the problem of assisting in the early identification of high-risk DLBCL patients from both images and tabular clinical data.
Acad Radiol
January 2025
Department of Radiology, Luzhou People's Hospital, Luzhou 646000, China (S.Z., J.C., A.R., X.Z., J.H., M.Y., F.W.).
Rationale and Objectives: Inflammation and immune biomarkers can promote angiogenesis, proliferation, and metastasis of esophageal squamous cell carcinoma (ESCC). The pathological grade reflects the tumor heterogeneity of ESCC. The purpose of this study is to develop and validate a nomogram based on enhanced CT multidimensional radiomics combined with the inflammatory immune score (IIS) for predicting poorly differentiated ESCC.
Med Biol Eng Comput
January 2025
Non-Invasive Imaging and Diagnostic Laboratory, Department of Applied Mechanics and Biomedical Engineering, Indian Institute of Technology Madras, Chennai, India.
Detection of early mild cognitive impairment (EMCI) is clinically challenging because it involves subtle alterations in multiple brain sub-anatomic regions. Among the different brain regions, the corpus callosum and lateral ventricles are the ones primarily affected by EMCI. In this study, an improved framework based on deep canonical correlation analysis (CCA) is proposed to fuse magnetic resonance (MR) image features from lateral ventricular and corpus callosal structures for the detection of EMCI.