Multimodal medical imaging plays a pivotal role in clinical diagnosis and research because it combines information from several imaging modalities to give a more comprehensive picture of the underlying pathology. Recently, deep learning-based multimodal fusion techniques have emerged as powerful tools for improving medical image classification. This review analyzes the developments in deep learning-based multimodal fusion for medical classification tasks. We explore the complementary relationships among prevalent clinical modalities and outline three main fusion schemes for multimodal classification networks: input fusion, intermediate fusion (encompassing single-level fusion, hierarchical fusion, and attention-based fusion), and output fusion. By evaluating the performance of these fusion techniques, we provide insight into the suitability of different network architectures for various multimodal fusion scenarios and application domains. Furthermore, we discuss challenges related to network architecture selection, the handling of incomplete multimodal data, and the potential limitations of multimodal fusion. Finally, we highlight the promising future of Transformer-based multimodal fusion techniques and give recommendations for future research in this rapidly evolving field.
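To make the three fusion schemes named above concrete, the following is a minimal sketch in plain Python, not the networks evaluated in the review. Everything in it is an illustrative assumption: the two generic modalities (`modality_a`, `modality_b`), the feature sizes, the randomly initialized dense layers, and the use of probability averaging for output fusion. Intermediate fusion is shown only in its single-level form; hierarchical and attention-based variants are not covered.

```python
import math
import random

def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(rng, n_in, n_out):
    """Random (untrained) weights and zero bias; stands in for a trained layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def input_fusion(a, b, params):
    # Fuse at the input: concatenate raw modality vectors, then use one network.
    hidden = relu(linear(a + b, *params["enc"]))
    return softmax(linear(hidden, *params["head"]))

def intermediate_fusion(a, b, params):
    # Single-level feature fusion: encode each modality separately,
    # concatenate the learned features, then classify jointly.
    feat_a = relu(linear(a, *params["enc_a"]))
    feat_b = relu(linear(b, *params["enc_b"]))
    return softmax(linear(feat_a + feat_b, *params["head"]))

def output_fusion(a, b, params):
    # Fuse at the decision level: each modality gets its own classifier,
    # and the class probabilities are averaged (one assumed combination rule).
    p_a = softmax(linear(relu(linear(a, *params["enc_a"])), *params["head_a"]))
    p_b = softmax(linear(relu(linear(b, *params["enc_b"])), *params["head_b"]))
    return [(x + y) / 2 for x, y in zip(p_a, p_b)]

# Hypothetical sizes: two modality feature vectors, one hidden layer, three classes.
N_A, N_B, N_HIDDEN, N_CLASSES = 6, 4, 5, 3
rng = random.Random(0)
modality_a = [rng.random() for _ in range(N_A)]
modality_b = [rng.random() for _ in range(N_B)]

input_params = {
    "enc": init_layer(rng, N_A + N_B, N_HIDDEN),
    "head": init_layer(rng, N_HIDDEN, N_CLASSES),
}
intermediate_params = {
    "enc_a": init_layer(rng, N_A, N_HIDDEN),
    "enc_b": init_layer(rng, N_B, N_HIDDEN),
    "head": init_layer(rng, 2 * N_HIDDEN, N_CLASSES),
}
output_params = {
    "enc_a": init_layer(rng, N_A, N_HIDDEN),
    "enc_b": init_layer(rng, N_B, N_HIDDEN),
    "head_a": init_layer(rng, N_HIDDEN, N_CLASSES),
    "head_b": init_layer(rng, N_HIDDEN, N_CLASSES),
}

p_input = input_fusion(modality_a, modality_b, input_params)
p_intermediate = intermediate_fusion(modality_a, modality_b, intermediate_params)
p_output = output_fusion(modality_a, modality_b, output_params)

print("input fusion:       ", p_input)
print("intermediate fusion:", p_intermediate)
print("output fusion:      ", p_output)
```

The three functions differ only in where the modalities meet: before any processing, after per-modality encoding, or after per-modality classification. With untrained weights the printed probabilities carry no diagnostic meaning; the sketch only shows the data flow.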

Source
http://dx.doi.org/10.1016/j.compbiomed.2024.108635
