Multimodal learning is expected to boost model performance by integrating information from different modalities. However, its potential is not fully exploited because the widely used joint training strategy, which applies a uniform objective to all modalities, leads to imbalanced and under-optimized uni-modal representations. Specifically, we point out that there often exists a modality with more discriminative information, e.g., vision for playing football and sound for blowing wind. Such a modality can dominate the joint training process, leaving the other modalities significantly under-optimized. To alleviate this problem, we first analyze the under-optimization phenomenon from both the feed-forward and the back-propagation stages of optimization. Then, On-the-fly Prediction Modulation (OPM) and On-the-fly Gradient Modulation (OGM) strategies are proposed to modulate the optimization of each modality by monitoring the discriminative discrepancy between modalities during training. Concretely, OPM weakens the influence of the dominant modality by dropping its feature with a dynamic probability in the feed-forward stage, while OGM mitigates its gradient in the back-propagation stage. In experiments, our methods demonstrate considerable improvement across a variety of multimodal tasks. These simple yet effective strategies enhance performance not only in vanilla and task-oriented multimodal models but also in more complex multimodal tasks, showcasing their effectiveness and flexibility.
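The two modulations described above can be illustrated with a minimal sketch. It assumes two modalities, each with per-sample confidence scores for the correct class, and it is not the paper's exact formulation: the function names, the tanh-based scaling, and the parameters `alpha` and `max_drop` are all hypothetical choices made for illustration only.

```python
import math
import random


def discrepancy_ratio(score_a, score_b, eps=1e-8):
    """Ratio of modality A's summed correct-class confidence to modality B's."""
    return (sum(score_a) + eps) / (sum(score_b) + eps)


def ogm_coefficients(score_a, score_b, alpha=1.0):
    """Gradient scaling factors (k_a, k_b) for the back-propagation stage.

    Assumption: only the dominant modality is scaled below 1, using a
    tanh of how far the discrepancy ratio exceeds 1.
    """
    rho = discrepancy_ratio(score_a, score_b)
    if rho > 1.0:  # modality A dominates
        return 1.0 - math.tanh(alpha * (rho - 1.0)), 1.0
    # modality B dominates (or the two are balanced)
    return 1.0, 1.0 - math.tanh(alpha * (1.0 / rho - 1.0))


def opm_drop_probabilities(score_a, score_b, alpha=1.0, max_drop=0.5):
    """Feature drop probabilities (p_a, p_b) for the feed-forward stage.

    Assumption: the more dominant a modality, the closer its drop
    probability gets to the hypothetical ceiling max_drop.
    """
    k_a, k_b = ogm_coefficients(score_a, score_b, alpha)
    return max_drop * (1.0 - k_a), max_drop * (1.0 - k_b)


def maybe_drop(feature, drop_prob, rng):
    """Return a zeroed copy of the feature with probability drop_prob."""
    if rng.random() < drop_prob:
        return [0.0] * len(feature)
    return list(feature)


# Example: the visual branch is far more confident than the audio branch.
visual_scores = [0.9, 0.8]
audio_scores = [0.3, 0.2]
k_visual, k_audio = ogm_coefficients(visual_scores, audio_scores)
p_visual, p_audio = opm_drop_probabilities(visual_scores, audio_scores)
rng = random.Random(0)
visual_feature = maybe_drop([0.5, -1.2, 0.7], p_visual, rng)
```

In a training loop, `k_visual` and `k_audio` would multiply the gradients of the corresponding encoders before the optimizer step, and both coefficients would be recomputed on every batch so that the modulation tracks the current discrepancy.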
DOI: http://dx.doi.org/10.1109/TPAMI.2024.3468315
Biomed Phys Eng Express
January 2025
Radiation Oncology, Emory University, Emory Midtown Hospital, Atlanta, Georgia, 30322, UNITED STATES.
Although radiotherapy techniques are the primary treatment for head and neck cancer (HNC), they are still associated with substantial toxicity and side effects. Machine learning (ML) based radiomics models for predicting toxicity mostly rely on features extracted from pre-treatment imaging data. This study aims to compare different models for predicting radiation-induced xerostomia and sticky saliva at both early and late stages in HNC patients, using CT and MRI image features along with demographic and dosimetric information.
Clin Oral Investig
January 2025
Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, 310006, China.
Objectives: To evaluate recent advances in the automatic multimodal registration of cone-beam computed tomography (CBCT) and intraoral scans (IOS) and their clinical significance in dentistry.
Methods: A comprehensive literature search was conducted in October 2024 across the PubMed, Web of Science, and IEEE Xplore databases, including studies that were published in the past decade. The inclusion criteria were as follows: English-language studies, randomized and nonrandomized controlled trials, cohort studies, case-control studies, cross-sectional studies, and retrospective studies.
Radiol Artif Intell
January 2025
Human Phenome Institute and Shanghai Pudong Hospital, Fudan University, Shanghai, China.
The released CMRxRecon2024 dataset is currently the largest and most protocol-diverse publicly available k-space dataset including multi-modality and multi-view cardiac MRI data from 330 healthy volunteers, and each one covers standardized and commonly used clinical protocols. ©RSNA, 2025.
iScience
February 2025
Division of Newborn Medicine, Department of Pediatrics, Boston Children's Hospital, Boston, MA, USA.
Neurodevelopmental impairments associated with congenital heart disease (CHD) may arise from perturbations in brain developmental pathways, including the formation of sulcal patterns. While genetic factors contribute to sulcal features, the association of noncoding de novo variants (ncDNVs) with sulcal patterns in people with CHD remains poorly understood. Leveraging deep learning models, we examined the predicted impact of ncDNVs on gene regulatory signals.