Multi-modal learning (e.g., integrating pathological images with genomic features) tends to improve the accuracy of cancer diagnosis and prognosis compared to learning from a single modality. However, missing data are common in clinical practice: not every patient has all modalities available. Most previous work simply discards samples with missing modalities, which loses the information in those samples and increases the likelihood of overfitting. In this work, we generalize multi-modal learning for cancer diagnosis to handle missing data, using histological images and genomic data. Our integrated model can utilize all available data from patients with both complete and partial modalities. Experiments on the public TCGA-GBM and TCGA-LGG datasets show that data with missing modalities can contribute to multi-modal learning, improving model performance in grade classification of glioma.
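The idea of using both complete and partial samples can be sketched as masked fusion over whichever modality embeddings are available. This is a hypothetical illustration of the general technique, not the paper's actual architecture; all names and shapes here are assumptions.

```python
import numpy as np

def fuse_available(embeddings, mask):
    """Average only the modalities each patient actually has.

    embeddings: (n_patients, n_modalities, dim) array of per-modality features
    mask:       (n_patients, n_modalities) array, 1 if the modality is present
    Returns a (n_patients, dim) fused representation; patients with a missing
    modality are averaged over the remaining ones instead of being discarded.
    """
    w = mask[:, :, None].astype(float)
    summed = (embeddings * w).sum(axis=1)
    counts = np.maximum(w.sum(axis=1), 1.0)  # avoid divide-by-zero if all missing
    return summed / counts

# One patient, two modalities (e.g., image and genomic embeddings):
emb = np.array([[[1.0, 1.0], [3.0, 3.0]]])
print(fuse_available(emb, np.array([[1, 1]])))  # both present -> [[2. 2.]]
print(fuse_available(emb, np.array([[1, 0]])))  # genomic missing -> [[1. 1.]]
```

Mean pooling over present modalities is only one option; attention-weighted fusion or learned imputation would follow the same masking pattern.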


Source

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9605813
DOI: http://dx.doi.org/10.1117/12.2612318

Publication Analysis

Top Keywords

multi-modal learning (16), missing data (12), cancer diagnosis (12), data (8), genomic data (8), images genomic (8), missing modalities (8), missing (5), multi-modal (4), learning missing (4)

Similar Publications

An empirical study of LLaMA3 quantization: from LLMs to MLLMs.

Vis Intell

December 2024

Department of Information Technology and Electrical Engineering, ETH Zurich, Sternwartstrasse 7, Zürich, Switzerland.

The LLaMA family, a collection of foundation language models ranging from 7B to 65B parameters, has become one of the most powerful open-source large language model (LLM) series and a popular LLM backbone for multi-modal large language models (MLLMs), widely used in computer vision and natural language understanding tasks. In particular, the recently released LLaMA3 models have achieved impressive performance across domains through very large-scale pre-training on over 15T tokens of data. Given the wide application of low-bit quantization for LLMs in resource-constrained scenarios, we explore LLaMA3's capabilities when quantized to low bit-width.


MMFuncPhos: A Multi-Modal Learning Framework for Identifying Functional Phosphorylation Sites and Their Regulatory Types.

Adv Sci (Weinh)

January 2025

Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China.

Protein phosphorylation plays a crucial role in regulating a wide range of biological processes, and its dysregulation is strongly linked to various diseases. While many phosphorylation sites have been identified so far, their functionality and regulatory effects remain largely unknown. Here, MMFuncPhos, a model built on a multi-modal deep learning framework, is developed to predict functional phosphorylation sites.


Integrating genetics, metabolites, and clinical characteristics in predicting cardiometabolic health outcomes using machine learning algorithms - A systematic review.

Comput Biol Med

January 2025

Hugh Sinclair Unit of Human Nutrition, Department of Food and Nutritional Sciences and Institute for Cardiovascular and Metabolic Research (ICMR), University of Reading, Reading, RG6 6DZ, UK; Institute for Food, Nutrition and Health (IFNH), University of Reading, Reading, RG6 6AH, UK. Electronic address:

Background: Machine learning (ML) models integrating clinical, metabolite, and genetic data have shown variable results in predicting cardiometabolic health (CMH) outcomes. Therefore, we aim to (1) evaluate whether a multi-modal approach incorporating all three data types in ML algorithms can improve CMH outcome prediction compared to single-modal or paired-modal models, and (2) compare the methodologies used in existing prediction models.

Methods: We systematically searched five databases from 1998 to 2024 for ML predictive modelling studies using the multi-modal approach for CMH outcomes.


Summary: With the increased reliance on multi-omics data for bulk and single cell analyses, the availability of robust approaches to perform unsupervised learning for clustering, visualization, and feature selection is imperative. We introduce nipalsMCIA, an implementation of multiple co-inertia analysis (MCIA) for joint dimensionality reduction that solves the objective function using an extension to Non-linear Iterative Partial Least Squares (NIPALS). We applied nipalsMCIA to both bulk and single cell datasets and observed significant speed-up over other implementations for data with a large sample size and/or feature dimension.
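The NIPALS iteration that nipalsMCIA extends can be sketched for a single principal component as follows. This is a generic illustration of the textbook NIPALS power-iteration scheme, not the nipalsMCIA code; names and defaults are assumptions.

```python
import numpy as np

def nipals_pc(X, tol=1e-8, max_iter=500):
    """First principal component of X via NIPALS power iteration.

    Alternates between updating loadings w from scores t and scores t from
    loadings w, until the score vector stabilizes. Returns (scores, loadings).
    """
    Xc = X - X.mean(axis=0)      # NIPALS assumes column-centered data
    t = Xc[:, 0].copy()          # initialize scores with an arbitrary column
    for _ in range(max_iter):
        w = Xc.T @ t / (t @ t)   # regress columns of Xc on current scores
        w /= np.linalg.norm(w)   # normalize loadings
        t_new = Xc @ w           # project data onto loadings to update scores
        if np.linalg.norm(t_new - t) < tol * np.linalg.norm(t_new):
            t = t_new
            break
        t = t_new
    return t, w

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))
scores, loadings = nipals_pc(X)
```

Subsequent components are extracted by deflating (subtracting the rank-one fit `t w.T`) and repeating; MCIA applies the same idea jointly across multiple data blocks.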


A Comparison Study of Person Identification Using IR Array Sensors and LiDAR.

Sensors (Basel)

January 2025

Faculty of Science and Technology, Keio University, Yokohama 223-8522, Japan.

Person identification is a critical task in applications such as security and surveillance, requiring reliable systems that perform robustly under diverse conditions. This study evaluates the Vision Transformer (ViT) and ResNet34 models across three modalities (RGB, thermal, and depth), using datasets collected with infrared array sensors and LiDAR sensors in controlled scenarios at varying resolutions (16 × 12 to 640 × 480), to explore their effectiveness in person identification. Preprocessing techniques, including YOLO-based cropping, were employed to improve subject isolation.

