Multi-modal learning (e.g., integrating pathological images with genomic features) tends to improve the accuracy of cancer diagnosis and prognosis compared to learning from a single modality. However, missing data are common in clinical practice: not every patient has all modalities available. Most previous work simply discards samples with missing modalities, which loses the information in those samples and increases the likelihood of overfitting. In this work, we generalize multi-modal learning for cancer diagnosis to handle missing data, using histological images and genomic data. Our integrated model can utilize all available data from patients with both complete and partial modalities. Experiments on the public TCGA-GBM and TCGA-LGG datasets show that data with missing modalities can contribute to multi-modal learning, improving model performance in grade classification of glioma.
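The idea of using both complete and partial samples can be sketched as masked fusion over whichever modality embeddings are available. This is a hypothetical illustration of the general technique, not the paper's actual architecture; all names and shapes here are assumptions.

```python
import numpy as np

def fuse_available(embeddings, mask):
    """Average only the modalities each patient actually has.

    embeddings: (n_patients, n_modalities, dim) array of per-modality features
    mask:       (n_patients, n_modalities) array, 1 if the modality is present
    Returns a (n_patients, dim) fused representation; patients with a missing
    modality are averaged over the remaining ones instead of being discarded.
    """
    w = mask[:, :, None].astype(float)
    summed = (embeddings * w).sum(axis=1)
    counts = np.maximum(w.sum(axis=1), 1.0)  # avoid divide-by-zero if all missing
    return summed / counts

# One patient, two modalities (e.g., image and genomic embeddings):
emb = np.array([[[1.0, 1.0], [3.0, 3.0]]])
print(fuse_available(emb, np.array([[1, 1]])))  # both present -> [[2. 2.]]
print(fuse_available(emb, np.array([[1, 0]])))  # genomic missing -> [[1. 1.]]
```

Mean pooling over present modalities is only one option; attention-weighted fusion or learned imputation would follow the same masking pattern.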


Source

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9605813
DOI: http://dx.doi.org/10.1117/12.2612318

Publication Analysis

Top Keywords

multi-modal learning (16), missing data (12), cancer diagnosis (12), data (8), genomic data (8), images genomic (8), missing modalities (8), missing (5), multi-modal (4), learning missing (4)

Similar Publications

An empirical study of LLaMA3 quantization: from LLMs to MLLMs.

Vis Intell

December 2024

Department of Information Technology and Electrical Engineering, ETH Zurich, Sternwartstrasse 7, Zürich, Switzerland.

The LLaMA family, a collection of foundation language models ranging from 7B to 65B parameters, has become one of the most powerful open-source large language model (LLM) series and a popular LLM backbone for multi-modal large language models (MLLMs), widely used in computer vision and natural language understanding tasks. In particular, the recently released LLaMA3 models have achieved impressive performance across domains through very large-scale pre-training on over 15T tokens of data. Given the wide application of low-bit quantization for LLMs in resource-constrained scenarios, we explore LLaMA3's capabilities when quantized to low bit-width.


MMFuncPhos: A Multi-Modal Learning Framework for Identifying Functional Phosphorylation Sites and Their Regulatory Types.

Adv Sci (Weinh)

January 2025

Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China.

Protein phosphorylation plays a crucial role in regulating a wide range of biological processes, and its dysregulation is strongly linked to various diseases. While many phosphorylation sites have been identified so far, their functionality and regulatory effects remain largely unknown. Here, MMFuncPhos, a model built on a multi-modal deep learning framework, is developed to predict functional phosphorylation sites.


Integrating genetics, metabolites, and clinical characteristics in predicting cardiometabolic health outcomes using machine learning algorithms - A systematic review.

Comput Biol Med

January 2025

Hugh Sinclair Unit of Human Nutrition, Department of Food and Nutritional Sciences and Institute for Cardiovascular and Metabolic Research (ICMR), University of Reading, Reading, RG6 6DZ, UK; Institute for Food, Nutrition and Health (IFNH), University of Reading, Reading, RG6 6AH, UK. Electronic address:

Background: Machine learning (ML) models integrating clinical, metabolite, and genetic data have shown variable results in predicting cardiometabolic health (CMH) outcomes. Therefore, we aim to (1) evaluate whether a multi-modal approach incorporating all three data types in ML algorithms can improve CMH outcome prediction compared to single-modal or paired-modal models, and (2) compare the methodologies used in existing prediction models.

Methods: We systematically searched five databases from 1998 to 2024 for ML predictive modelling studies using the multi-modal approach for CMH outcomes.


Summary: With the increased reliance on multi-omics data for bulk and single cell analyses, the availability of robust approaches to perform unsupervised learning for clustering, visualization, and feature selection is imperative. We introduce nipalsMCIA, an implementation of multiple co-inertia analysis (MCIA) for joint dimensionality reduction that solves the objective function using an extension to Non-linear Iterative Partial Least Squares (NIPALS). We applied nipalsMCIA to both bulk and single cell datasets and observed significant speed-up over other implementations for data with a large sample size and/or feature dimension.
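The NIPALS iteration that nipalsMCIA extends can be sketched for a single principal component as follows. This is a generic illustration of the textbook NIPALS power-iteration scheme, not the nipalsMCIA code; names and defaults are assumptions.

```python
import numpy as np

def nipals_pc(X, tol=1e-8, max_iter=500):
    """First principal component of X via NIPALS power iteration.

    Alternates between updating loadings w from scores t and scores t from
    loadings w, until the score vector stabilizes. Returns (scores, loadings).
    """
    Xc = X - X.mean(axis=0)      # NIPALS assumes column-centered data
    t = Xc[:, 0].copy()          # initialize scores with an arbitrary column
    for _ in range(max_iter):
        w = Xc.T @ t / (t @ t)   # regress columns of Xc on current scores
        w /= np.linalg.norm(w)   # normalize loadings
        t_new = Xc @ w           # project data onto loadings to update scores
        if np.linalg.norm(t_new - t) < tol * np.linalg.norm(t_new):
            t = t_new
            break
        t = t_new
    return t, w

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))
scores, loadings = nipals_pc(X)
```

Subsequent components are extracted by deflating (subtracting the rank-one fit `t w.T`) and repeating; MCIA applies the same idea jointly across multiple data blocks.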


A Comparison Study of Person Identification Using IR Array Sensors and LiDAR.

Sensors (Basel)

January 2025

Faculty of Science and Technology, Keio University, Yokohama 223-8522, Japan.

Person identification is a critical task in applications such as security and surveillance, requiring reliable systems that perform robustly under diverse conditions. This study evaluates the Vision Transformer (ViT) and ResNet34 models across three modalities (RGB, thermal, and depth), using datasets collected with infrared array sensors and LiDAR sensors in controlled scenarios at varying resolutions (16 × 12 to 640 × 480), to explore their effectiveness in person identification. Preprocessing techniques, including YOLO-based cropping, were employed to improve subject isolation.

