We train prediction and survival models on multi-omics data for disease risk identification and stratification. Existing work on disease prediction focuses on risk analysis using datasets of individual data types (metabolomic, genomic, demographic), whereas our study builds an integrated model for disease risk assessment. We compare machine learning models, including Lasso regression, multi-layer perceptron, XGBoost, and AdaBoost, on multi-omics data, using ROC-AUC scores to compare performance across diseases and feature combinations. Additionally, we train a Cox proportional hazards model for each disease to perform survival analysis. Although integrating multi-omics data significantly improves risk prediction for 8 diseases, we find that the contribution of metabolomic data is marginal compared with standard demographic, genetic, and biomarker features. Nonetheless, metabolomics is a useful replacement for the standard biomarker panel when that panel is not readily available.
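The comparison of models across feature combinations described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the block names (`demographic`, `genetic`, `metabolomic`), the helper `compare_auc`, and all hyperparameters are hypothetical, and L1-penalized logistic regression stands in for a Lasso-style classifier. It uses scikit-learn only, so XGBoost and the Cox survival models are not covered.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 600

# Synthetic stand-ins for feature blocks (hypothetical; not the study's data).
blocks = {
    "demographic": rng.normal(size=(n, 3)),
    "genetic": rng.normal(size=(n, 5)),
    "metabolomic": rng.normal(size=(n, 10)),
}

# Binary outcome driven by one column of each block plus noise.
signal = (1.0 * blocks["demographic"][:, 0]
          + 0.8 * blocks["genetic"][:, 0]
          + 0.3 * blocks["metabolomic"][:, 0])
y = (signal + rng.normal(size=n) > 0).astype(int)

# Feature combinations to compare (hypothetical groupings).
feature_sets = {
    "demographic": ["demographic"],
    "demographic+genetic": ["demographic", "genetic"],
    "all": ["demographic", "genetic", "metabolomic"],
}

# Factories so each (feature set, model) pair gets a fresh estimator.
models = {
    "l1_logistic": lambda: make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    ),
    "mlp": lambda: make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    ),
    "adaboost": lambda: AdaBoostClassifier(n_estimators=50, random_state=0),
}


def compare_auc(blocks, y, feature_sets, models, seed=0):
    """Return held-out ROC-AUC for every (feature set, model) pair."""
    idx = np.arange(len(y))
    # One shared split so all pairs are scored on the same test samples.
    train_idx, test_idx = train_test_split(
        idx, test_size=0.3, random_state=seed, stratify=y
    )
    results = {}
    for fs_name, block_names in feature_sets.items():
        X = np.hstack([blocks[b] for b in block_names])
        for m_name, make_model in models.items():
            model = make_model()
            model.fit(X[train_idx], y[train_idx])
            scores = model.predict_proba(X[test_idx])[:, 1]
            results[(fs_name, m_name)] = roc_auc_score(y[test_idx], scores)
    return results


results = compare_auc(blocks, y, feature_sets, models)
for (fs_name, m_name), auc in sorted(results.items()):
    print(f"{fs_name:22s} {m_name:12s} AUC={auc:.3f}")
```

Using a single shared train/test split keeps the scores comparable across feature sets; a real analysis would more likely use cross-validation and report uncertainty on each AUC.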

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11042345
DOI: http://dx.doi.org/10.1101/2024.04.16.589819
