Application of Unsupervised Multi-Omic Factor Analysis to Uncover Patterns of Variation and Molecular Processes Linked to Cardiovascular Disease.

J Vis Exp

Institute of Computational Biology, German Research Center for Environmental Health, Helmholtz Zentrum München; Department of Computer Science, TUM School of Computation, Information and Technology, Technical University of Munich; German Centre for Cardiovascular Research (DZHK), partner site Munich Heart Alliance

Published: September 2024

Disease mechanisms are usually complex and governed by the interaction of several distinct molecular processes. Complex, multidimensional datasets are a valuable resource for gaining insight into those processes, but their analysis can be challenging because of the high dimensionality that results, for example, from different disease conditions, time points, and omics layers capturing the process at different resolutions. Here, we showcase an approach to analyze and explore such a complex multi-omics dataset in an unsupervised way by applying Multi-Omics Factor Analysis (MOFA) to a dataset generated from blood samples that capture the immune response in acute and chronic coronary syndromes. The dataset consists of several assays at differing resolutions, including sample-level cytokine data, plasma proteomics, neutrophil prime-seq, and single-cell RNA-seq (scRNA-seq) data. Further complexity is added by several time points measured per patient and by several patient subgroups. The analysis workflow outlines how to integrate and analyze the data in three steps: (1) data pre-processing and harmonization, (2) estimation of the MOFA model, and (3) downstream analysis. Step 1 outlines how to process the features of the different data types, filter out low-quality features, and normalize them to harmonize their distributions for further analysis. Step 2 shows how to apply the MOFA model and explore the major sources of variance within the dataset across all omics and features. Step 3 presents several strategies for the downstream analysis of the captured patterns, linking them to the disease conditions and to the molecular processes potentially governing those conditions.
Overall, we present a workflow for unsupervised exploration of complex multi-omics datasets that identifies major axes of variation composed of differing molecular features. The workflow can also be applied to other contexts and multi-omics datasets, including assays other than those presented in the exemplary use case.
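
The first two steps of the workflow can be illustrated with a minimal sketch. It is not the authors' protocol and it does not call the MOFA software: the function names `preprocess_view` and `variance_explained_per_view`, the thresholds `max_missing` and `min_var`, and the use of a plain SVD on the concatenated views are all assumptions made for illustration. MOFA itself fits a probabilistic factor model, so the SVD here is only a stand-in that shows the idea of shared factors and of the variance each factor explains in each omics view.

```python
import numpy as np


def preprocess_view(X, max_missing=0.5, min_var=1e-8):
    """Filter and standardize one omics view (samples x features).

    Hypothetical helper: thresholds are illustrative, not from the protocol.
    Returns the z-scored matrix and the indices of the retained features.
    """
    X = np.asarray(X, dtype=float)
    # Drop features that are missing in too many samples.
    keep = np.isnan(X).mean(axis=0) <= max_missing
    X = X[:, keep]
    # Drop (near-)constant features, which carry no variation to model.
    nonconst = np.nanvar(X, axis=0) > min_var
    X = X[:, nonconst]
    # Z-score each feature so that views share a comparable scale.
    Z = (X - np.nanmean(X, axis=0)) / np.nanstd(X, axis=0)
    kept_idx = np.flatnonzero(keep)[nonconst]
    return Z, kept_idx


def variance_explained_per_view(views, n_factors=2):
    """Stand-in for a factor model: SVD of the concatenated views.

    `views` maps a view name to a z-scored (samples x features) matrix with
    the same samples in the same order. Returns the factor scores and, per
    view, the fraction of variance each factor explains in that view.
    """
    # Simplification: remaining missing values are set to 0, which is the
    # feature mean after z-scoring.
    filled = {name: np.nan_to_num(Z, nan=0.0) for name, Z in views.items()}
    concat = np.hstack(list(filled.values()))
    U, S, Vt = np.linalg.svd(concat, full_matrices=False)
    factors = U[:, :n_factors] * S[:n_factors]

    r2 = {}
    start = 0
    for name, Zf in filled.items():
        stop = start + Zf.shape[1]
        weights = Vt[:n_factors, start:stop]  # loadings of this view
        total = (Zf ** 2).sum()
        r2[name] = [
            1.0 - ((Zf - np.outer(factors[:, k], weights[k])) ** 2).sum() / total
            for k in range(n_factors)
        ]
        start = stop
    return factors, r2


if __name__ == "__main__":
    # Synthetic example: two views driven by one shared latent signal.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(30, 1))
    cytokines = latent @ rng.normal(size=(1, 5)) + 0.5 * rng.normal(size=(30, 5))
    proteins = latent @ rng.normal(size=(1, 8)) + 0.5 * rng.normal(size=(30, 8))
    views = {
        "cytokines": preprocess_view(cytokines)[0],
        "proteins": preprocess_view(proteins)[0],
    }
    factors, r2 = variance_explained_per_view(views, n_factors=2)
    for name, values in r2.items():
        print(name, [round(float(v), 3) for v in values])
```

A per-view table of explained variance like the one printed here is the kind of summary that Step 2 uses to find the major sources of variation. In Step 3 the factor scores would then be compared against sample annotations such as disease condition or time point.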

Source
http://dx.doi.org/10.3791/66659
