MOGSA: Integrative Single Sample Gene-set Analysis of Multiple Omics Data.

Mol Cell Proteomics

Department of Data Science, Division of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts 02215; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02215.

Published: August 2019

Gene-set analysis (GSA) summarizes individual molecular measurements into more interpretable pathways or gene-sets and has become an indispensable step in the interpretation of large-scale omics data. However, GSA methods are limited to the analysis of single omics data. Here, we introduce a new computational method termed multi-omics gene-set analysis (MOGSA), a multivariate single sample gene-set analysis method that integrates multiple experimental and molecular data types measured over the same set of samples. The method learns a low-dimensional representation of the most variable, correlated features (genes, proteins, etc.) across multiple omics data sets, transforms the features onto the same scale, and calculates an integrated gene-set score from the most informative features in each data type. MOGSA does not require filtering data to the intersection of features (gene IDs); therefore, all molecular features, including those that lack annotation, may be included in the analysis. Using simulated data, we demonstrate that integrating multiple diverse sources of molecular data increases the power to discover subtle changes in gene-sets and may reduce the impact of unreliable information in any single data type. Using real experimental data, we demonstrate three use cases of MOGSA. First, we show how to remove a source of noise (technical or biological) in integrative MOGSA of NCI60 transcriptome and proteome data. Second, we apply MOGSA to discover similarities and differences in mRNA, protein, and phosphorylation profiles of a small study of stem cell lines and assess the influence of each data type or feature on the total gene-set score. Finally, we apply MOGSA to cluster analysis and show that three molecular subtypes are robustly discovered when copy number variation and mRNA data of 308 bladder cancers from The Cancer Genome Atlas are integrated using MOGSA. MOGSA is available in the Bioconductor R package "mogsa."
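The idea summarized in the abstract (put several omics tables on a common scale, learn a shared low-dimensional representation, then sum the reconstructed values of a gene-set's features per sample) can be illustrated with a toy sketch. This is not the mogsa package's implementation: the function names, the Frobenius-norm block scaling, the use of only one latent component, and the toy data are all assumptions made for illustration.

```python
import math

def center_and_scale(block):
    """Center each feature across samples, then divide the whole block by its
    Frobenius norm so that blocks of different sizes/units become comparable
    (an assumed, simplified weighting scheme)."""
    centered = {}
    for feat, values in block.items():
        mean = sum(values) / len(values)
        centered[feat] = [v - mean for v in values]
    norm = math.sqrt(sum(v * v for row in centered.values() for v in row))
    if norm == 0:
        return centered
    return {feat: [v / norm for v in row] for feat, row in centered.items()}

def leading_sample_component(rows, n_samples, n_iter=200):
    """Power iteration for the leading unit-length sample-space vector of the
    stacked (features x samples) matrix."""
    v = [float(j + 1) for j in range(n_samples)]  # deterministic, non-constant start
    for _ in range(n_iter):
        u = [sum(x * y for x, y in zip(row, v)) for row in rows]      # u = X v
        w = [sum(row[j] * u[i] for i, row in enumerate(rows))
             for j in range(n_samples)]                                # w = X^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

def gene_set_scores(blocks, gene_set):
    """Per-sample gene-set score: sum of rank-1 reconstructed values of the
    gene-set's features, pooled over all blocks. Also returns each block's
    contribution to the total score."""
    first_block = next(iter(blocks.values()))
    n_samples = len(next(iter(first_block.values())))
    scaled = {name: center_and_scale(b) for name, b in blocks.items()}
    rows = [row for b in scaled.values() for row in b.values()]
    v = leading_sample_component(rows, n_samples)

    scores = [0.0] * n_samples
    contributions = {}
    for name, b in scaled.items():
        contrib = [0.0] * n_samples
        for feat, row in b.items():
            if feat in gene_set:
                loading = sum(x * y for x, y in zip(row, v))  # feature loading on the component
                for j in range(n_samples):
                    contrib[j] += loading * v[j]              # rank-1 reconstruction
        contributions[name] = contrib
        scores = [s + c for s, c in zip(scores, contrib)]
    return scores, contributions

# Hypothetical toy data: 4 samples; features need not overlap between blocks.
blocks = {
    "mrna": {"A": [2.0, 2.2, 0.1, 0.0], "B": [1.8, 2.1, 0.2, 0.1], "C": [0.5, 0.4, 0.6, 0.5]},
    "protein": {"A": [1.0, 1.1, 0.0, 0.1], "D": [0.3, 0.2, 0.3, 0.2]},
}
scores, contributions = gene_set_scores(blocks, gene_set={"A", "B"})
print(scores)
print(contributions)
```

In this sketch the blocks do not share an identical feature list, and the per-block contributions show how much each data type adds to the total score, mirroring the decomposition described in the abstract in a much reduced form.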

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6692785
DOI: http://dx.doi.org/10.1074/mcp.TIR118.001251
