ADAGE signature analysis: differential expression analysis with data-defined gene sets.

BMC Bioinformatics

Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA.

Published: November 2017

Background: Gene set enrichment analysis and overrepresentation analysis are commonly used to determine the biological processes affected in a differential expression experiment. These approaches require biologically relevant gene sets, which are currently curated manually; this limits their availability and accuracy in the many organisms that lack extensively curated resources. New feature learning approaches can now be paired with existing data collections to extract functional gene sets directly from big data.
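To make the overrepresentation idea concrete, the following is a minimal sketch of the standard hypergeometric upper-tail test, which asks whether a curated gene set contains more differentially expressed genes than chance would predict. The function name and its parameters are illustrative assumptions and are not taken from the paper or from any particular tool.

```python
from math import comb


def overrepresentation_pvalue(universe_size, set_size, n_de, overlap):
    """Upper-tail hypergeometric p-value (hypothetical helper).

    universe_size: number of genes measured in the experiment
    set_size:      number of genes in the curated gene set
    n_de:          number of differentially expressed genes
    overlap:       number of differentially expressed genes inside the set
    Returns P(X >= overlap) when n_de genes are drawn at random.
    """
    total = comb(universe_size, n_de)
    upper = min(set_size, n_de)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, n_de - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Example with made-up numbers: 5,000 genes, a 40-gene set,
# 200 differentially expressed genes, 8 of them in the set.
p = overrepresentation_pvalue(5000, 40, 200, 8)
print(f"p = {p:.3g}")
```

A small p-value suggests the set is enriched among the differentially expressed genes. The test depends entirely on the gene set being available, which is the limitation the abstract highlights.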

Results: Here we introduce a method to identify perturbed processes. In contrast with methods that use curated gene sets, this approach uses signatures extracted from public expression data. We first extract expression signatures from public data using ADAGE, a neural network-based feature extraction approach. We next identify signatures that are differentially active under a given treatment. Our results demonstrate that these signatures represent biological processes that are perturbed by the experiment. Because these signatures are learned directly from data without supervision, they can identify uncurated or novel biological processes. We implemented ADAGE signature analysis for the bacterial pathogen Pseudomonas aeruginosa. For the convenience of different user groups, we implemented both an R package (ADAGEpath) and a web server (http://adage.greenelab.com) to run these analyses. Both are open-source to allow easy expansion to other organisms or signature generation methods. We applied ADAGE signature analysis to an example dataset in which wild-type and ∆anr mutant cells were grown as biofilms on cystic fibrosis genotype bronchial epithelial cells. We mapped active signatures in the dataset to KEGG pathways and compared them with pathways identified using GSEA. The two approaches generally return consistent results; however, ADAGE signature analysis also identified a signature that revealed the molecularly supported link between the MexT regulon and Anr.
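The two-step workflow described above (score each signature in each sample, then find signatures whose activity differs between conditions) can be sketched as follows. This is an illustration under stated assumptions, not the ADAGEpath API: the function names, the weighted-sum activity score, and the use of Welch's t statistic as the differential activity test are all assumptions, since the abstract does not specify these details, and the expression values are invented.

```python
from math import sqrt
from statistics import mean, variance


def signature_activity(expression, weights):
    """Assumed activity score: weighted sum of expression over signature genes."""
    return sum(w * expression[gene] for gene, w in weights.items())


def welch_t(a, b):
    """Welch's t statistic for two groups of activity values (stand-in test)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_signatures(group_a, group_b, signatures):
    """Return (signature name, t) pairs sorted by decreasing |t|."""
    results = []
    for name, weights in signatures.items():
        act_a = [signature_activity(s, weights) for s in group_a]
        act_b = [signature_activity(s, weights) for s in group_b]
        results.append((name, welch_t(act_a, act_b)))
    return sorted(results, key=lambda r: abs(r[1]), reverse=True)


# Hypothetical signatures (gene -> weight) and made-up expression values.
signatures = {"sig1": {"g1": 1.0, "g2": 0.5}, "sig2": {"g3": 1.0}}
wild_type = [
    {"g1": 1.0, "g2": 2.0, "g3": 5.0},
    {"g1": 1.2, "g2": 2.2, "g3": 5.1},
    {"g1": 0.8, "g2": 1.8, "g3": 4.9},
]
mutant = [
    {"g1": 3.0, "g2": 2.0, "g3": 5.0},
    {"g1": 3.2, "g2": 2.2, "g3": 5.2},
    {"g1": 2.8, "g2": 1.8, "g3": 4.8},
]

for name, t in rank_signatures(wild_type, mutant, signatures):
    print(name, round(t, 2))
```

In this toy data the first signature separates the two groups while the second does not, so it ranks first. A real analysis would use signatures learned from a public expression compendium and a test that corrects for multiple comparisons.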

Conclusions: We designed ADAGE signature analysis to perform gene set analysis using data-defined functional gene signatures. This approach addresses an important gap for biologists studying non-traditional model organisms and other organisms that lack extensive curated resources. We built both an R package and a web server to provide ADAGE signature analysis to the community.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5700673
DOI: http://dx.doi.org/10.1186/s12859-017-1905-4

Similar Publications

Maintenance of periodontal health or transition to a periodontal lesion reflects the continuous and ongoing battle between the vast microbial ecology in the oral cavity and the array of resident and emigrating inflammatory/immune cells in the periodontium. This war clearly signifies many 'battlefronts' representing the interface of the mucosal-surface cells with the dynamic biofilms composed of commensal and potential pathogenic species, as well as more recent knowledge demonstrating active invasion of cells and tissues of the periodontium leading to skirmishes in connective tissue, the locality of bone and even in the local vasculature. Research in the discipline has uncovered a concerted effort of the microbiome, using an array of survival strategies, to interact with other bacteria and host cells.

Cross-experiment comparisons in public data compendia are challenged by unmatched conditions and technical noise. The ADAGE method, which performs unsupervised integration with denoising autoencoder neural networks, can identify biological patterns, but because ADAGE models, like many neural networks, are over-parameterized, different ADAGE models perform equally well. To enhance model robustness and better build signatures consistent with biological pathways, we developed an ensemble ADAGE (eADAGE) that integrated stable signatures across models.

New targets for glycosaminoglycans and glycosaminoglycans as novel targets.

Expert Rev Proteomics

February 2013

ProtAffin Biotechnologie AG, Reininghausstrasse 13a, 8020 Graz, Austria.

Biological functions of a variety of proteins are mediated via their interaction with glycosaminoglycans (GAGs). The structural diversity within the wide GAG landscape provides individual interaction sites for a multitude of proteins involved in several pathophysiological processes. This 'GAG angle' of such proteins as well as their specific GAG ligands give rise to novel therapeutic concepts for drug development.