An integrative modular approach to systematically predict gene-phenotype associations.

BMC Bioinformatics

Program in Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles CA 90089, USA.

Published: January 2010

Background: Complex human diseases are often caused by multiple mutations, each of which contributes only a minor effect to the disease phenotype. To study the basis for these complex phenotypes, we developed a network-based approach to identify coexpression modules specifically activated in particular phenotypes. We integrated these modules, protein-protein interaction data, Gene Ontology annotations, and our database of gene-phenotype associations derived from literature to predict novel human gene-phenotype associations. Our systematic predictions provide us with the opportunity to perform a global analysis of human gene pleiotropy and its underlying regulatory mechanisms.
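As a rough illustration of what "coexpression" means here, the sketch below links two genes when their expression profiles are strongly correlated across samples. The abstract does not state which similarity measure or cutoff the authors used, so the Pearson correlation, the 0.9 threshold, the function names, and the toy expression values are all assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def coexpressed_pairs(expression, threshold=0.9):
    """Return gene pairs whose correlation is at least `threshold`.

    `expression` maps a gene name to its expression values across samples.
    The threshold is a hypothetical cutoff, not a value from the paper.
    """
    pairs = []
    for gene_1, gene_2 in combinations(sorted(expression), 2):
        if pearson(expression[gene_1], expression[gene_2]) >= threshold:
            pairs.append((gene_1, gene_2))
    return pairs


# Toy data (invented): geneA and geneB rise together, geneC falls.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.1],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

print(coexpressed_pairs(toy_expression))
```

Pairs that pass the cutoff can be treated as edges of a coexpression network, from which modules (groups of densely connected genes) would then be extracted by a separate step not shown here.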

Results: We applied this method to 338 microarray datasets, covering 178 phenotype classes, and identified 193,145 phenotype-specific coexpression modules. We trained random forest classifiers for each phenotype and predicted a total of 6,558 gene-phenotype associations. We showed that 40.9% of genes are pleiotropic, highlighting that pleiotropy is more prevalent than previously expected. We collected 77 ChIP-chip datasets studying 69 transcription factors binding over 16,000 targets under various phenotypic conditions. Utilizing this unique data source, we confirmed that dynamic transcriptional regulation is an important force driving the formation of phenotype-specific gene modules.
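The pleiotropy figure can be read as a simple proportion over the predicted associations. The sketch below assumes that a gene counts as pleiotropic when it is associated with more than one distinct phenotype; the abstract does not give the authors' exact criterion, and the function name and toy associations are invented for illustration.

```python
from collections import defaultdict


def pleiotropic_fraction(associations):
    """Fraction of genes linked to more than one distinct phenotype.

    `associations` is an iterable of (gene, phenotype) pairs.
    Duplicate pairs are counted once.
    """
    phenotypes_by_gene = defaultdict(set)
    for gene, phenotype in associations:
        phenotypes_by_gene[gene].add(phenotype)
    if not phenotypes_by_gene:
        return 0.0
    n_pleiotropic = sum(
        1 for phenotypes in phenotypes_by_gene.values() if len(phenotypes) > 1
    )
    return n_pleiotropic / len(phenotypes_by_gene)


# Toy associations (invented): geneA and geneD each have two phenotypes.
toy_associations = [
    ("geneA", "phenotype1"),
    ("geneA", "phenotype2"),
    ("geneB", "phenotype1"),
    ("geneC", "phenotype3"),
    ("geneC", "phenotype3"),  # duplicate pair, counted once
    ("geneD", "phenotype2"),
    ("geneD", "phenotype4"),
]

print(pleiotropic_fraction(toy_associations))
```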

Conclusion: We created a genome-wide gene-to-phenotype mapping with many potential applications, including suggesting new drug targets and uncovering the basis for human disease phenotypes. Our analysis of these phenotype-specific coexpression modules reveals a high prevalence of gene pleiotropy, and suggests that phenotype-specific transcription factor binding may contribute to phenotypic diversity. All resources from our study are made freely available on our online Phenotype Prediction Database.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009536
DOI: http://dx.doi.org/10.1186/1471-2105-11-S1-S62

Publication Analysis

Top Keywords

gene-phenotype associations: 16
coexpression modules: 12
gene pleiotropy: 8
phenotype-specific coexpression: 8
phenotype: 6
gene: 5
integrative modular: 4
modular approach: 4
approach systematically: 4
systematically predict: 4

Similar Publications

Objective: 3p deletion syndrome is a rare monosomal disease that encompasses deletions throughout the short arm of chromosome 3. The deletion is often in the distal region (3p25-pter), but breakpoints vary and the clinical manifestation is complex, with congenital heart defects being considered rare. We present the first case of hypoplastic left heart syndrome and minor dysmorphic features associated with 3p- syndrome.

Organophosphate and pyrethroid pesticides are common contaminants in cannabis. Due to the status of cannabis as an illicit Schedule I substance at the federal level, there are no unified national guidelines in the U.S.

Background: Mining functional gene modules from genomic data is an important step to detect gene members of pathways or other relations such as protein-protein interactions. This work explores the plausibility of detecting functional gene modules by factorizing the gene-phenotype association matrix derived from phenotype ontology data rather than the conventionally used gene expression data. To date, the hierarchical structure of phenotype ontologies has not been sufficiently utilized in gene clustering, even though functionally related genes are consistently associated with phenotypes on the same path in phenotype ontologies.

Six genetic variants are associated with cardiovascular disease independently from canonical risk factors: a new method to refine GWAS results based on the UKBiobank phenotype database.

Mol Genet Genomics

December 2024

Department of Health Promotion, Maternal and Child Care, Internal Medicine and Medical Specialities "G. D'Alessandro" (PROMISE), University of Palermo, Via del Vespro 129, Palermo, 90127, Italy.

Article Synopsis
  • This paper presents a new method using GWAS filtering to identify novel phenotypes associated with genetic loci, focusing on cardiovascular disease (CVD) using UK Biobank data.
  • The study employs an automated routine to analyze associations between various phenotypes and single nucleotide polymorphisms (SNPs), identifying six gene variants linked to CVD that work independently of known risk factors.
  • The research not only highlights new gene-phenotype associations but also explores potential mechanisms explaining how these genetic variants contribute to cardiovascular disease.

Background And Aims: Stroke is a leading cause of mortality and morbidity in Bangladesh. It is estimated that genetic determinants account for around 40%-60% of its etiology, similar to environmental factors. This study aimed to provide a better understanding of the genetic, environmental, and clinical risk factors in stroke patients from Bangladesh.
