Predicting gene ontology from a global meta-analysis of 1-color microarray experiments.

BMC Bioinformatics

Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation 825 NE 13th Street, Oklahoma City, Oklahoma 73104-5005, USA.

Published: October 2011

Background: Global meta-analysis (GMA) of microarray data to identify genes with highly similar co-expression profiles is emerging as an accurate method to predict gene function and phenotype, even in the absence of published data on the gene(s) being analyzed. With a third of human genes still uncharacterized, this approach is a promising way to direct experiments and rapidly understand the biological roles of genes. To predict function for a gene of interest, GMA relies on a guilt-by-association approach: it identifies sets of genes with known functions that are consistently co-expressed with that gene across different experimental conditions, suggesting coordinated regulation for a specific biological purpose. Our goal here is to define how sample, dataset size and ranking parameters affect prediction performance.
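The guilt-by-association idea can be illustrated with a minimal sketch. It assumes Pearson correlation as the co-expression measure and a simple vote over the annotations of the top-k neighbours; the abstract does not state the ranking method actually used, and the gene names, GO labels, and function names below are hypothetical toy values.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(vx * vy)


def top_coexpressed(target, expression, k):
    """Return the k genes whose profiles correlate best with the target."""
    scores = [
        (pearson(expression[target], profile), gene)
        for gene, profile in expression.items()
        if gene != target
    ]
    scores.sort(reverse=True)
    return [gene for _, gene in scores[:k]]


def predict_terms(target, expression, annotations, k):
    """Count annotated terms among the top-k co-expressed genes."""
    neighbours = top_coexpressed(target, expression, k)
    counts = Counter(t for g in neighbours for t in annotations.get(g, ()))
    return counts.most_common()


# Toy data (hypothetical): rows are genes, columns are experiments.
expression = {
    "GENE_X": [1, 2, 3, 4, 5],
    "GENE_A": [2, 4, 6, 8, 10],
    "GENE_B": [1, 2, 3, 5, 4],
    "GENE_C": [5, 4, 3, 2, 1],
    "GENE_D": [3, 3, 3, 3, 3],
}
annotations = {
    "GENE_A": {"GO:toy_1"},
    "GENE_B": {"GO:toy_1", "GO:toy_2"},
    "GENE_C": {"GO:toy_3"},
}

print(predict_terms("GENE_X", expression, annotations, k=2))
```

In this toy case the uncharacterized GENE_X inherits candidate terms from GENE_A and GENE_B, its two closest neighbours. A real analysis would follow the vote with a significance test.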

Results: 13,000 human 1-color microarrays were downloaded from GEO for GMA. Prediction performance was benchmarked by calculating the distance within the Gene Ontology (GO) tree between predicted function and annotated function for sets of 100 randomly selected genes. We find the number of new predicted functions rises as more datasets are added, but begins to saturate at a sample size of approximately 2,000 experiments. For the gene set used to predict function, precision is higher with smaller set sizes, but recall is correspondingly poor. As set size increases, recall and F-measure also tend to increase, but at the cost of precision.
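The trade-off between precision, recall, and F-measure can be made concrete with a small sketch. It scores predictions by exact term match, which is a simplification: the study benchmarks by distance within the GO tree. The term labels and the function name are hypothetical.

```python
def precision_recall_f(predicted, annotated):
    """Precision, recall, and F-measure for two sets of terms (exact match)."""
    predicted, annotated = set(predicted), set(annotated)
    true_pos = len(predicted & annotated)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(annotated) if annotated else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical annotated terms for one gene.
annotated = {"t1", "t2", "t3", "t4"}

# A short prediction list: everything predicted is right, but most is missed.
small = precision_recall_f({"t1"}, annotated)

# A longer prediction list: more true terms recovered, more false ones too.
large = precision_recall_f({"t1", "t2", "t3", "x1", "x2", "x3"}, annotated)

print("small:", small)
print("large:", large)
```

The short list gives higher precision and lower recall than the long one, the same direction of trade-off the results describe for the size of the gene set used in prediction.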

Conclusions: Of the 20,813 genes expressed in 50 or more experiments, at least one predicted GO category was found for 72.5%. Of the 5,720 genes without GO annotation, 4,189 had at least one predicted ontology when the top 40 co-expressed genes were used for prediction analysis. Of the remaining 1,531 genes without GO predictions or annotations, ~17% (257 genes) had sufficient co-expression data yet no statistically significantly overrepresented ontologies, suggesting their regulation may be more complex.
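One common way to decide whether an ontology term is overrepresented among the top co-expressed genes is a hypergeometric tail test. The abstract does not name the test used, so the sketch below is an assumption. The population of 20,813 genes and the set size of 40 come from the text; the term size of 200 and the hit count of 5 are hypothetical.

```python
from math import comb


def hypergeom_tail(population, term_genes, drawn, hits):
    """P(X >= hits) when `drawn` genes are sampled without replacement
    from `population` genes, of which `term_genes` carry the term."""
    upper = min(term_genes, drawn)
    total = comb(population, drawn)
    favourable = sum(
        comb(term_genes, i) * comb(population - term_genes, drawn - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


# 20,813 expressed genes and a top-40 set (from the text); a term carried by
# 200 genes with 5 of them in the top 40 (hypothetical values).
p_value = hypergeom_tail(20813, 200, 40, 5)
print(f"p = {p_value:.3g}")
```

A gene with enough co-expression data but no term passing such a test at the chosen threshold would fall into the group of 257 genes described above. In practice the p-values would also need correction for the number of terms tested.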

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3236836
DOI: http://dx.doi.org/10.1186/1471-2105-12-S10-S14

Publication Analysis

Top Keywords

genes: 12
gene ontology: 8
global meta-analysis: 8
predict function: 8
function: 5
predicting gene: 4
ontology global: 4
meta-analysis 1-color: 4
1-color microarray: 4
experiments: 4

Similar Publications

BACKGROUND Limb-girdle muscular dystrophy recessive 1 (LGMDR1) is an autosomal recessive degenerative muscle disorder characterized by progressive muscular weakness caused by pathogenic variants in the CAPN3 gene. Desmoplastic small round cell tumors (DSRCT) are ultra-rare and aggressive soft tissue sarcomas usually in the abdominal cavity, molecularly characterized by the presence of a EWSR1::WT1 fusion transcript. Mouse models of muscular dystrophy, including LGMDR1, present an increased risk of soft tissue sarcomas.

Atherosclerotic vascular changes can begin during childhood, providing risk for cardiovascular disease (CVD) in adulthood. Identifiable risk factors such as dyslipidemia accelerate this process for some children. The apolipoprotein B (APOB) gene could help explain the inter-individual variability in lipid levels among young individuals and identify groups that require greater attention to prevent CVD.

Optimizing T cell inflamed signature through a combination biomarker approach for predicting immunotherapy response in NSCLC.

Sci Rep

December 2024

Interventional Oncology, Johnson & Johnson Enterprise Innovation, Inc, 10th Floor 255 Main St, 02142, Cambridge, Boston, MA, USA.

The introduction of anti-PD-1/PD-L1 therapies revolutionized treatment for advanced non-small cell lung cancer (NSCLC), yet response rates remain modest, underscoring the need for predictive biomarkers. While a T cell inflamed gene expression profile (GEP) has predicted anti-PD-1 response in various cancers, it failed in a large NSCLC cohort from the Stand Up To Cancer-Mark (SU2C-MARK) Foundation. Re-analysis revealed that while the T cell inflamed GEP alone was not predictive, its performance improved significantly when combined with gene signatures of myeloid cell markers.

Identification and validation of up-regulated TNFAIP6 in osteoarthritis with type 2 diabetes mellitus.

Sci Rep

December 2024

Division of Joint Surgery and Sports Medicine, Department of Orthopedic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China.

Several lines of evidence indicate that type 2 diabetes mellitus (T2DM) is an independent risk factor for osteoarthritis (OA) progression. However, studies of the relationship between T2DM and OA at the transcriptional level are lacking. We downloaded OA- and T2DM-related bulk RNA-sequencing and single-cell RNA-sequencing data from the Gene Expression Omnibus (GEO) database.

The western corn rootworm (WCR), Diabrotica virgifera virgifera LeConte, has evolved resistance to nearly every management tactic utilized in the field. This study investigated the resistance mechanisms in a WCR strain resistant to the Bacillus thuringiensis (Bt) protein eCry3.1Ab using dsRNA to knockdown WCR midgut genes previously documented to be associated with the resistance.
