Background: The analysis of high-throughput gene expression data with respect to sets of genes rather than individual genes has many advantages. A variety of methods have been developed for assessing the enrichment of sets of genes with respect to differential expression. In this paper we provide a comparative study of four of these methods: Fisher's exact test, Gene Set Enrichment Analysis (GSEA), Random-Sets (RS), and Gene List Analysis with Prediction Accuracy (GLAPA). The first three methods use associative statistics, while the fourth uses predictive statistics. We first compare all four methods on simulated data sets to verify that Fisher's exact test is markedly worse than the other three approaches. We then validate the other three methods on seven real data sets with known genetic perturbations and then compare the methods on two cancer data sets where our a priori knowledge is limited.
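As a rough illustration of the simplest of the four approaches, the sketch below computes a one-sided Fisher's exact test for over-representation of differentially expressed genes in a gene set, written as a hypergeometric upper tail. The function name, its parameters, and the example counts are hypothetical and are not taken from the paper. The sketch also assumes that a list of differentially expressed genes has already been chosen by some cutoff.

```python
from math import comb


def fisher_enrichment_pvalue(n_universe, n_set, n_de, n_overlap):
    """One-sided Fisher's exact test for over-representation.

    n_universe: number of genes measured (hypothetical name)
    n_set:      number of those genes that belong to the gene set
    n_de:       number of genes called differentially expressed
    n_overlap:  number of genes that are both in the set and called DE

    Returns P(overlap >= n_overlap) under the hypergeometric null, i.e.
    the DE genes are a random draw of size n_de from the universe.
    """
    max_overlap = min(n_set, n_de)
    total = comb(n_universe, n_de)
    # Sum the hypergeometric probabilities of every overlap at least
    # as large as the observed one.
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_de - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Illustrative counts only: 20 genes measured, a set of 5 genes,
# 5 genes called DE, and all 5 DE genes fall inside the set.
p_value = fisher_enrichment_pvalue(20, 5, 5, 5)
print(p_value)
```

The test sees only the two-by-two table of counts, so the magnitude of each gene's expression change is discarded once the cutoff has been applied.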
Results: The simulation study shows that none of the three methods consistently outperforms the others. GSEA and RS can detect weak signals of deregulation, but they perform differently when a gene set contains both up- and down-regulated genes. GLAPA is more conservative: it requires large differences between the two phenotypes to detect deregulation in a gene set. This is because the enrichment statistic in GLAPA is prediction error, which is a stronger criterion than the classical two-sample statistics used in RS and GSEA. The analysis of real data sets reflected this: GSEA and RS were significant for particular gene sets while GLAPA was not, suggesting a small effect size. We find that the ranking of gene sets induced by GLAPA is more similar to that of RS than to that of GSEA. More importantly, the rankings of the three methods share significant overlap.
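The contrast between an associative statistic and a predictive one can be sketched on toy data. The example below is not the GSEA, RS, or GLAPA algorithm. It assumes each sample has already been reduced to a single gene-set score (for instance the mean expression of the set's genes), and it compares a Welch two-sample t statistic with the leave-one-out error of a nearest-centroid classifier. All names and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def welch_t(a, b):
    """Associative view: two-sample t statistic for a difference in means."""
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(var_a / len(a) + var_b / len(b))


def loo_error(a, b):
    """Predictive view: leave-one-out error of a nearest-centroid rule.

    Each phenotype needs at least two samples so that a centroid can
    still be computed after one sample is held out.
    """
    samples = [(x, 0) for x in a] + [(x, 1) for x in b]
    errors = 0
    for i, (x, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        centroid_0 = mean([v for v, lab in rest if lab == 0])
        centroid_1 = mean([v for v, lab in rest if lab == 1])
        predicted = 0 if abs(x - centroid_0) <= abs(x - centroid_1) else 1
        errors += predicted != label
    return errors / len(samples)


# Toy gene-set scores per sample for two phenotypes (illustrative only).
overlapping_a = [0.0, 1.0, 2.0, 3.0]
overlapping_b = [1.0, 2.0, 3.0, 4.0]
separated_a = [0.0, 1.0, 2.0]
separated_b = [10.0, 11.0, 12.0]

print(welch_t(overlapping_a, overlapping_b), loo_error(overlapping_a, overlapping_b))
print(welch_t(separated_a, separated_b), loo_error(separated_a, separated_b))
```

With overlapping phenotypes, the mean difference can be nonzero while individual samples are still misclassified. A criterion based on prediction error therefore tends to demand a larger effect before it flags a gene set.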
Conclusion: The three methods considered in our study recover relevant gene sets known to be deregulated in the experimental conditions and pathologies analyzed. The methods differ, and GSEA seems to be more consistent in finding enriched gene sets, although no method uniformly dominates over all data sets. Our analysis highlights the deep difference between associative and predictive methods for detecting enrichment, and the value of using both to better interpret the results of pathway analysis. We close with suggestions for users of gene set methods.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2746222 | PMC |
| http://dx.doi.org/10.1186/1471-2105-10-275 | DOI Listing |
Int J Genomics
January 2025
Department of General Medicine, Chongqing University Central Hospital, Chongqing Emergency Medical Center, Chongqing Key Laboratory of Emergency Medicine, Chongqing, China.
() is associated with the development of various stomach diseases, one of the major risk factors for stomach adenocarcinoma (STAD). The infection score between tumor and normal groups was compared by single-sample gene set enrichment analysis (ssGSEA). The key modules related to infection were identified by weighted gene coexpression network analysis (WGCNA), and functional enrichment analysis was conducted on these module genes.
JACS Au
January 2025
CAS Key Laboratory of Tropical Marine Bio-resources and Ecology, Guangdong Key Laboratory of Marine Materia Medica, South China Sea Institute of Oceanology, Chinese Academy of Sciences, 164 West Xingang Road, Guangzhou 510301, China.
The rapid emergence of antimicrobial-resistant pathogenic microbes has accelerated the search for novel therapeutic agents. Here we report the discovery of antarmycin A (), an antibiotic containing a symmetric 16-membered macrodiolide core with two pendant vancosamine moieties, one of which is glucosylated, from deep-sea-derived SCSIO 07407. The biosynthetic gene cluster of was identified on a giant plasmid featuring transferable elements.
Front Mol Biosci
January 2025
Faculty of Biology and Biotechnologies, National Research University Higher School of Economics, Moscow, Russia.
Introduction: Colorectal cancer (CRC) is characterized by an extremely high mortality rate, mainly caused by the high metastatic potential of this type of cancer. To date, chemotherapy remains the backbone of the treatment of metastatic colorectal cancer. Three main chemotherapeutic drugs used for the treatment of metastatic colorectal cancer are 5-fluorouracil, oxaliplatin and irinotecan which is metabolized to an active compound SN-38.
J Glob Infect Dis
December 2024
Leônidas and Maria Deane Institute, Oswaldo Cruz Foundation, Manaus, Amazonas State, Brazil.
Introduction: The tools to distinguish relapse from reinfection are needed in malaria-endemic areas. We evaluated seroprevalence against sets of specific peptides to the block 2 region of -merozoite surface protein-1 (PvMSP1) to detect parasite clones.
Methods: We applied amplicon deep sequencing (ADS) of block 2 region of the MSP-1 gene () to determine cocirculating parasite clones within eight -infected individuals.
Hum Genomics
January 2025
Department of Biology, Tor Vergata University of Rome, Via della Ricerca Scientifica 1, 00133, Rome, Italy.
Background: The Immunoglobulin Heavy Chain (IGH) genomic region is responsible for the production of circulating antibodies and warrants careful investigation for its association with COVID-19 characteristics. Multiple allelic variants within and across different IGH gene segments form a limited set of haplotypes. Previous studies have shown associations between some of these haplotypes and clinical outcomes of COVID-19.