Background: Analysis of gene expression data in terms of a priori-defined gene sets has recently received significant attention as this approach typically yields more compact and interpretable results than those produced by traditional methods that rely on individual genes. The set-level strategy can also be adopted with similar benefits in predictive classification tasks accomplished with machine learning algorithms. Initial studies into the predictive performance of set-level classifiers have yielded rather controversial results. The goal of this study is to provide a more conclusive evaluation by testing various components of the set-level framework within a large collection of machine learning experiments.

Results: Genuine curated gene sets constitute better features for classification than sets assembled without biological relevance. For identifying the best gene sets for classification, the Global test outperforms the gene-set methods GSEA and SAM-GS as well as two generic feature selection methods. To aggregate expressions of genes into a feature value, the singular value decomposition (SVD) method as well as the SetSig technique improve on simple arithmetic averaging. Set-level classifiers learned with 10 features constituted by the Global test slightly outperform baseline gene-level classifiers learned with all original data features although they are slightly less accurate than gene-level classifiers learned with a prior feature-selection step.
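The difference between averaging and SVD-based aggregation can be illustrated with a minimal sketch. The abstract does not specify the exact SVD procedure, so the code below assumes one common formulation: each sample's set-level feature is its projection onto the first right singular vector of the centered expression submatrix for the set. The function names, the toy data, and the gene indices are hypothetical, and SetSig is not shown because the abstract does not describe it.

```python
import numpy as np

def aggregate_mean(expr, gene_idx):
    """Set-level feature: arithmetic average of the set's genes, per sample."""
    return expr[:, gene_idx].mean(axis=1)

def aggregate_svd(expr, gene_idx):
    """Set-level feature: projection of each sample onto the first right
    singular vector of the centered submatrix (an assumed formulation)."""
    sub = expr[:, gene_idx]
    centered = sub - sub.mean(axis=0)          # center each gene across samples
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]                    # one value per sample

# Toy data (hypothetical): 6 samples x 5 genes, one gene set of 3 genes.
rng = np.random.default_rng(0)
expr = rng.normal(size=(6, 5))
gene_set = [0, 2, 3]

mean_feature = aggregate_mean(expr, gene_set)
svd_feature = aggregate_svd(expr, gene_set)
```

Each gene set then contributes one column to the feature matrix passed to the classifier. Unlike the plain average, the SVD-based feature weights genes by how strongly they co-vary within the set, and its sign is arbitrary.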

Conclusion: Set-level classifiers do not boost predictive accuracy; however, they do achieve competitive accuracy if learned with the right combination of ingredients.

Availability: Open-source, publicly available software was used for classifier learning and testing. The gene expression datasets and the gene set database used are also publicly available. The full tabulation of experimental results is available at http://ida.felk.cvut.cz/CESLT.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3382436
DOI: http://dx.doi.org/10.1186/1471-2105-13-S10-S15
