XGSEA: CROSS-species gene set enrichment analysis via domain adaptation.

Brief Bioinform

School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, 710049, China.

Published: September 2021

Motivation: Gene set enrichment analysis (GSEA) has been widely used to identify gene sets with a statistically significant difference between cases and controls against a large gene set. GSEA needs both phenotype labels and gene expression. However, gene expression is assessed more often for model organisms than for minor species. Importantly, gene expression is also often not measured well under specific conditions in humans, because direct experiments such as non-approved treatments or gene knockouts carry high risk, and mouse data are then often substituted. Thus, predicting the enrichment significance (for a phenotype) of a given gene set of one species (the target, say human) by using gene expression measured under the same phenotype in another species (the source, say mouse) is a vital and challenging problem, which we call the CROSS-species gene set enrichment problem (XGSEP).

Results: For XGSEP, we propose the CROSS-species gene set enrichment analysis (XGSEA), which has three steps: (1) running GSEA for a source species to obtain enrichment scores and $p$-values of source gene sets; (2) representing the relation between source and target gene sets by domain adaptation; and (3) using regression to predict $p$-values of target gene sets, based on the representation in (2). We extensively validated the XGSEA by using five regression measures and one classification measure on four real data sets under various settings, showing that the XGSEA significantly outperformed three baseline methods in most cases. A case study of identifying important human pathways for T-cell dysfunction and reprogramming from mouse ATAC-Seq data further confirmed the reliability of the XGSEA.
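
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (see the repository under Availability for that). The unweighted running-sum score, the gene-permutation $p$-values, the ortholog-membership features, the per-domain mean centering used as a stand-in for domain adaptation, the ridge regression, and all function and parameter names are hypothetical choices made here for brevity.

```python
import numpy as np

def enrichment_score(ranked_genes, gene_set):
    """Unweighted KS-style running-sum score (a simplified GSEA statistic)."""
    in_set = np.array([g in gene_set for g in ranked_genes], dtype=float)
    n_hit = in_set.sum()
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running = np.cumsum(in_set / n_hit - (1.0 - in_set) / n_miss)
    return float(running[np.argmax(np.abs(running))])

def permutation_pvalue(ranked_genes, gene_set, n_perm=200, seed=0):
    """Empirical p-value from shuffling the gene ranking (assumption:
    gene permutation is used here instead of phenotype permutation)."""
    rng = np.random.default_rng(seed)
    observed = abs(enrichment_score(ranked_genes, gene_set))
    genes = np.array(ranked_genes)
    null = [abs(enrichment_score(rng.permutation(genes).tolist(), gene_set))
            for _ in range(n_perm)]
    return (1 + sum(s >= observed for s in null)) / (1 + n_perm)

def featurize(gene_sets, ortholog_group, n_groups):
    """Binary membership of each gene set over shared ortholog groups
    (hypothetical shared feature space for both species)."""
    X = np.zeros((len(gene_sets), n_groups))
    for i, genes in enumerate(gene_sets):
        for g in genes:
            if g in ortholog_group:
                X[i, ortholog_group[g]] = 1.0
    return X

def align_domains(X_source, X_target):
    """Center each domain on its own mean: a deliberately simple
    stand-in for the domain adaptation step."""
    return X_source - X_source.mean(axis=0), X_target - X_target.mean(axis=0)

def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Closed-form ridge regression with an intercept column
    (the intercept is regularized too, for simplicity)."""
    Xb = np.hstack([X_train, np.ones((len(X_train), 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    w = np.linalg.solve(A, Xb.T @ y_train)
    return np.hstack([X_test, np.ones((len(X_test), 1))]) @ w

def xgsea_sketch(ranked_source_genes, source_sets, target_sets,
                 ortholog_group, n_groups):
    # Step 1: source-species enrichment p-values.
    p_src = np.array([permutation_pvalue(ranked_source_genes, s)
                      for s in source_sets])
    y = -np.log10(p_src)
    # Step 2: shared representation of source and target gene sets.
    Xs, Xt = align_domains(featurize(source_sets, ortholog_group, n_groups),
                           featurize(target_sets, ortholog_group, n_groups))
    # Step 3: regress -log10(p) on the representation, then map back to p.
    pred = ridge_fit_predict(Xs, y, Xt)
    return np.clip(10.0 ** (-pred), 0.0, 1.0)
```

Regressing on $-\log_{10} p$ rather than on $p$ directly is a design choice of this sketch: it spreads out the small $p$-values that matter most, and the clipping at the end keeps the back-transformed predictions in $[0, 1]$.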

Availability: Source code of the XGSEA is available through https://github.com/LiminLi-xjtu/XGSEA.

Source
http://dx.doi.org/10.1093/bib/bbaa406
