GS2: an efficiently computable measure of GO-based similarity of gene sets.

Bioinformatics

Department of Computer Science, Rice University, 6100 Main Street, MS 132, Houston, TX, USA.

Published: May 2009

Motivation: The growing availability of genome-scale datasets has attracted increasing attention to the development of computational methods for automated inference of functional similarities among genes and their products. One class of such methods measures the functional similarity of genes based on their distance in the Gene Ontology (GO). To measure the functional relatedness of a gene set, these methods consider every pair of genes in the set and average all pairwise distances. However, as more data become available and the gene sets used for analysis grow larger, such pair-based calculation becomes prohibitively expensive.

Results: In this article, we propose GS2 (GO-based similarity of gene sets), a novel GO-based measure of gene set similarity that is computable in linear time in the size of the gene set. The measure quantifies the similarity of the GO annotations among a set of genes by averaging the contribution of each gene's GO terms and their ancestor terms with respect to the GO vocabulary graph. To study the performance of our method, we compared our measure with an established pair-based measure when run on gene sets with varying degrees of functional similarities. In addition to a significant speed improvement, our method produced comparable similarity scores to the established method. Our method is available as a web-based tool and an open-source Python library.
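The idea of scoring a gene set from per-term counts, rather than from gene pairs, can be illustrated with a minimal sketch. This is not the authors' implementation or exact formula, which the abstract does not give. It assumes a hypothetical `parents` mapping from each GO term to its parent terms and a hypothetical `gene_terms` mapping from each gene to its direct annotations, and it scores each gene by the mean fraction of genes in the set that share each of its terms (ancestors included).

```python
from collections import Counter


def ancestor_closure(terms, parents):
    """Return the given GO terms together with all of their ancestors."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(parents.get(term, ()))
    return seen


def set_similarity(gene_terms, parents):
    """Illustrative set-level score (an assumption, not the published formula).

    Each gene contributes the mean, over its terms and their ancestors, of the
    fraction of genes in the set annotated with that term. The set score is
    the average of these per-gene contributions.
    """
    closures = [ancestor_closure(t, parents) for t in gene_terms.values()]
    closures = [c for c in closures if c]  # ignore genes with no annotation
    n = len(closures)
    if n == 0:
        return 0.0
    # One pass over the genes: count how many genes carry each term.
    counts = Counter()
    for closure in closures:
        counts.update(closure)
    # Second pass: each gene is scored from the shared counts, not from pairs.
    per_gene = [sum(counts[t] for t in c) / (n * len(c)) for c in closures]
    return sum(per_gene) / n


# Toy ontology (hypothetical term names): root <- a <- a1, root <- b
toy_parents = {"a": ["root"], "b": ["root"], "a1": ["a"]}
identical = {"g1": {"a1"}, "g2": {"a1"}}
divergent = {"g1": {"a1"}, "g2": {"b"}}

score_identical = set_similarity(identical, toy_parents)
score_divergent = set_similarity(divergent, toy_parents)
```

Because the term counts are built in one pass and reused in a second, the work grows linearly with the number of genes (for a bounded number of annotations per gene), whereas a pair-based average must visit n(n-1)/2 gene pairs. In this sketch each gene counts itself, so the score never falls below 1/n; that is a property of the simplified formula used here.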

Availability: The web-based tools and Python code are available at: http://bioserver.cs.rice.edu/gs2.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2672633
DOI: http://dx.doi.org/10.1093/bioinformatics/btp128


