Motivation: The growing availability of genome-scale datasets has attracted increasing attention to the development of computational methods for automated inference of functional similarities among genes and their products. One class of such methods measures the functional similarity of genes based on their distance in the Gene Ontology (GO). To measure the functional relatedness of a gene set, these methods compute the distance between every pair of genes in the set and average the results. However, as more data become available and the gene sets used for analysis grow larger, such pair-based calculation becomes prohibitive, because the number of pairs grows quadratically with the size of the set.
Results: In this article, we propose GS² (GO-based similarity of gene sets), a novel GO-based measure of gene set similarity that is computable in time linear in the size of the gene set. The measure quantifies the similarity of the GO annotations among a set of genes by averaging the contribution of each gene's GO terms and their ancestor terms with respect to the GO vocabulary graph. To study the performance of our method, we compared it with an established pair-based measure on gene sets with varying degrees of functional similarity. In addition to running significantly faster, our method produced similarity scores comparable to those of the established method. Our method is available as a web-based tool and an open-source Python library.
Availability: The web-based tools and Python code are available at: http://bioserver.cs.rice.edu/gs2.
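The idea in the Results paragraph can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the authors' published formula and not the API of their Python library. The function names, the `parents` mapping, and the exact normalization (each term contributes the fraction of the *other* genes that share it) are assumptions made for illustration.

```python
from collections import Counter


def ancestor_closure(terms, parents):
    """Return the given GO terms together with all of their ancestors.

    `parents` is a hypothetical mapping from a term to its direct parent
    terms in the GO graph (a DAG).
    """
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(parents.get(term, ()))
    return seen


def gene_set_similarity(annotations, parents):
    """Score a gene set in [0, 1] without comparing genes pairwise.

    `annotations` maps each gene to its directly annotated GO terms.
    Assumed scoring: a term contributes the fraction of the other genes
    whose ancestor closure also contains it; contributions are averaged
    per gene, then across genes.
    """
    n = len(annotations)
    if n < 2:
        raise ValueError("need at least two genes")

    # One pass to expand each gene's terms to its ancestor closure.
    closures = {g: ancestor_closure(ts, parents) for g, ts in annotations.items()}

    # One pass to count how many genes carry each term.
    counts = Counter(t for closure in closures.values() for t in closure)

    # One pass to average the per-term contributions of each gene.
    total = 0.0
    for closure in closures.values():
        if closure:  # a gene with no annotations contributes 0
            total += sum((counts[t] - 1) / (n - 1) for t in closure) / len(closure)
    return total / n


if __name__ == "__main__":
    # Toy graph: b -> a -> root, c -> root (child -> parent).
    toy_parents = {"b": ["a"], "a": ["root"], "c": ["root"]}
    print(gene_set_similarity({"g1": ["b"], "g2": ["b"]}, toy_parents))
```

Each gene is visited a constant number of times, so for a bounded number of terms per gene the cost grows linearly with the number of genes, in contrast to the quadratic number of comparisons a pair-based average requires.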
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2672633 | PMC |
http://dx.doi.org/10.1093/bioinformatics/btp128 | DOI Listing |