A large-scale benchmark of gene prioritization methods.

Sci Rep

Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden.

Published: April 2017

To maximize the use of results from high-throughput experimental studies, e.g. GWAS, for identification and diagnostics of new disease-associated genes, it is important to have properly analyzed and benchmarked gene prioritization tools. Prospective benchmarks are underpowered to provide statistically significant results when differentiating the performance of gene prioritization tools, a strategy for retrospective benchmarking has been missing, and new tools usually provide only internal validations. The Gene Ontology (GO) contains genes clustered around annotation terms. This intrinsic property of GO can be used to construct robust benchmarks that are objective with respect to the problem domain. We demonstrate how this can be achieved for network-based gene prioritization tools, using the FunCoup network. We use cross-validation and a set of appropriate performance measures to compare state-of-the-art gene prioritization algorithms: three based on network diffusion (NetRank and two implementations of Random Walk with Restart), and MaxLink, which uses network neighborhood. Our benchmark suite provides a systematic and objective way to compare the multitude of available and future gene prioritization tools, enabling researchers to select the best tool for the task at hand and helping to guide the development of more accurate methods.
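
The benchmarking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy network stands in for FunCoup, the gene set stands in for a GO term, and the function names, the restart probability of 0.3, and the leave-one-out scheme are all assumptions chosen for illustration. The sketch scores genes with a basic Random Walk with Restart, hides one known gene at a time, and records the rank at which the hidden gene is recovered.

```python
def random_walk_with_restart(neighbors, seeds, restart=0.3, tol=1e-12, max_iter=10_000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    `neighbors` maps each node to a list of adjacent nodes (undirected network,
    every node assumed to have at least one neighbor). `seeds` is a set of nodes.
    """
    nodes = list(neighbors)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            # Each node passes its probability mass equally to its neighbors.
            share = (1.0 - restart) * p[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        if sum(abs(nxt[n] - p[n]) for n in nodes) < tol:
            return nxt
        p = nxt
    return p


def leave_one_out_ranks(neighbors, gene_set, restart=0.3):
    """Hide each gene of the set in turn and rank it among all non-seed genes."""
    ranks = {}
    for held_out in sorted(gene_set):
        seeds = gene_set - {held_out}
        scores = random_walk_with_restart(neighbors, seeds, restart)
        candidates = [n for n in neighbors if n not in seeds]
        ordered = sorted(candidates, key=lambda n: scores[n], reverse=True)
        ranks[held_out] = ordered.index(held_out) + 1  # 1 = top of the list
    return ranks


# Hypothetical toy network: two 4-gene cliques joined by the single edge D-E.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
         ("E", "F"), ("E", "G"), ("E", "H"), ("F", "G"), ("F", "H"), ("G", "H"),
         ("D", "E")]
network = {}
for u, v in edges:
    network.setdefault(u, []).append(v)
    network.setdefault(v, []).append(u)

go_term_genes = {"A", "B", "C", "D"}  # stand-in for genes annotated to one GO term
ranks = leave_one_out_ranks(network, go_term_genes)
print(ranks)
```

In this toy network every held-out gene is recovered at rank 1, because the annotated genes form a tight cluster. A real benchmark would repeat the procedure over many annotation terms and summarize the ranks with performance measures of the kind the abstract mentions.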

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5399445
DOI: http://dx.doi.org/10.1038/srep46598

Similar Publications

Cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL) is the most common hereditary cerebral small vessel disease caused by mutations in the NOTCH3 gene. This review highlights the increasing recognition of intracerebral hemorrhage (ICH) as a significant manifestation of CADASIL, often predominantly characterized by ischemic strokes and vascular dementia. Recent studies indicate that the prevalence of ICH in CADASIL patients ranges from 0.

In 2021, the Indian Undiagnosed Diseases Program was initiated for patients without a definite diagnosis despite extensive evaluation in four participating sites. Between February 2021 and March 2023, a total of 88 patients were recruited and underwent deep phenotyping. A uniform methodology for data re-analysis was implemented as the first step prior to conducting additional genomic testing.

Despite rapid advances in genomic sequencing, most rare genetic variants remain insufficiently characterized for clinical use, limiting the potential of personalized medicine. When classifying whether a variant is pathogenic, clinical labs adhere to diagnostic guidelines that comprehensively evaluate many forms of evidence including case data, computational predictions, and functional screening. While a substantial amount of clinical evidence has been developed for these variants, the majority cannot be definitively classified as 'pathogenic' or 'benign', and thus persist as 'Variants of Uncertain Significance' (VUS).

Genome-wide association studies (GWAS) of melanoma risk have identified 68 independent signals at 54 loci. For most loci, specific functional variants and their respective target genes remain to be established. Capture-HiC is an assay that links fine-mapped risk variants to candidate target genes by comprehensively mapping cell-type specific chromatin interactions.

Genomic characterization of Huntington's disease genetic modifiers informs drug target tractability.

Brain Commun

January 2025

Department of Pharmacology and Therapeutics, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada R3E 0T6.

Huntington's disease is caused by a CAG repeat in the gene. Repeat length correlates inversely with the age of onset but only explains part of the observed clinical variability. Genome-wide association studies highlight DNA repair genes in modifying disease onset, but further research is required to identify causal genes and evaluate their tractability as drug targets.
