AI Article Synopsis

  • The study focuses on identifying functional sequence variations in DNA regulatory regions, specifically in gene promoter areas.
  • It highlights that GC-content and DNA shape features predict functional variants better than conventional indicators such as histone modifications, transcription factor binding sites, or evolutionary conservation.
  • The researchers developed a machine learning classifier called ShapeGTB, which outperformed existing general-purpose tools at identifying functional single nucleotide polymorphisms in promoter regions. The results suggest that mutations in these regions have unique characteristics that could support more accurate variant prioritization methods in the future.

Article Abstract

Motivation: The identification of functional sequence variations in regulatory DNA regions is one of the major challenges of modern genetics. Here, we report the results of a combined multifactor analysis of the properties that characterize functional sequence variants located in gene promoter regions.

Results: We demonstrate that the GC-content of local sequence fragments and local DNA shape features play a significant role in the prioritization of functional variants and outscore features related to histone modifications, transcription factor binding sites, or evolutionary conservation descriptors. These observations allowed us to build ShapeGTB, a specialized machine learning classifier that identifies functional single nucleotide polymorphisms within promoter regions. We compared our method with more general tools that predict the pathogenicity of all non-coding variants. ShapeGTB outperformed them by a wide margin (average precision 0.93 vs. 0.47-0.55). On an external validation set based on the ClinVar database, it performed worse but remained competitive with the other methods (average precision 0.47 vs. 0.23-0.42). These results suggest that mutations located within promoter regions have unique characteristics, and they are a promising signal for the development of more accurate variant prioritization tools in the future.
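To make two quantities from the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It computes the GC-content of a local sequence fragment around a variant position and the average precision of a ranked list of predictions. The function names, the 0-based position convention, and the default window of 10 bases on each side are illustrative assumptions; the abstract does not state the window size or how ShapeGTB computes these values.

```python
def local_gc_content(sequence, position, flank=10):
    """Fraction of G/C bases in a window of +/- `flank` bases around `position`.

    `position` is 0-based; the window is clipped at the sequence ends.
    The window size is an assumption made for illustration only.
    """
    if not 0 <= position < len(sequence):
        raise ValueError("position lies outside the sequence")
    start = max(0, position - flank)
    end = min(len(sequence), position + flank + 1)
    window = sequence[start:end].upper()
    return sum(base in "GC" for base in window) / len(window)


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each true positive.

    `labels` holds 1 (functional) or 0 (non-functional); `scores` holds the
    classifier outputs. Tied scores keep their input order here, which may
    differ from how library implementations treat ties.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("at least one positive label is required")
    return precision_sum / hits


# Hypothetical toy inputs, not data from the study.
gc = local_gc_content("ATGCGCGCAT", position=4, flank=2)
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

Average precision rewards rankings that place functional variants ahead of non-functional ones, which is why it suits a prioritization task where true positives are comparatively rare.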

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6275119
DOI: http://dx.doi.org/10.7717/peerj.5742

Similar Publications

Motivation: Genotyping of bi-parental populations can be performed with low-coverage next-generation sequencing (LC-NGS). This allows the creation of highly saturated genetic maps at reasonable cost, precisely localized recombination breakpoints (i.e.

ABCA4 Deep Intronic Variants Contributed to Nearly Half of Unsolved Stargardt Cases With a Milder Phenotype.

Invest Ophthalmol Vis Sci

January 2025

State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, China.

Purpose: The purpose of this study was to investigate the contribution and natural progression of ABCA4 deep intronic variants (DIVs) among a Chinese Stargardt disease (STGD) cohort.

Methods: For unsolved STGD probands, DIVs in ABCA4 were detected by next-generation sequencing, and splicing effects were evaluated by in silico tools and validated through minigene experiments. Comprehensive ocular examinations, especially fundus changes, were carried out and analyzed.

Large vertebrate genomes duplicate by activating tens of thousands of DNA replication origins, irregularly spaced along the genome. The spatial and temporal regulation of the replication process is not yet fully understood. To investigate the DNA replication dynamics, we developed a methodology called RepliCorr, which uses the spatial correlation between replication patterns observed on stretched single-molecule DNA obtained by either DNA combing or high-throughput optical mapping.

Background: Glioblastoma is the commonest malignant brain tumor and has a very poor prognosis. Reduced expression of the MGMT gene (10q26.3), influenced primarily by the methylation of two differentially methylated regions (DMR1 and DMR2), is associated with a good response to temozolomide treatment.

Background: The risk of developing advanced neoplasia (AN; colorectal cancer and/or high-grade dysplasia) in ulcerative colitis (UC) patients with a low-grade dysplasia (LGD) lesion is variable and difficult to predict. This is a major challenge for effective clinical management.

Objective: We aimed to provide accurate AN risk stratification in UC patients with LGD.
