AI Article Synopsis

  • High-throughput sequencing is increasingly used to identify pathogenic mutations in genetic disease, but predicting the effects of mutations remains difficult. The authors previously developed SAAPdap and SAAPpred to predict whether a missense variant is pathogenic.
  • The study investigates whether the same methodology can distinguish between clinical phenotypes once a mutation has been predicted as pathogenic, specifically hypertrophic cardiomyopathy (HCM) and dilated cardiomyopathy (DCM) associated with mutations in the MYH7 gene.
  • A random forest predictor trained on rule-based structural analyses and structural clustering data achieved 75% accuracy (MCC = 0.53), rising to 79% (MCC = 0.61) after post hoc removal of poorly performing models. This proof of concept suggests that pathogenicity prediction methods can be extended to differential phenotype prediction.

Article Abstract

Motivation: High-throughput sequencing platforms are increasingly used to screen patients with genetic disease for pathogenic mutations, but prediction of the effects of mutations remains challenging. Previously we developed SAAPdap (Single Amino Acid Polymorphism Data Analysis Pipeline) and SAAPpred (Single Amino Acid Polymorphism Predictor) that use a combination of rule-based structural measures to predict whether a missense genetic variant is pathogenic. Here we investigate whether the same methodology can be used to develop a differential phenotype predictor, which, once a mutation has been predicted as pathogenic, is able to distinguish between phenotypes, in this case the two major clinical phenotypes (hypertrophic cardiomyopathy, HCM and dilated cardiomyopathy, DCM) associated with mutations in the beta-myosin heavy chain (MYH7) gene product (Myosin-7).

Results: A random forest predictor trained on rule-based structural analyses together with structural clustering data gave a Matthews' correlation coefficient (MCC) of 0.53 (accuracy, 75%). Post hoc removal of machine learning models that performed particularly badly increased the performance (MCC = 0.61; accuracy, 79%). This proof of concept suggests that methods used for pathogenicity prediction can be extended for use in differential phenotype prediction.
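
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how MCC and accuracy are computed from a binary confusion matrix. It is not the authors' code (their analyses used Perl, C and Weka). The function names, the choice of HCM as the positive class, and the counts are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero (the usual convention
    for an undefined denominator).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts, assuming HCM is treated as the positive class
# and DCM as the negative class. These are NOT the paper's data.
tp, tn, fp, fn = 40, 35, 15, 10

print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.2f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix and ranges from -1 to 1, so it is less easily inflated when one class is much more common than the other.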

Availability And Implementation: Analyses were implemented in Perl and C and used the Java-based Weka machine learning environment. Please contact the authors for availability.

Contacts: andrew@bioinf.org.uk or andrew.martin@ucl.ac.uk

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btw362
