Quantitative Missense Variant Effect Prediction Using Large-Scale Mutagenesis Data.

Cell Syst

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Department of Bioengineering, University of Washington, Seattle, WA 98195, USA.

Published: January 2018

Large datasets describing the quantitative effects of mutations on protein function are becoming increasingly available. Here, we leverage these datasets to develop Envision, which predicts the magnitude of a missense variant's molecular effect. Envision combines 21,026 variant effect measurements from nine large-scale experimental mutagenesis datasets, a hitherto untapped training resource, with a supervised, stochastic gradient boosting learning algorithm. Envision outperforms other missense variant effect predictors both on large-scale mutagenesis data and on an independent test dataset comprising 2,312 TP53 variants whose effects were measured using a low-throughput approach. This dataset was never used for hyperparameter tuning or model training and thus serves as an independent validation set. Envision prediction accuracy is also more consistent across amino acids than other predictors. Finally, we demonstrate that Envision's performance improves as more large-scale mutagenesis data are incorporated. We precompute Envision predictions for every possible single amino acid variant in human, mouse, frog, zebrafish, fruit fly, worm, and yeast proteomes (https://envision.gs.washington.edu/).
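As a rough illustration of the learning approach named in the abstract, the sketch below implements a minimal stochastic gradient boosting regressor: each round fits a small tree (here a one-split stump) to the residuals of a random subsample of the training data, then adds a shrunken copy of that tree to the ensemble. This is not Envision's implementation. The two features, the synthetic effect scores, and all function names and parameter values are assumptions made purely for illustration.

```python
import random

def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:
        # No feature varies in this subsample: fall back to a constant prediction.
        m = sum(residuals) / len(residuals)
        return (0, float("inf"), m, m)
    return best[1:]

def stump_predict(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm

def fit_sgb(X, y, n_rounds=50, learning_rate=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting for squared-error loss."""
    rng = random.Random(seed)
    base = sum(y) / len(y)                 # initial prediction: the mean label
    preds = [base] * len(y)
    stumps = []
    k = min(len(y), max(2, int(subsample * len(y))))
    for _ in range(n_rounds):
        idx = rng.sample(range(len(y)), k)  # the "stochastic" step: random subsample
        Xs = [X[i] for i in idx]
        rs = [y[i] - preds[i] for i in idx]  # residuals = negative gradient of squared error
        stump = fit_stump(Xs, rs)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(stump_predict(s, row) for s in stumps)

# Synthetic toy data (hypothetical): two made-up numeric features per variant
# and a made-up quantitative effect score. Not real mutagenesis measurements.
X = [[i / 20, (i * 7 % 20) / 20] for i in range(20)]
y = [1.0 if row[0] > 0.5 else 0.2 for row in X]

model = fit_sgb(X, y)
mean_y = sum(y) / len(y)
mse_baseline = sum((v - mean_y) ** 2 for v in y) / len(y)
mse_model = sum((predict(model, row) - v) ** 2 for row, v in zip(X, y)) / len(y)
```

Comparing `mse_model` with `mse_baseline` shows how much the boosted ensemble improves on always predicting the mean score. In practice a maintained library implementation with deeper trees and held-out evaluation would be the more sensible choice; the stump-based version here only exposes the mechanics.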

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5799033
DOI: http://dx.doi.org/10.1016/j.cels.2017.11.003
