Performing a complete deep mutational scan covering all single point mutations may not be practical, and may not even be necessary, especially if predictive computational models can be developed. Computational models are, however, naive to the cellular response under the myriad possible assay conditions. Within a realistic paradigm of assay context-aware predictive hybrid models, which combine minimal experimental data from deep mutational scans with structure, sequence information and computational models, we define and evaluate different strategies for choosing this minimal set. We evaluated the trivial strategy of systematically reducing the number of mutational studies from 85% to 15%, along with several strategies concerning the type of mutations chosen, such as random versus site-directed, at the same 15% data completeness. Interestingly, the predictive capabilities obtained by training on a random set of mutations and by training on a systematic substitution of all amino acids to alanine, asparagine and histidine (ANH) were comparable. Another strategy we explored, augmenting the training data with measurements of the same mutants under multiple assay conditions, did not improve the prediction quality. For the six proteins we analyzed, the bin-wise error in prediction is optimal when 50-100 mutations per bin are used in training the computational model, suggesting that good prediction quality may be achieved with a library of 500-1000 mutations.
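The two selection strategies compared above, a random subset versus a systematic ANH substitution, can be illustrated with a minimal sketch. This is not the authors' code: the toy sequence, the function names and the uniform sampling without replacement are all assumptions made for illustration. It shows that substituting every position with alanine, asparagine and histidine covers roughly 3 of the 19 possible substitutions per site, which is close to the 15% data completeness used for the random set.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def all_single_mutants(seq):
    """Enumerate every single substitution as (position, wild_type, mutant)."""
    return [
        (pos, wt, aa)
        for pos, wt in enumerate(seq, start=1)
        for aa in AMINO_ACIDS
        if aa != wt
    ]


def anh_scan(seq, targets="ANH"):
    """Systematic scan: mutate each position to A, N and H (skipping identity)."""
    return [
        (pos, wt, aa)
        for pos, wt in enumerate(seq, start=1)
        for aa in targets
        if aa != wt
    ]


def random_subset(mutants, fraction, seed=0):
    """Assumed random strategy: uniform sampling without replacement."""
    rng = random.Random(seed)
    k = round(len(mutants) * fraction)
    return rng.sample(mutants, k)


# Hypothetical toy sequence, used only to make the sketch runnable.
seq = "MKTAYIAKQRQISFVKSHFSRQ"

full = all_single_mutants(seq)
anh = anh_scan(seq)
rand = random_subset(full, fraction=0.15)

print(f"full scan: {len(full)} mutants")
print(f"ANH scan:  {len(anh)} mutants ({len(anh) / len(full):.1%} of full)")
print(f"random:    {len(rand)} mutants ({len(rand) / len(full):.1%} of full)")
```

Either subset would then serve as the training set for the predictive model, with the remaining mutants held out for evaluation.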

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6954071
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0227621

Similar Publications

Pathogenic activating mutations in the fibroblast growth factor receptor 3 (FGFR3) drive disease maintenance and progression in urothelial cancer. 10-15% of muscle-invasive and metastatic urothelial cancer (MIBC/mUC) are FGFR3-mutant. Selective targeting of FGFR3 hotspot mutations with tyrosine kinase inhibitors (e.

Background: The aim of this study is to develop deep learning models based on 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) images for predicting individual epidermal growth factor receptor (EGFR) mutation status in lung adenocarcinoma (LUAD).

Methods: We enrolled 430 patients with non-small-cell lung cancer from two institutions in this study. We used the advanced Inception V3 model to predict EGFR mutations from PET/CT images and developed CT, PET, and PET + CT models.

DYT-THAP1 dystonia is a monogenetic form of dystonia, a movement disorder characterized by the involuntary co-contraction of agonistic and antagonistic muscles. The disease is caused by mutations in the THAP1 gene, although the precise mechanisms by which these mutations contribute to the pathophysiology of dystonia remain unclear. The incomplete penetrance of DYT-THAP1 dystonia, estimated at 40 to 60%, suggests that an environmental trigger may be required for the manifestation of the disease in genetically predisposed individuals.

Digital PCR (dPCR) has transformed nucleic acid diagnostics by enabling the absolute quantification of rare mutations and target sequences. However, traditional dPCR detection methods, such as those involving flow cytometry and fluorescence imaging, may face challenges due to high costs, complexity, limited accuracy, and slow processing speeds. This study introduces SAM-dPCR, a training-free, open-source bioanalysis paradigm that offers swift and precise absolute quantification of biological samples.

Genetic improvement of low-lignin poplars: a new strategy based on molecular recognition, chemical reactions and empirical breeding.

Physiol Plant

December 2024

Laboratory of Tumor Targeted and Immune Therapy, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center for Biotherapy, Chengdu, China.

As an important source of pollution in the papermaking process, lignin in poplar can seriously affect the quality and process of pulping. During lignin synthesis, caffeoyl-CoA O-methyltransferase (CCoAOMT), a specialized catalytic transferase, can effectively regulate the methylation of caffeoyl-coenzyme A (CCoA) to feruloyl-coenzyme A. Targeting CCoAOMT, this study investigated the substrate recognition mechanism and the possible reaction mechanism; the key residues for lignin binding were mutated, and the lignin content was validated with a deep convolutional neural-network model based on genome-wide prediction (DCNGP).
