Large-scale design and refinement of stable proteins using sequence-only models.

PLoS One

Department of Electrical and Computer Engineering, University of Washington, Seattle, Washington, United States of America.

Published: April 2022

Engineered proteins generally must possess a stable structure in order to achieve their designed function. Stable designs, however, are astronomically rare within the space of all possible amino acid sequences. As a consequence, many designs must be tested computationally and experimentally in order to find stable ones, which is expensive in terms of time and resources. Here we use a high-throughput, low-fidelity assay to experimentally evaluate the stability of approximately 200,000 novel proteins. These include a wide range of sequence perturbations, providing a baseline for future work in the field. We build a neural network model that predicts protein stability given only sequences of amino acids, and compare its performance to the assayed values. We also report another network model that is able to generate the amino acid sequences of novel stable proteins given requested secondary sequences. Finally, we show that the predictive model, despite weaknesses including a noisy data set, can be used to substantially increase the stability of both expert-designed and model-generated proteins.

Download full-text PDF	Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8920274	PMC
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0265020	PLOS

