Proteins perform many essential functions in biological systems and can be successfully developed as bio-therapeutics. Predicting their properties from a proposed sequence and structure is therefore highly valuable. In this study, we developed a novel, generalizable deep learning framework, LM-GVP, composed of a protein Language Model (LM) and a Graph Neural Network (GNN), to leverage information from both the 1D amino acid sequences and the 3D structures of proteins. Our approach outperformed state-of-the-art protein LMs on a variety of property prediction tasks, including fluorescence, protease stability, and protein functions from Gene Ontology (GO). We also showed how a GNN prediction head can inform the fine-tuning of protein LMs so that they better leverage structural information. We envision that our deep learning framework will generalize to many protein property prediction problems and greatly accelerate protein engineering and drug development.
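
To make the sequence-plus-structure idea in the abstract concrete, the following is a minimal sketch, not the LM-GVP implementation. It assumes per-residue embeddings are already available (here they are placeholder numbers standing in for protein LM output), builds a k-nearest-neighbor graph from 3D residue coordinates, applies one round of plain mean-aggregation message passing (a simple stand-in for the GNN, not the architecture used in the paper), and pools to a single scalar property. All function names, the linear readout, and the choice of k are hypothetical.

```python
import math


def knn_edges(coords, k):
    """Return directed edges (i, j) linking each residue i to its k nearest residues j."""
    edges = []
    for i, ci in enumerate(coords):
        # Sort other residues by Euclidean distance (ties broken by index).
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges


def message_pass(node_feats, edges):
    """One round of mean aggregation: each node averages itself with its neighbors."""
    dim = len(node_feats[0])
    sums = [list(f) for f in node_feats]  # start from each node's own features
    counts = [1] * len(node_feats)
    for i, j in edges:
        for d in range(dim):
            sums[i][d] += node_feats[j][d]
        counts[i] += 1
    return [[s / c for s in row] for row, c in zip(sums, counts)]


def predict_property(embeddings, coords, weights, bias=0.0, k=2):
    """Hypothetical readout: graph from structure, features from sequence embeddings."""
    edges = knn_edges(coords, k)
    hidden = message_pass(embeddings, edges)
    n = len(hidden)
    # Mean-pool residue features into one protein-level vector.
    pooled = [sum(h[d] for h in hidden) / n for d in range(len(weights))]
    # Linear readout to a scalar property (an assumption for illustration).
    return sum(w * p for w, p in zip(weights, pooled)) + bias


if __name__ == "__main__":
    # Placeholder embeddings and coordinates, not real protein data.
    emb = [[1.0], [3.0]]
    xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    print(predict_property(emb, xyz, weights=[2.0], bias=0.5, k=1))
```

In a trainable version of this sketch, the embeddings would come from a protein LM and gradients from the graph-based head would flow back into it, which is the kind of joint fine-tuning the abstract refers to.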

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9046255
DOI: http://dx.doi.org/10.1038/s41598-022-10775-y
