Motivation: High-confidence structure prediction models have become available for nearly all protein sequences. More than 200 million AlphaFold2 models are now publicly available. We observe that prediction confidence, as judged by plDDT scores, can vary significantly across a protein family. We have explored whether the predictions with lower plDDT in a family can be improved by using higher-plDDT models from the same family as template structures in AlphaFold2.

Results: Our work shows that about one-third of the time, structures with a low plDDT can be "rescued", that is, moved from low to reasonable confidence. Surprisingly, in many cases we also obtain a higher-plDDT model when we switch off the multiple sequence alignment (MSA) option in AlphaFold2 and rely solely on a high-quality template. However, we find that the best overall strategy is to make predictions both with and without the MSA information and select the model with the highest average plDDT. We also find that using high-plDDT models as templates can increase the speed of AlphaFold2 as implemented in ColabFold. Additionally, we attempt to demonstrate that, as well as having increased overall plDDT, the models are likely to have higher-quality structures as judged by two metrics.
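
The selection step described above (predict with and without the MSA, then keep the model with the highest average plDDT) could be sketched roughly as follows. This is an illustrative sketch, not the AF2Fix pipeline code: the function names `mean_plddt` and `pick_best_model` are hypothetical, it assumes the models are PDB-format text with the per-residue plDDT stored in the B-factor column (as AlphaFold2 and ColabFold write it), and averaging over C-alpha atoms only is an assumption about how "average plDDT" is computed.

```python
from __future__ import annotations

from statistics import mean


def mean_plddt(pdb_text: str) -> float:
    """Average the B-factor column over C-alpha atoms of a PDB-format model.

    Assumption: the B-factor field (columns 61-66) holds the per-residue
    plDDT, which is how AlphaFold2/ColabFold PDB output stores it.
    """
    scores = [
        float(line[60:66])
        for line in pdb_text.splitlines()
        # Fixed-width PDB columns: atom name is in columns 13-16.
        if line.startswith("ATOM") and line[12:16].strip() == "CA"
    ]
    if not scores:
        raise ValueError("no C-alpha ATOM records found")
    return mean(scores)


def pick_best_model(models: dict[str, str]) -> tuple[str, float]:
    """Return the label and mean plDDT of the highest-scoring model.

    `models` maps a hypothetical run label (for example "with_msa" or
    "template_only") to the PDB text produced by that run.
    """
    scored = {label: mean_plddt(text) for label, text in models.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]
```

In use, one would read the PDB file from each run into a string, pass both to `pick_best_model` under labels of one's choosing, and keep the returned model for downstream analysis.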

Availability and implementation: We have implemented our pipeline in Nextflow, and it is available on GitHub: https://github.com/FranceCosta/AF2Fix.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11630841
DOI: http://dx.doi.org/10.1093/bioadv/vbae188
