Quantification of biases in predictions of protein-protein binding affinity changes upon mutations.

Brief Bioinform

Computational Biology and Bioinformatics, Université Libre de Bruxelles, Roosevelt Ave, 1050, Brussels, Belgium.

Published: November 2023

AI Article Synopsis

  • Understanding how mutations change protein-protein binding affinity is crucial for biotechnology and for understanding disease, and many computational prediction methods have been developed for this purpose over the last decade.
  • Many of these methods claim high accuracy, but a benchmark of eight predictors reveals significant biases and limited generalizability, particularly on unseen mutations.
  • The findings suggest that physics-based methods generalize better than pure machine learning approaches, but that prediction models still need to become more reliable and should be checked for biases, especially toward destabilizing mutations.

Article Abstract

Understanding the impact of mutations on protein-protein binding affinity is a key objective for a wide range of biotechnological applications and for shedding light on disease-causing mutations, which are often located at protein-protein interfaces. Over the past decade, many computational methods using physics-based and/or machine learning approaches have been developed to predict how protein binding affinity changes upon mutations. They all claim to achieve astonishing accuracy on both training and test sets, with performances on standard benchmarks such as SKEMPI 2.0 that seem overly optimistic. Here we benchmarked eight well-known and well-used predictors and identified their biases and dataset dependencies, using not only SKEMPI 2.0 as a test set but also deep mutagenesis data on the severe acute respiratory syndrome coronavirus 2 spike protein in complex with the human angiotensin-converting enzyme 2. We showed that, even though most of the tested methods reach a significant degree of robustness and accuracy, they suffer from limited generalizability properties and struggle to predict unseen mutations. Interestingly, the generalizability problems are more severe for pure machine learning approaches, while physics-based methods are less affected by this issue. Moreover, undesirable prediction biases toward specific mutation properties, the most marked being toward destabilizing mutations, are also observed and should be carefully considered by method developers. We conclude from our analyses that there is room for improvement in the prediction models and suggest ways to check, assess and improve their generalizability and robustness.
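As an illustration of the kind of check the abstract refers to, the sketch below compares predicted and experimental binding affinity changes (ΔΔG) and reports the error separately for destabilizing and stabilizing mutations. It is a minimal sketch and not the authors' protocol: the sign convention (ΔΔG > 0 means destabilizing), the function names and the toy values are all assumptions made for the example, and the numbers are not taken from SKEMPI 2.0 or from the paper.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rmse(experimental, predicted):
    """Root mean square error between experimental and predicted values."""
    return sqrt(mean([(p - e) ** 2 for e, p in zip(experimental, predicted)]))


def mean_signed_error(experimental, predicted):
    """Mean of (predicted - experimental); positive means overestimated ddG."""
    return mean([p - e for e, p in zip(experimental, predicted)])


def bias_report(experimental, predicted):
    """Summarise errors overall and per mutation class.

    Assumed convention (not from the source): ddG > 0 is destabilizing,
    ddG < 0 is stabilizing. Mutations with ddG == 0 only enter "overall".
    """
    pairs = list(zip(experimental, predicted))
    subsets = {
        "overall": pairs,
        "destabilizing": [(e, p) for e, p in pairs if e > 0],
        "stabilizing": [(e, p) for e, p in pairs if e < 0],
    }
    report = {}
    for name, subset in subsets.items():
        exp = [e for e, _ in subset]
        pred = [p for _, p in subset]
        report[name] = {
            "n": len(subset),
            "rmse": rmse(exp, pred),
            "mean_signed_error": mean_signed_error(exp, pred),
        }
    report["overall"]["pearson"] = pearson(experimental, predicted)
    return report


# Toy, made-up values in kcal/mol, for illustration only.
experimental = [1.0, 2.0, 0.5, -1.0, -0.5]
predicted = [1.2, 1.8, 0.7, 0.4, 0.3]

report = bias_report(experimental, predicted)
for name, stats in report.items():
    print(name, stats)
```

Under the assumed sign convention, a predictor biased toward destabilizing mutations would show a clearly positive mean signed error on the stabilizing subset while looking well calibrated on the destabilizing one, which is the pattern the toy values are built to show.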

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10777193
DOI: http://dx.doi.org/10.1093/bib/bbad491

Publication Analysis

Top Keywords

binding affinity: 12
protein-protein binding: 8
affinity changes: 8
changes mutations: 8
machine learning: 8
learning approaches: 8
mutations: 6
quantification biases: 4
biases predictions: 4
predictions protein-protein: 4
