Motivation: Machine learning-based scoring functions (MLBSFs) have been found to exhibit inconsistent performance on different benchmarks and be prone to learning dataset bias. For the field to develop MLBSFs that learn a generalisable understanding of physics, a more rigorous understanding of how they perform is required.
Results: In this work, we compared, on a range of benchmarks, the performance of a diverse set of popular MLBSFs (RFScore, SIGN, OnionNet-2, Pafnucy, and PointVS) with that of our proposed baseline models, which can only learn dataset biases.
Therapeutic antibodies are manufactured, stored and administered in the free state; this makes understanding the unbound form key to designing and improving development pipelines. Prediction of unbound antibodies is challenging, specifically modelling of the CDRH3 loop, where inaccuracies are potentially worse due to a bias in structural data towards antibody-antigen complexes. This class imbalance provides a challenge for deep learning models trained on this data, potentially limiting generalisation to unbound forms.
Current strategies centred on either merging or linking initial hits from fragment-based drug design (FBDD) crystallographic screens generally do not fully leverage 3D structural information. We show that an algorithmic approach (Fragmenstein) that 'stitches' the ligand atoms from this structural information together can provide more accurate and reliable predictions for protein-ligand complex conformation than general methods such as pharmacophore-constrained docking. This approach works under the assumption of conserved binding: when a larger molecule is designed containing the initial fragment hit, the common substructure between the two will adopt the same binding mode.
Key functions of antibodies, such as viral neutralisation, depend on high-affinity binding. However, viral neutralisation poorly correlates with antigen affinity for reasons that have been unclear. Here, we use a new mechanistic model of bivalent binding to study >45 patient-isolated IgG1 antibodies interacting with SARS-CoV-2 RBD surfaces.
We introduce , an antibody variable domain diffusion model based on a general protein backbone diffusion framework, which was extended to handle multiple chains. Assessing the designability and novelty of the structures generated with our model, we find that produces highly designable antibodies that can contain novel binding regions. The backbone dihedral angles of sampled structures show good agreement with a reference antibody distribution.