Publications by authors named "Iain Moal"

Motivation: The versatile binding properties of antibodies have made them an extremely important class of biotherapeutics. However, therapeutic antibody development is a complex, expensive, and time-consuming task, with the final antibody needing not only to bind strongly and specifically but also to be minimally affected by developability issues. The success of transformer-based language models in protein sequence space, together with the availability of vast amounts of antibody sequence data, has led to the development of many antibody-specific language models to help guide antibody design.

Antibodies with similar amino acid sequences, especially across their complementarity-determining regions, often share properties. Finding that an antibody of interest has a similar sequence to naturally expressed antibodies in healthy or diseased repertoires is a powerful approach for the prediction of antibody properties, such as immunogenicity or antigen specificity. However, as the number of available antibody sequences is now in the billions and continuing to grow, repertoire mining for similar sequences has become increasingly computationally expensive.
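
To make the cost of repertoire mining concrete, here is a minimal brute-force sketch, not the method of the paper above. It assumes similarity is measured as plain sequence identity between equal-length CDR sequences; the function names, the 0.8 threshold, and the example sequences are all hypothetical. Every query is compared against every repertoire sequence, so the work grows linearly with repertoire size, which is what becomes prohibitive at billions of sequences.

```python
def sequence_identity(a, b):
    """Fraction of identical positions; defined here only for equal-length sequences."""
    if not a or len(a) != len(b):
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def find_similar(query, repertoire, threshold=0.8):
    """Linear scan: score every repertoire sequence and keep those at or above the threshold."""
    hits = [(seq, sequence_identity(query, seq)) for seq in repertoire]
    hits = [hit for hit in hits if hit[1] >= threshold]
    # Most similar sequences first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical CDR-like sequences, for illustration only.
query = "ARDYYGSSYFDY"
repertoire = ["AKDLLGTTWFAV", "ARDYYGSGYFDY", "ARDY", "ARDYYGSSYFDY"]
similar = find_similar(query, repertoire)
```

A practical tool would replace the linear scan with some form of indexing or pre-filtering; this sketch only illustrates the baseline being improved upon.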

Motivation: General protein language models have been shown to summarize the semantics of protein sequences into representations that are useful for state-of-the-art predictive methods. However, for antibody-specific problems, such as restoring residues lost due to sequencing errors, a model trained solely on antibodies may be more powerful. Antibodies are one of the few protein types where the volume of sequence data needed for such language models is available, e.

Summary: The development of new vaccines and antibody therapeutics typically takes several years and requires over $1bn in investment. Accurate knowledge of the paratope (antibody binding site) can speed up and reduce the cost of this process by improving our understanding of antibody-antigen binding. We present Paragraph, a structure-based paratope prediction tool that outperforms current state-of-the-art tools using simpler feature vectors and no antigen information.

Accurate predictive modeling of antibody-antigen complex structures and structure-based antibody design remain major challenges in computational biology, with implications for biotherapeutics, immunity, and vaccines. Through a systematic search for high-resolution structures of antibody-antigen complexes and unbound antibody and antigen structures, in conjunction with identification of experimentally determined binding affinities, we have assembled a non-redundant set of test cases for antibody-antigen docking and affinity prediction. This benchmark more than doubles the number of antibody-antigen complexes and corresponding affinities available in our previous benchmarks, providing an unprecedented view of the determinants of antibody recognition and insights into molecular flexibility.

Many of the biological functions of the cell are driven by protein-protein interactions. However, determining which proteins interact, and exactly how they do so to enable their functions, remains a major research question. Functional interactions depend on a number of complicated factors; therefore, modeling the three-dimensional structure of protein-protein complexes is still considered a complex endeavor.

Article Synopsis
  • The CAPRI Round 46 involved 20 protein assembly targets, blending 14 homo-oligomers with 6 heterocomplexes, highlighting challenges in modeling.
  • A significant number of models (~2000 per target) were submitted by about 30 teams, with better performance seen in easier targets but struggles with complex compositions, as evidenced by only 3 out of 11 difficult targets yielding medium to high-quality models.
  • Analysis revealed a decline in prediction quality for binding interface residues compared to previous rounds, pointing to areas needing improvement for future challenges.
Article Synopsis
  • Researchers developed a new model for amino acid sequence evolution that incorporates protein structure, which is often overlooked despite its importance.
  • This "structurally aware" model uses an expanded alphabet to describe amino acids along with their side-chain configurations, taking into account geometric patterns and dihedral angles.
  • The new model outperforms traditional models in estimating evolutionary divergence and reconstructing ancestral states, highlighting the significance of side-chain geometry for understanding protein folding and function in evolutionary biology.

Motivation: Understanding the relationship between the sequence, structure, binding energy, binding kinetics and binding thermodynamics of protein-protein interactions is crucial to understanding cellular signaling, the assembly and regulation of molecular complexes, the mechanisms through which mutations lead to disease, and protein engineering.

Results: We present SKEMPI 2.0, a major update to our database of binding free energy changes upon mutation for structurally resolved protein-protein interactions.
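
As a reminder of what a binding free energy change upon mutation means numerically, the sketch below uses the standard thermodynamic relation ΔG = RT ln(Kd), with Kd relative to a 1 M standard state, and takes ΔΔG as the mutant value minus the wild-type value. The function names, the default temperature, and the sign convention are assumptions made for illustration; the excerpt does not state how SKEMPI 2.0 stores or reports these quantities.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from a dissociation constant: dG = RT ln(Kd)."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def ddg_upon_mutation(kd_wild_type, kd_mutant, temperature_k=298.15):
    """ddG = dG(mutant) - dG(wild type); positive values mean weaker binding (assumed convention)."""
    return (binding_free_energy(kd_mutant, temperature_k)
            - binding_free_energy(kd_wild_type, temperature_k))


# Hypothetical example: a mutation that weakens affinity tenfold, from 1 nM to 10 nM.
ddg = ddg_upon_mutation(kd_wild_type=1e-9, kd_mutant=1e-8)
```

Under these assumptions, each tenfold loss of affinity at 25 °C corresponds to roughly 1.4 kcal/mol.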

The atomic structures of protein complexes can provide useful information for drug design, protein engineering, systems biology, and understanding pathology. Obtaining this information experimentally can be challenging. However, if the structures of the subunits are known, then it is often possible to model the complex computationally.

Protein-protein interactions play fundamental roles in biological processes including signaling, metabolism, and trafficking. While the structure of a protein complex reveals crucial details about the interaction, it is often difficult to acquire this information experimentally. As the number of interactions discovered increases faster than they can be characterized, protein-protein docking calculations may be able to reduce this disparity by providing models of the interacting proteins.

Many proteins can adopt multiple distinct conformational states which often play different functional roles. Previous studies have shown that the underlying global dynamics through which these states are accessed are, at least in part, encoded by the protein's topology. In this work we present a method for generating transition pathways between states by perturbing the protein toward a target conformational state along thermally accessible collective motions calculated from the starting conformation.

Motivation: In order to function, proteins frequently bind to one another and form 3D assemblies. Knowledge of the atomic details of these structures helps us understand how proteins work together and how mutations can lead to disease, and facilitates the design of drugs that prevent or mimic the interaction.

Results: Atomic modeling of protein-protein interactions requires the selection of near-native structures from a set of docked poses based on their calculable properties.

Reliable identification of near-native poses of docked protein-protein complexes is still an unsolved problem. The intrinsic heterogeneity of protein-protein interactions is challenging for traditional biophysical or knowledge-based potentials, and the identification of many false positive binding sites is not unusual. Often, ranking protocols are based on initial clustering of docked poses followed by the application of an energy function to rank each cluster according to its lowest-energy member.
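
The conventional ranking protocol described above can be sketched as follows. This is a minimal illustration only: it assumes the clustering step has already assigned a cluster label to each pose and that an energy has already been computed by some scoring function, with lower values being better. The `Pose` type, the `rank_clusters` function, and the example values are hypothetical.

```python
from typing import NamedTuple


class Pose(NamedTuple):
    pose_id: str
    cluster: int   # label from a prior clustering step (assumed given)
    energy: float  # score from some energy function; lower is better


def rank_clusters(poses):
    """Return (cluster, best_pose) pairs ordered by each cluster's lowest-energy member."""
    best = {}
    for pose in poses:
        current = best.get(pose.cluster)
        if current is None or pose.energy < current.energy:
            best[pose.cluster] = pose
    # Clusters are ranked by the energy of their representative pose.
    return sorted(best.items(), key=lambda item: item[1].energy)


# Hypothetical docked poses with made-up energies.
poses = [
    Pose("p1", 0, -12.5),
    Pose("p2", 0, -30.1),
    Pose("p3", 1, -25.0),
    Pose("p4", 2, -31.4),
    Pose("p5", 1, -8.2),
]
ranking = rank_clusters(poses)
```

Note that a cluster's rank depends on a single member, so one spuriously low-energy pose can promote an otherwise poor cluster.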

The sixth CAPRI edition included new modeling challenges, such as the prediction of protein-peptide complexes, and the modeling of homo-oligomers and domain-domain interactions as part of the first joint CASP-CAPRI experiment. Other non-standard targets included the prediction of interfacial water positions and the modeling of the interactions between proteins and nucleic acids. We have participated in all proposed targets of this CAPRI edition both as predictors and as scorers, with new protocols to efficiently use our docking and scoring scheme pyDock in a large variety of scenarios.

We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014. The targets included mostly homodimers, a few homotetramers, and two heterodimers, and comprised protein chains that could readily be modeled using templates from the Protein Data Bank.

We present an updated and integrated version of our widely used protein-protein docking and binding affinity benchmarks. The benchmarks consist of non-redundant, high-quality structures of protein-protein complexes along with the unbound structures of their components. Fifty-five new complexes were added to the docking benchmark, 35 of which have experimentally measured binding affinities.

Mutations at protein-protein recognition sites alter binding strength by altering the chemical nature of the interacting surfaces. We present a simple surface energy model, parameterized with empirical ΔΔG values, yielding mean energies of -48 cal mol⁻¹ Å⁻² for interactions between hydrophobic surfaces, -51 to -80 cal mol⁻¹ Å⁻² for surfaces of complementary charge, and 66-83 cal mol⁻¹ Å⁻² for electrostatically repelling surfaces, relative to the aqueous phase. This places the mean energy of hydrophobic surface burial at -24 cal mol⁻¹ Å⁻².

σ⁵⁴-dependent transcription controls a wide range of stress-related genes in bacteria and is tightly regulated. In contrast to σ⁷⁰, the σ⁵⁴-RNA polymerase holoenzyme forms a stable closed complex at the promoter site that rarely isomerises into transcriptionally competent open complexes. The conversion into open complexes requires the ATPase activity of activator proteins that bind remotely upstream of the transcriptional start site.

Summary: The atomic structures of protein-protein interactions are central to understanding their role in biological systems, and a wide variety of biophysical functions and potentials have been developed for their characterization and the construction of predictive models. These tools are scattered across a multitude of stand-alone programs, and are often available only as model parameters requiring reimplementation. This acts as a significant barrier to their widespread adoption.

In the next-generation sequencing era, we are encountering hundreds of thousands of sequences from specific organisms. Such massive data must be accurately classified both functionally and structurally. Identifying sequences with a specific function from next-generation sequencing data, however, is a daunting experimental task.

Article Synopsis
  • The study assesses the accuracy of predictions regarding the positions of water molecules at protein-protein interfaces, as part of the CAPRI experiment.
  • Out of 20 groups that provided 195 models, only 44% of high- or medium-quality docking models had a significant recall fraction for water-mediated contacts, highlighting challenges in accurately predicting these positions.
  • However, some predictions were notably successful, particularly regarding important hotspot water positions, suggesting that high-quality protein modeling and advanced computational techniques can enhance the accuracy of such predictions.

Background: Protein-protein docking, which aims to predict the structure of a protein-protein complex from its unbound components, remains an unresolved challenge in structural bioinformatics. An important step is the ranking of docked poses using a scoring function, for which many methods have been developed. There is a need to explore the differences and commonalities of these methods with each other, as well as with functions developed in the fields of molecular dynamics and homology modelling.

Predicting the effects of mutations on the kinetic rate constants of protein-protein interactions is central to both the modeling of complex diseases and the design of effective peptide drug inhibitors. However, while most studies have concentrated on the determination of association rate constants, dissociation rates have received less attention. In this work we take a novel approach by relating the changes in dissociation rates upon mutation to the energetics and architecture of hotspots and hotregions, by performing alanine scans pre- and post-mutation.
