Publications by authors named "Charlotte Deane"

Motivation: Machine learning-based scoring functions (MLBSFs) have been found to exhibit inconsistent performance on different benchmarks and be prone to learning dataset bias. For the field to develop MLBSFs that learn a generalisable understanding of physics, a more rigorous understanding of how they perform is required.

Results: In this work, we compared, on a range of benchmarks, the performance of a diverse set of popular MLBSFs (RFScore, SIGN, OnionNet-2, Pafnucy, and PointVS) with that of our proposed baseline models, which can only learn dataset biases.

Therapeutic antibodies are manufactured, stored and administered in the free state; this makes understanding the unbound form key to designing and improving development pipelines. Prediction of unbound antibodies is challenging, specifically modelling of the CDRH3 loop, where inaccuracies are potentially worse due to a bias in structural data towards antibody-antigen complexes. This class imbalance provides a challenge for deep learning models trained on this data, potentially limiting generalisation to unbound forms.

Current strategies centred on either merging or linking initial hits from fragment-based drug design (FBDD) crystallographic screens generally do not fully leverage 3D structural information. We show that an algorithmic approach (Fragmenstein) that 'stitches' the ligand atoms from this structural information together can provide more accurate and reliable predictions for protein-ligand complex conformation than general methods such as pharmacophore-constrained docking. This approach works under the assumption of conserved binding: when a larger molecule is designed containing the initial fragment hit, the common substructure between the two will adopt the same binding mode.

Key functions of antibodies, such as viral neutralisation, depend on high-affinity binding. However, viral neutralisation poorly correlates with antigen affinity for reasons that have been unclear. Here, we use a new mechanistic model of bivalent binding to study >45 patient-isolated IgG1 antibodies interacting with SARS-CoV-2 RBD surfaces.

We introduce an antibody variable domain diffusion model based on a general protein backbone diffusion framework, which was extended to handle multiple chains. Assessing the designability and novelty of the structures generated with our model, we find that it produces highly designable antibodies that can contain novel binding regions. The backbone dihedral angles of sampled structures show good agreement with a reference antibody distribution.

Background: T cells form one of the key pillars of adaptive immunity. Using their surface bound T cell antigen receptors (TCRs), these cells screen millions of antigens presented by major histocompatibility complex (MHC) or MHC-like molecules. In other protein families, the dynamics of protein-protein interactions have important implications for protein function.

Antibodies are proteins produced by the immune system that can identify and neutralise a wide variety of antigens with high specificity and affinity, and constitute the most successful class of biotherapeutics. With the advent of next-generation sequencing, billions of antibody sequences have been collected in recent years, though their application in the design of better therapeutics has been constrained by the sheer volume and complexity of the data. To address this challenge, we present IgBert and IgT5, the best performing antibody-specific language models developed to date which can consistently handle both paired and unpaired variable region sequences as input.

Antibodies are a popular and powerful class of therapeutic due to their ability to exhibit high affinity and specificity to target proteins. However, the majority of antibody therapeutics are not genetically human, with initial therapeutic designs typically obtained from animal models. Humanization of these precursors is essential to reduce immunogenic risks when administered to humans.

Summary: A key challenge in antibody drug discovery is designing novel sequences that are free from developability issues such as aggregation, polyspecificity, poor expression, or low solubility. Here, we present p-IgGen, a protein language model for paired heavy-light chain antibody generation. The model generates diverse, antibody-like sequences with pairing properties found in natural antibodies.

Motivation: The versatile binding properties of antibodies have made them an extremely important class of biotherapeutics. However, therapeutic antibody development is a complex, expensive, and time-consuming task, with the final antibody needing to not only have strong and specific binding but also be minimally impacted by developability issues. The success of transformer-based language models in protein sequence space and the availability of vast amounts of antibody sequences have led to the development of many antibody-specific language models to help guide antibody design.

Many studies have prophesied that the integration of machine learning techniques into small-molecule therapeutics development will help to deliver a true leap forward in drug discovery. However, increasingly advanced algorithms and novel architectures have not always yielded substantial improvements in results. In this Perspective, we propose that a greater focus on the data for training and benchmarking these models is more likely to drive future improvement, and explore avenues for future research and strategies to address these data challenges.

Nanobodies are essential proteins of the adaptive immune systems of camelid and shark species, complementing conventional antibodies. Properties such as their relatively small size, solubility and high thermostability make VHH (variable heavy domain of the heavy chain) and VNAR (variable new antigen receptor) modalities a promising therapeutic format and a valuable resource for a wide range of biological applications. The volume of academic literature and patents related to nanobodies has risen significantly over the past decade.

Engineered antibody formats, such as antibody fragments and bispecifics, have the potential to offer improved therapeutic efficacy compared to traditional full-length monoclonal antibodies (mAbs). However, the translation of these non-natural molecules into successful therapeutics can be hampered by developability challenges. Here, we systematically analyzed 64 different antibody constructs targeting Tumor Necrosis Factor (TNF) which cover 8 distinct molecular format families, encompassing full-length antibodies, various types of single chain variable fragments, and bispecifics.

Summary: In this article, we introduce ABodyBuilder3, an improved and scalable antibody structure prediction model based on ABodyBuilder2. We achieve a new state-of-the-art accuracy in the modelling of CDR loops by leveraging language model embeddings, and show how predicted structures can be further improved through careful relaxation strategies. Finally, we incorporate a predicted Local Distance Difference Test into the model output to allow for a more accurate estimation of uncertainties.

Motivation: Antibody-antigen complex modelling is an important step in computational workflows for therapeutic antibody design. While experimentally determined structures of both antibody and the cognate antigen are often not available, recent advances in machine learning-driven protein modelling have enabled accurate prediction of both antibody and antigen structures. Here, we analyse the ability of protein-protein docking tools to use machine learning generated input structures for information-driven docking.

Recent breakthroughs in protein structure prediction have enhanced the precision and speed at which protein configurations can be determined. Additionally, molecular dynamics (MD) simulations serve as a crucial tool for capturing the conformational space of proteins, providing valuable insights into their structural fluctuations. However, the scope of MD simulations is often limited by the accessible timescales and the computational resources available, posing challenges to comprehensively exploring protein behaviors.

T cell activation is governed through T cell receptors (TCRs), heterodimers of two sequence-variable chains (often an α and β chain) that synergistically recognize antigen fragments presented on cell surfaces. Despite this, existing repositories are dedicated to collecting only single-chain, not paired-chain, TCR sequence data. We addressed this gap by creating the Observed TCR Space (OTS) database, a source of consistently processed and annotated, full-length, paired-chain TCR sequences.

Article Synopsis
  • Monoclonal antibodies are vital in fighting viral infections and are key players in managing pandemics, sourced from antibody-secreting cells (ASCs) like plasma cells.
  • Current methods to identify these antibodies are often slow, costly, or technically complex, limiting their widespread use.
  • This new technology streamlines the process by using microfluidics and flow cytometry to rapidly discover high-affinity monoclonal antibodies from millions of ASCs in just two weeks, achieving a success rate of over 85%.
Article Synopsis
  • Outbreaks of Ebolaviruses, like Sudan virus in Uganda in 2022, highlight the need for vaccines that target more than just Zaire ebolavirus, currently the only species covered by the vaccines in use.
  • A new vaccine regimen, Ad26.ZEBOV/MVA-BN-Filo, was developed and tested in the EBL2001 clinical trial, aiming to enhance immunity against various Ebolaviruses.
  • Researchers sequenced B cell receptors from trial participants and created a database of Ebolavirus-specific antibodies, revealing important patterns in immune responses and demonstrating the potential for computational techniques to analyze immune repertoires effectively.

To be viable therapeutics, antibodies must be tolerated by the human immune system. Rational approaches to reduce the risk of unwanted immunogenicity involve maximizing the 'humanness' of the candidate drug. However, despite the emergence of new discovery technologies, many of which start from entirely human gene fragments, most antibody therapeutics continue to be derived from non-human sources with concomitant humanization to increase their human compatibility.

T cells are essential immune cells responsible for identifying and eliminating pathogens. Through interactions between their T-cell antigen receptors (TCRs) and antigens presented by major histocompatibility complex molecules (MHCs) or MHC-like molecules, T cells discriminate foreign and self peptides. Determining the fundamental principles that govern these interactions has important implications in numerous medical contexts.

Antibodies are generated with great diversity in nature, resulting in a set of molecules, each optimized to bind a specific target. Taking advantage of their diversity and specificity, antibodies make up a large part of recently developed biologic drugs. For therapeutic use, antibodies need to fulfill several criteria to be safe and efficient.

Immunomodulatory imide drugs (IMiDs), including thalidomide, lenalidomide, and pomalidomide, can be used to induce degradation of a protein of interest that is fused to a short zinc finger (ZF) degron motif. These IMiDs, however, also induce degradation of endogenous neosubstrates, including IKZF1 and IKZF3. To improve degradation selectivity, we took a bump-and-hole approach to design and screen bumped IMiD analogs against 8380 ZF mutants.
