Federated multipartner machine learning has been touted as an appealing and efficient method to increase the effective training data volume and thereby the predictivity of models, particularly when the generation of training data is resource-intensive. In the landmark MELLODDY project, indeed, each of ten pharmaceutical companies realized aggregated improvements on its own classification or regression models through federated learning. To this end, they leveraged a novel implementation extending multitask learning across partners, on a platform audited for privacy and security.
To select the most promising screening hits from antibody and VHH display campaigns for subsequent in-depth profiling and optimization, it is highly desirable to assess and select sequences on properties beyond only their binding signals from the sorting process. In addition, developability risk criteria, sequence diversity, and the anticipated complexity for sequence optimization are relevant attributes for hit selection and optimization. Here, we describe an approach for the in silico developability assessment of antibody and VHH sequences.
Protein kinases are among the most important drug targets because their dysregulation can cause cancer, inflammatory and degenerative diseases, and many more. Developing selective inhibitors is challenging due to the highly conserved binding sites across the roughly 500 human kinases. Thus, detecting subtle similarities on a structural level can help explain and predict off-targets among the kinase family.
Praziquantel (PZQ) is an essential medicine for treating parasitic flatworm infections such as schistosomiasis, which afflicts over 250 million people. However, PZQ is not universally effective, lacking activity against liver flukes of the genus. The reason for this insensitivity is unclear, as the mechanism of PZQ action is unknown.
Understanding the pharmacokinetic (PK) properties of a drug, such as clearance, is a crucial step for evaluating efficacy. The PK of therapeutic antibodies can be complex and is influenced by interactions with the target, Fc-receptors, anti-drug antibodies, and antibody intrinsic factors. A growing body of literature has linked biophysical properties of antibodies, particularly nonspecific-binding propensity, hydrophobicity, and charged regions, to rapid clearance in preclinical species and selected human PK studies.
The first reported receptor for SARS-CoV-2 on host cells was the angiotensin-converting enzyme 2 (ACE2). However, the viral spike protein also has an RGD motif, suggesting that cell surface integrins may be co-receptors. We examined the sequences of ACE2 and integrins with the Eukaryotic Linear Motif (ELM) resource and identified candidate short linear motifs (SLiMs) in their short, unstructured, cytosolic tails with potential roles in endocytosis, membrane dynamics, autophagy, cytoskeleton, and cell signaling.
Accurate ranking of compounds with regard to their binding affinity to a protein using computational methods is of great interest to pharmaceutical research. Physics-based free energy calculations are regarded as the most rigorous way to estimate binding affinity. In recent years, many retrospective studies carried out in both academia and industry have demonstrated their potential.
In this paper, we explore the impact of combining different in silico prediction approaches and data sources on the predictive performance of the resulting system. We use inhibition of the hERG ion channel target as the endpoint for this study as it constitutes a key safety concern in drug development and a potential cause of attrition. We will show that combining data sources can improve the relevance of the training set with regard to the target chemical space, leading to improved performance.
Matched molecular pair (MMP) analyses are widely used in compound optimization projects to gain insights into structure-activity relationships (SAR). The analysis is traditionally done via statistical methods but can also be employed together with machine learning (ML) approaches to extrapolate to novel compounds. The MMP/ML method introduced here combines a fragment-based MMP implementation with different machine learning methods to obtain automated SAR decomposition and prediction.
J Comput Aided Mol Des
January 2018
Physics-based free energy simulations have increasingly become an important tool for predicting binding affinity, and the recent introduction of automated protocols has also paved the way towards a more widespread use in the pharmaceutical industry. The D3R 2016 Grand Challenge 2 provided an opportunity to blindly test the commercial free energy calculation protocol FEP+ and assess its performance relative to other affinity prediction methods. The present D3R free energy prediction challenge was built around two experimental data sets involving inhibitors of farnesoid X receptor (FXR), which is a promising anticancer drug target.
BMC Bioinformatics
January 2017
Background: Annotation of the phylogenetic tree of the human kinome is an intuitive way to visualize compound profiling data, structural features of kinases, or functional relationships within this important class of proteins. The increasing volume and complexity of kinase-related data underlines the need for a tool that enables complex queries pertaining to kinase disease involvement and potential therapeutic uses of kinase inhibitors.
Results: Here, we present KinMap, a user-friendly online tool that facilitates the interactive navigation through kinase knowledge by linking biochemical, structural, and disease association data to the human kinome tree.
Kinome-wide screening would have the advantage of providing structure-activity relationships against hundreds of targets simultaneously. Here, we report the generation of ligand-based activity prediction models for over 280 kinases by employing Machine Learning methods on an extensive data set of proprietary bioactivity data combined with open data. High quality (AUC > 0.
Simulations of the long-time scale motions of a ligand binding pocket in a protein may open up new perspectives for the design of compounds with steric or chemical properties differing from those of known binders. However, slow motions of proteins are difficult to access using standard molecular dynamics (MD) simulations and are thus usually neglected in computational drug design. Here, we introduce two nonequilibrium MD approaches to identify conformational changes of a binding site and detect transient pockets associated with these motions.
The identification and design of selective compounds is important for the reduction of unwanted side effects as well as for the development of tool compounds for target validation studies. This is, in particular, true for therapeutically important protein families that possess conserved folds and have numerous members such as kinases. To support the design of selective kinase inhibitors, we developed a novel approach that allows identification of specificity determining subpockets between closely related kinases solely based on their three-dimensional structures.
Protein kinases are involved in a variety of diseases including cancer, inflammation, and autoimmune disorders. Although the development of new kinase inhibitors is a major focus in pharmaceutical research, a large number of kinases have so far remained unexplored in drug discovery projects. The selection and assessment of targets is an essential but challenging area.
Understanding molecular recognition is one major requirement for drug discovery and design. Physicochemical and shape complementarity between two binding partners is the driving force during complex formation. In this study, the impact of shape within this process is analyzed.
We present TRAPP (TRAnsient Pockets in Proteins), a new automated software platform for tracking, analysis, and visualization of binding pocket variations along a protein motion trajectory or within an ensemble of protein structures that may encompass conformational changes ranging from local side chain fluctuations to global backbone motions. TRAPP performs accurate grid-based calculations of the shape and physicochemical characteristics of a binding pocket for each structure and detects the conserved and transient regions of the pocket in an ensemble of protein conformations. It also provides tools for tracing the opening of a particular subpocket and residues that contribute to the binding site.
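The idea of separating conserved from transient pocket regions over an ensemble can be illustrated with a minimal Python sketch. This is not TRAPP's actual algorithm or API: the function name `classify_pocket_points`, the representation of each structure as a set of occupied grid points, and the 0.5 occupancy threshold are all assumptions made for illustration only.

```python
from collections import Counter


def classify_pocket_points(frames, conserved_fraction=0.5):
    """Split pocket grid points into conserved and transient sets.

    frames: one collection of grid points (e.g. (i, j, k) tuples) per
        structure; a point is listed if it belongs to the pocket in
        that structure.
    conserved_fraction: hypothetical threshold; a point present in at
        least this fraction of structures is called conserved.
    """
    n_frames = len(frames)
    if n_frames == 0:
        return set(), set()

    # Count in how many structures each grid point is part of the pocket.
    counts = Counter()
    for frame in frames:
        counts.update(set(frame))

    conserved = {p for p, c in counts.items() if c / n_frames >= conserved_fraction}
    # Every other point was seen at least once but below the threshold.
    transient = set(counts) - conserved
    return conserved, transient


frames = [
    {(0, 0, 0), (1, 0, 0)},
    {(0, 0, 0)},
    {(0, 0, 0), (0, 1, 0)},
    {(0, 0, 0), (1, 0, 0)},
]
conserved, transient = classify_pocket_points(frames)
```

In this toy ensemble, points seen in at least half of the four structures count as conserved, and the point that opens in only one structure counts as transient. A real analysis would derive the grid occupancy from atomic coordinates, which is omitted here.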
Due to the rising number of solved protein structures, computer-based techniques for automatic protein functional annotation and classification into families are of high scientific interest. DoGSiteScorer automatically calculates global descriptors for self-predicted pockets based on the 3D structure of a protein. Protein function predictors on three levels with increasing granularity are built by use of a support vector machine (SVM), based on descriptors of 26632 pockets from enzymes with known structure and enzyme classification.
Motivation: Many drug discovery projects fail because the underlying target is finally found to be undruggable. Progress in structure elucidation of proteins now opens up a route to automatic structure-based target assessment. DoGSiteScorer is a newly developed automatic tool combining pocket prediction, characterization and druggability estimation and is now available through a web server.
Predicting druggability and prioritizing certain disease modifying targets for the drug development process is of high practical relevance in pharmaceutical research. DoGSiteScorer is a fully automatic algorithm for pocket and druggability prediction. Besides global properties of the pocket, local similarities shared between pockets are also taken into account.
A three-step approach for multiscale modeling of protein conformational changes is presented that incorporates information about preferred directions of protein motions into a geometric simulation algorithm. The first two steps are based on a rigid cluster normal-mode analysis (RCNMA). Low-frequency normal modes are used in the third step (NMSim) to extend the recently introduced idea of constrained geometric simulations of diffusive motions in proteins by biasing backbone motions of the protein, whereas side-chain motions are biased toward favorable rotamer states.
Previously (Hähnke et al., J Comput Chem 2010, 31, 2810) we introduced the concept of nonlinear dimensionality reduction for canonization of two-dimensional layouts of molecular graphs as foundation for text-based similarity searching using our Pharmacophore Alignment Search Tool (PhAST), a ligand-based virtual screening method. Here we apply these methods to three-dimensional molecular conformations and investigate the impact of these additional degrees of freedom on virtual screening performance and assess differences in ranking behavior.
Previously (Hähnke et al., J Comput Chem 2009, 30, 761), we presented the Pharmacophore Alignment Search Tool (PhAST), a ligand-based virtual screening technique representing molecules as strings coding pharmacophoric features and comparing them by global pairwise sequence alignment. To guarantee unambiguity during the reduction of two-dimensional molecular graphs to one-dimensional strings, PhAST employs a graph canonization step.
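Comparing two feature strings by global pairwise alignment can be sketched with a standard Needleman-Wunsch score. This is a generic illustration, not PhAST's implementation: the single-letter feature alphabet in the example strings and the match, mismatch, and gap values are assumptions, and the graph canonization step is not shown.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score of two strings.

    Scoring values are illustrative defaults with a linear gap penalty.
    Only two rows of the dynamic-programming table are kept in memory.
    """
    # Row 0: aligning the empty prefix of `a` with prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of `b`.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


# Hypothetical feature strings, e.g. A = acceptor, D = donor, H = hydrophobic.
identical = global_alignment_score("ADHA", "ADHA")
one_insertion = global_alignment_score("AD", "AHD")
```

Because the alignment is global, every symbol in both strings contributes to the score, so strings of different length are penalized through gaps rather than being matched on a best local stretch.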
Given the three-dimensional structure of a protein, how can one find the sites where other molecules might bind to it? Do these sites have the properties necessary for high affinity binding? Is this protein a suitable target for drug design? Here, we discuss recent developments in computational methods to address these and related questions. Geometric methods to identify pockets on protein surfaces have been developed over many years but, with new algorithms, their performance is still improving. Simulation methods show promise in accounting for protein conformational variability to identify transient pockets but lack the ease of use of many of the (rigid) shape-based tools.