Machine learning (ML) models have become key in decision-making for many disciplines, including drug discovery and medicinal chemistry. ML models are generally evaluated prior to their usage in high-stakes decisions, such as compound synthesis or experimental testing. However, no ML model is robust or predictive in all real-world scenarios.
Time-split cross-validation is broadly recognized as the gold standard for validating predictive models intended for use in medicinal chemistry projects. Unfortunately, this type of data is not broadly available outside of large pharmaceutical research organizations. Here we introduce the SIMPD (simulated medicinal chemistry project data) algorithm to split public data sets into training and test sets that mimic the differences observed in real-world medicinal chemistry project data sets.
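The idea of a time split mentioned in this abstract can be illustrated with a minimal sketch. This is not the SIMPD algorithm, which the abstract does not describe in detail; it is only a plain chronological split. The record layout, the function name `time_split`, and the `test_fraction` parameter are assumptions made for the example.

```python
from datetime import date
from typing import List, Tuple

# Hypothetical record layout: (compound id, measurement date, measured value).
Record = Tuple[str, date, float]


def time_split(
    records: List[Record], test_fraction: float = 0.2
) -> Tuple[List[Record], List[Record]]:
    """Split records chronologically: earliest go to training, latest to test."""
    # Sort by measurement date so that the split respects time order.
    ordered = sorted(records, key=lambda record: record[1])
    # Hold out at least one record as the test set.
    n_test = max(1, round(len(ordered) * test_fraction))
    return ordered[:-n_test], ordered[-n_test:]


records = [
    ("cpd-3", date(2021, 3, 1), 6.1),
    ("cpd-1", date(2020, 1, 15), 5.2),
    ("cpd-5", date(2022, 7, 9), 7.4),
    ("cpd-2", date(2020, 6, 30), 5.9),
    ("cpd-4", date(2021, 11, 20), 6.8),
]
train, test = time_split(records, test_fraction=0.2)
```

With this split, a model is always evaluated on compounds measured after everything it was trained on, which is the property a simulated split of public data would try to approximate.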
Machine-learning and deep-learning models have been extensively used in cheminformatics to predict molecular properties, to reduce the need for direct measurements, and to accelerate compound prioritization. However, different setups and frameworks and the large number of molecular representations make it difficult to properly evaluate, reproduce, and compare them. Here we present a new PREdictive modeling FramEwoRk for molecular discovery (PREFER), written in Python (version 3.
A longstanding goal has been to find an antigen-specific preventive therapy, i.e., a vaccine, for autoimmune diseases.
Machine learning (ML) promises to tackle the grand challenges in chemistry and speed up the generation, improvement and/or ordering of research hypotheses. Despite the overarching applicability of ML workflows, one usually finds diverse evaluation study designs. The current heterogeneity in evaluation techniques and metrics leads to difficulty in (or the impossibility of) comparing and assessing the relevance of new algorithms.
In pharmaceutical research, compounds are optimized for metabolic stability to avoid overly rapid elimination of the drug. Intrinsic clearance (CL) measured in liver microsomes or hepatocytes is an important parameter during lead optimization. In this work, machine learning models were developed to relate the compound structure to microsomal metabolic stability and predict CL for new compounds.
Objectives: Rheumatoid arthritis (RA) is an autoimmune disease strongly associated with the major histocompatibility complex (MHC) class II allele DRB1*04:01, which encodes a protein that binds self-peptides for presentation to T cells. This study characterises the autoantigen-presenting function of DRB1*04:01 (HLA-DRA*01:01/HLA-DRB1*04:01) at a molecular level for prototypic T-cell determinants, focusing on a post-translationally modified collagen type II (Col2)-derived peptide.
Methods: The crystal structures of DRB1*04:01 molecules in complex with the peptides HSP70, citrullinated CILP and galactosylated Col2 were determined on cocrystallisation.
The ability to predict chemical reactivity of a molecule is highly desirable in drug discovery, both ex vivo (synthetic route planning, formulation, stability) and in vivo: metabolic reactions determine pharmacodynamics, pharmacokinetics and potential toxic effects, and early assessment of liabilities is vital to reduce attrition rates in later stages of development. Quantum mechanics offers a precise description of the interactions between electrons and orbitals in the breaking and forming of new bonds. Modern algorithms and faster computers have allowed the study of more complex systems in a punctual and accurate fashion, and answers for chemical questions around stability and reactivity can now be provided.
Background: A graded therapeutic concept for the treatment of chronic pain patients in Germany is only available to a limited extent. Following the onset of coronavirus disease (COVID-19), care for these patients has become even worse.
Aim: To develop and establish a cross-sector therapeutic concept for chronic pain patients as part of a selective contract.
Aims: This study investigates the effects of intra-articular injection of adipose-derived mesenchymal stem cells (AdMSCs) and platelet-rich plasma (PRP) on lameness, pain, and quality of life in osteoarthritic canine patients.
Methods: With informed owner consent, adipose tissue collected from adult dogs diagnosed with degenerative joint disease was enzymatically digested and cultured to passage 1. A small portion of cells (n = 4) surplus to clinical need were characterized using flow cytometry and tri-lineage differentiation.
Machine learning classifiers trained on class imbalanced data are prone to overpredict the majority class. This leads to a larger misclassification rate for the minority class, which in many real-world applications is the class of interest. For binary data, the classification threshold is set by default to 0.
We set up an automated screening process to routinely test 10 chiral supercritical fluid chromatography (SFC) methods - five columns combined with two co-solvents - as part of a chiral separation lab workflow. Proprietary software tools enabled automated method screening of racemates, parallel evaluation of the resulting chromatograms for enantiomer separation and report generation. This process is largely automated and resulted in an efficient and reliable lab process with a minimum requirement for human intervention.
This article summarizes the evolution of the screening deck at the Novartis Institutes for BioMedical Research (NIBR). Historically, the screening deck was an assembly of all available compounds. In 2015, we designed a first deck to facilitate access to diverse subsets with optimized properties.
Nodular lymphocyte-predominant Hodgkin lymphoma (NLPHL) is a rare lymphoma of B-cell origin with frequent expression of functional B-cell receptors (BCRs). Here we report that expression cloning followed by antigen screening identifies DNA-directed RNA polymerase beta' (RpoC) from Moraxella catarrhalis as a frequent antigen of BCRs of IgD LP cells. Patients show predominance of HLA-DRB1*04/07 and the IgVH genes encode extraordinarily long CDR3s.
Machine Learning and Data Science have enjoyed a renaissance due to the availability of increased computational power and larger data sets. Many questions can now be asked and answered that were previously beyond our scope. This does not translate instantly into new tools that can be used by those not skilled in the field, as many of the issues and traps still exist.
The regulation of temporo-spatial compartmentalization of protein synthesis is of crucial importance for a variety of physiologic cellular functions. Here, we demonstrate that the cell membrane-anchored disintegrin metalloproteinase ADAM15, upregulated in a variety of aggressively growing tumor cells, in the hyperproliferative synovial membrane of inflamed joints as well as in osteoarthritic chondrocytes, transiently binds to poly(A) binding protein 1 (PABP) in cells undergoing adhesion. The cytoplasmic domain of ADAM15 was shown to selectively interact with the proline-rich linker of PABP.
Chirality is understood by many as a binary concept: a molecule is either chiral or it is not. In terms of the action of a structure on polarized light, this is indeed true. When examined through the prism of molecular recognition, the answer becomes more nuanced.
Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities, it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is no different: more and more data are being generated, for instance, by technologies such as DNA encoded libraries, peptide libraries, text mining of large literature corpora, and new in silico enumeration methods.
Today, it is known that autoimmune diseases start a long time before clinical symptoms appear. Anti-citrullinated protein antibodies (ACPAs) appear many years before the clinical onset of rheumatoid arthritis (RA). However, it is still unclear if and how ACPAs are arthritogenic.
Reliable computational prediction of protein side chain conformations and the energetic impact of amino acid mutations are the key aspects for the optimization of biotechnologically relevant enzymatic reactions using structure-based design. By improving the protein stability, higher yields can be achieved. In addition, tuning the substrate selectivity of an enzymatic reaction by directed mutagenesis can lead to higher turnover rates.
J Chem Inf Model
December 2016
When analyzing chemical reactions it is essential to know which molecules are actively involved in the reaction and which educts will form the product molecules. Assigning reaction roles, like reactant, reagent, or product, to the molecules of a chemical reaction might be a trivial problem for hand-curated reaction schemes but it is more difficult to automate, an essential step when handling large amounts of reaction data. Here, we describe a new fingerprint-based and data-driven approach to assign reaction roles which is also applicable to rather unbalanced and noisy reaction schemes.
Multiple recent studies have focused on unraveling the content of the medicinal chemist's toolbox. Here, we present an investigation of chemical reactions and molecules retrieved from U.S.
Finding a canonical ordering of the atoms in a molecule is a prerequisite for generating a unique representation of the molecule. The canonicalization of a molecule is usually accomplished by applying some sort of graph relaxation algorithm, the most common of which is the Morgan algorithm. There are known issues with that algorithm that lead to noncanonical atom orderings as well as problems when it is applied to large molecules like proteins.
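The graph relaxation step referred to in this abstract can be sketched in a few lines. This is a simplified illustration of the classic Morgan-style refinement of atom invariants, not the method proposed in the article. The adjacency-list input and the function name `morgan_values` are assumptions made for the example, and the later tie-breaking and atom-numbering stages are left out.

```python
from typing import Dict, List


def morgan_values(adjacency: Dict[int, List[int]]) -> Dict[int, int]:
    """Return Morgan-style extended connectivity values for each atom.

    Each atom starts with its degree. In every round an atom's value is
    replaced by the sum of its neighbours' values. Refinement stops as soon
    as a round no longer increases the number of distinct values.
    """
    values = {atom: len(neighbours) for atom, neighbours in adjacency.items()}
    n_classes = len(set(values.values()))
    while True:
        refined = {
            atom: sum(values[n] for n in neighbours)
            for atom, neighbours in adjacency.items()
        }
        n_refined = len(set(refined.values()))
        if n_refined <= n_classes:
            # No further discrimination gained: keep the previous values.
            return values
        values, n_classes = refined, n_refined


# Heavy-atom skeleton of n-pentane as a chain: 0-1-2-3-4.
pentane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
pentane_values = morgan_values(pentane)
```

In this chain, the two terminal atoms end up with the same value, as do the two atoms next to them, so symmetry-equivalent atoms stay tied. A full canonicalization still has to break such ties consistently, which is where the issues mentioned in the abstract can arise.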