The recent severe acute respiratory syndrome coronavirus 2 pandemic has clearly exemplified the need for broad-spectrum antiviral (BSA) medications. However, previous outbreaks show that about one year after an outbreak, interest in antiviral research diminishes and the work toward an effective medication is left unfinished. Martin et al.
Natural olfactory systems possess remarkable sensitivity and precision beyond what is currently achievable by engineered gas sensors. Unlike their artificial counterparts, noses are capable of distinguishing scents associated with mixtures of volatile molecules in complex, typically fluctuating environments and can adapt to changes. This perspective examines the multifaceted biological principles that provide olfactory systems their discriminatory prowess, and how these ideas can be ported to the design of electronic noses for substantial improvements in performance across metrics such as sensitivity and ability to speciate chemical mixtures.
Traditional best practices for quantitative structure activity relationship (QSAR) modeling recommend dataset balancing and balanced accuracy (BA) as the key desired objective of model development. This study explores the value of the conventional norms in the context of using QSAR models for virtual screening of modern large and ultra-large chemical libraries. For this increasingly common task, we now recommend the use of models with the highest positive predictive value (PPV) built on imbalanced training sets as preferred virtual screening tools.
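The contrast between BA and PPV can be illustrated with a minimal sketch. The two models and their confusion-matrix counts below are hypothetical numbers chosen for illustration (a library of 1,000 compounds containing 10 actives); they are not results from the study.

```python
def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: fraction of predicted actives that are truly active."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def balanced_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Hypothetical model A: flags 100 compounds, 9 of which are true actives.
model_a = {"tp": 9, "fp": 91, "fn": 1, "tn": 899}
# Hypothetical model B: flags only 10 compounds, 6 of which are true actives.
model_b = {"tp": 6, "fp": 4, "fn": 4, "tn": 986}

ba_a = balanced_accuracy(**model_a)
ba_b = balanced_accuracy(**model_b)
ppv_a = ppv(model_a["tp"], model_a["fp"])
ppv_b = ppv(model_b["tp"], model_b["fp"])

print(f"Model A: BA={ba_a:.3f}, PPV={ppv_a:.3f}")
print(f"Model B: BA={ba_b:.3f}, PPV={ppv_b:.3f}")
```

In this toy case model A has the higher BA, yet most of the compounds it nominates are inactive, while model B has the higher PPV. When only a small number of top-ranked compounds can be purchased and tested, the second property is likely the more useful one.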
Skin sensitization is a significant concern for chemical safety assessments. Traditional animal assays often fail to predict human responses accurately, and ethical constraints limit the collection of human data, creating a need for reliable in silico models of skin sensitization prediction. This study introduces HuSSPred, an in silico tool based on the Human Predictive Patch Test (HPPT).
Helicases have emerged as promising targets for the development of antiviral drugs; however, the family remains largely undrugged. To support the focused development of viral helicase inhibitors, we identified, collected, and integrated all chemogenomics data for all available helicases from the ChEMBL database. After thoroughly curating and enriching the data with relevant annotations, we created a derivative database of helicase inhibitors, which we dubbed Heli-SMACC (Helicase-targeting SMAll Molecule Compound Collection).
The Structural Genomics Consortium is an international open science research organization with a focus on accelerating early-stage drug discovery, namely hit discovery and optimization. We, as many others, believe that artificial intelligence (AI) is poised to be a main accelerator in the field. The question is then how to best benefit from recent advances in AI and how to generate, format and disseminate data to enable future breakthroughs in AI-guided drug discovery.
Heparan sulfate (HS), a sulfated polysaccharide abundant in the extracellular matrix, plays pivotal roles in various physiological and pathological processes by interacting with proteins. Investigating the binding selectivity of HS oligosaccharides to target proteins is essential, but the exhaustive inclusion of all possible oligosaccharides in microarray experiments is impractical. To address this challenge, we present a hybrid pipeline that integrates microarray and in silico techniques to design oligosaccharides with desired protein affinity.
We introduce STOPLIGHT, a web portal to assist medicinal chemists in prioritizing hits from screening campaigns and the selection of compounds for optimization. STOPLIGHT incorporates services to assess 6 physicochemical and structural properties, 6 assay liabilities, and 11 pharmacokinetic properties, for any small molecule represented by its SMILES string. We briefly describe each service and illustrate the utility of this portal with a case study.
Nearest neighbor-based similarity searching is a common task in chemistry, with notable use cases in drug discovery. Yet, some of the most commonly used approaches for this task still leverage a brute-force approach. In practice this can be computationally costly and overly time-consuming, due in part to the sheer size of modern chemical databases.
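For orientation, a brute-force similarity search of the kind described above might look like the following sketch. It assumes fingerprints are represented as sets of "on" bit positions and uses the Tanimoto coefficient as the similarity measure; the compound names and fingerprints are made up, and the abstract does not specify which representation or metric the authors use.

```python
import heapq


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def brute_force_nearest(query: frozenset, library: dict, k: int = 1) -> list:
    """Score the query against every library entry and keep the k best.

    Cost grows linearly with library size, which is the bottleneck
    for very large chemical databases.
    """
    scored = ((tanimoto(query, fp), name) for name, fp in library.items())
    return [(name, score) for score, name in heapq.nlargest(k, scored)]


# Hypothetical toy library: name -> set of on-bit positions.
toy_library = {
    "mol_a": frozenset({1, 2, 3, 5}),
    "mol_b": frozenset({1, 2, 3, 4, 9}),
    "mol_c": frozenset({7, 8}),
}
toy_query = frozenset({1, 2, 3, 4})

nearest = brute_force_nearest(toy_query, toy_library, k=2)
print(nearest)
```

Every query touches every fingerprint in the library, so the running time scales with the number of compounds. Index-based or approximate nearest-neighbor methods aim to avoid exactly this full scan.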
Computational models that predict pharmacokinetic properties are critical to deprioritize drug candidates that emerge as hits in high-throughput screening campaigns. We collected, curated, and integrated a database of compounds tested in 12 major end points comprising over 10,000 unique molecules. We then employed these data to build and validate binary quantitative structure-activity relationship (QSAR) models.
Structure-based virtual screening (SBVS) is a key workflow in computational drug discovery. SBVS models are assessed by measuring the enrichment of known active molecules over decoys in retrospective screens. However, the standard formula for enrichment cannot estimate model performance on very large libraries.
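As background, the enrichment factor is commonly computed as the hit rate among the top-scored fraction of a library divided by the hit rate of the whole library. The sketch below implements that common definition on made-up scores and labels; it is an assumption that this is the "standard formula" the abstract refers to, and the alternative the authors propose is not shown here.

```python
def enrichment_factor(scores: list, labels: list, fraction: float) -> float:
    """Enrichment factor at a given top fraction of the ranked library.

    scores: model scores, higher meaning more likely active.
    labels: 1 for known actives, 0 for decoys.
    """
    n_total = len(scores)
    n_actives = sum(labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    n_selected = max(1, int(round(fraction * n_total)))
    # Rank compounds from best to worst score and count actives in the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_selected])
    return (hits / n_selected) / (n_actives / n_total)


# Hypothetical retrospective screen: 10 compounds, 2 of them actives.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]

ef_top20 = enrichment_factor(toy_scores, toy_labels, fraction=0.2)
print(f"EF at top 20%: {ef_top20:.2f}")
```

Because the hit rate in the selected slice can be at most 1, the value in this form can never exceed the total number of compounds divided by the number of actives, so the achievable maximum depends on the composition of the benchmark set.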
There have been significant advances in the flexibility and power of cell-free translation systems. The increasing ability to incorporate noncanonical amino acids and complement translation with recombinant enzymes has enabled cell-free production of peptide-based natural products (NPs) and NP-like molecules. We anticipate that many more such compounds and analogs might be accessed in this way.
Summary: Knowledge graphs are being increasingly used in biomedical research to link large amounts of heterogeneous data and facilitate reasoning across diverse knowledge sources. Wider adoption and exploration of knowledge graphs in the biomedical research community is limited by requirements to understand the underlying graph structure in terms of entity types and relationships, represented as nodes and edges, respectively, and learn specialized query languages for graph mining and exploration. We have developed a user-friendly interface dubbed ExEmPLAR (Extracting, Exploring, and Embedding Pathways Leading to Actionable Research) to aid reasoning over biomedical knowledge graphs and assist with data-driven research and hypothesis generation.
Deep learning methods that predict protein-ligand binding have recently been used for structure-based virtual screening. Many such models have been trained using protein-ligand complexes with known crystal structures and activities from the PDBbind data set. However, because PDBbind only includes 20K complexes, models typically fail to generalize to new targets, and model performance is on par with models trained with only ligand information.
Quantitative structure-activity relationship (QSAR) modelling, an approach that was introduced 60 years ago, is widely used in computer-aided drug design. In recent years, progress in artificial intelligence techniques, such as deep learning, the rapid growth of databases of molecules for virtual screening and dramatic improvements in computational power have supported the emergence of a new field of QSAR applications that we term 'deep QSAR'. Marking a decade from the pioneering applications of deep QSAR to tasks involved in small-molecule drug discovery, we herein describe key advances in the field, including deep generative and reinforcement learning approaches in molecular design, deep learning models for synthetic planning and the application of deep QSAR models in structure-based virtual screening.
Vaccine repurposing that considers individual genotype may aid personalized prevention of Alzheimer's disease (AD). In this retrospective cohort study, we used Cardiovascular Health Study data to estimate associations of pneumococcal polysaccharide vaccine and flu shots received between ages 65 and 75 with AD onset at age 75 or older, taking into account the rs6859 polymorphism in the NECTIN2 gene (an AD risk factor). Pneumococcal vaccine, and total count of vaccinations against pneumonia and flu, were associated with lower odds of AD in carriers of the rs6859 A allele, but not in non-carriers.
Recent rapid expansion of make-on-demand, purchasable chemical libraries comprising tens of billions or even trillions of molecules has challenged the efficient application of traditional structure-based virtual screening methods that rely on molecular docking. We present a novel computational methodology termed HIDDEN GEM (HIt Discovery using Docking ENriched by GEnerative Modeling) that greatly accelerates virtual screening. This workflow uniquely integrates machine learning, generative chemistry, massive chemical similarity searching and molecular docking of small, selected libraries at the beginning and the end of the workflow.
We report the major highlights of the School of Cheminformatics in Latin America, Mexico City, November 24-25, 2022. Six lectures, one workshop, and one roundtable with four editors were presented during an online public event with speakers from academia, big pharma, and public research institutions. One thousand one hundred eighty-one students and academics from seventy-nine countries registered for the meeting.
In the ligand prediction category of CASP15, the challenge was to predict the positions and conformations of small molecules binding to proteins that were provided as amino acid sequences or as models generated by the AlphaFold2 program. For most targets, we used our template-based ligand docking program ClusPro ligTBM, also implemented as a public server available at https://ligtbm.cluspro.
Hits from high-throughput screening (HTS) of chemical libraries are often false positives due to their interference with assay detection technology. In response, we generated the largest publicly available library of chemical liabilities and developed "Liability Predictor," a free web tool to predict HTS artifacts. More specifically, we generated, curated, and integrated HTS data sets for thiol reactivity, redox activity, and luciferase (firefly and nano) activity and developed and validated quantitative structure-interference relationship (QSIR) models to predict these nuisance behaviors.
COVID-19 vaccines have been instrumental tools in the fight against SARS-CoV-2, helping to reduce disease severity and mortality. At the same time, just like any other therapeutic, COVID-19 vaccines were associated with adverse events. Women have reported menstrual cycle irregularity after receiving COVID-19 vaccines, and this led to renewed fears concerning COVID-19 vaccines and their effects on fertility.
Understanding the origins of past and present viral epidemics is critical in preparing for future outbreaks. Many viruses, including SARS-CoV-2, have led to significant consequences not only due to their virulence, but also because we were unprepared for their emergence. We need to learn from large amounts of data accumulated from well-studied, past pandemics and employ modern informatics and therapeutic development technologies to forecast future pandemics and help minimize their potential impacts.
Molecular docking aims to predict the 3D pose of a small molecule in a protein binding site. Traditional docking methods predict ligand poses by minimizing a physics-inspired scoring function. Recently, a diffusion model has been proposed that iteratively refines a ligand pose.
Diseases caused by new viruses cost thousands if not millions of human lives and trillions of dollars. We have identified, collected, curated, and integrated all chemogenomics data from ChEMBL for 13 emerging viruses that hold the greatest potential threat to global human health. By identifying and solving several challenges related to data annotation accuracy, we developed a highly curated and thoroughly annotated database of compounds tested in both phenotypic and target-based assays for these viruses that we dubbed SMACC (Small Molecule Antiviral Compound Collection).
Deep generative neural networks have been used increasingly in computational chemistry for de novo design of molecules with desired properties. Many deep learning approaches employ reinforcement learning for optimizing the target properties of the generated molecules. However, the success of this approach is often hampered by the problem of sparse rewards, as the majority of the generated molecules are, as expected, predicted to be inactive.