Dimensionality reduction is an important exploratory data analysis method that allows high-dimensional data to be represented in a human-interpretable lower-dimensional space. It is extensively applied in the analysis of chemical libraries, where chemical structure data, represented as high-dimensional feature vectors, are transformed into 2D or 3D chemical space maps. In this paper, four commonly used dimensionality reduction techniques, namely Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), and Generative Topographic Mapping (GTM), are evaluated in terms of neighborhood preservation and visualization capability on sets of small molecules from the ChEMBL database.
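To make the idea of neighborhood preservation concrete, here is a minimal sketch of one common way to score it: the mean fraction of each point's k nearest neighbors that are shared between the original space and the projected space. The abstract does not state which metric the paper uses, so the function names, the Euclidean distance, and the toy data below are all assumptions for illustration only.

```python
import math


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        # Rank every other point by distance to p; ties are broken by index.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def neighborhood_preservation(high, low, k):
    """Mean fraction of k nearest neighbors shared between two representations.

    `high` and `low` hold the same items in the same order, e.g. descriptor
    vectors and their 2D map coordinates (hypothetical inputs).
    """
    if len(high) != len(low):
        raise ValueError("both representations must contain the same items")
    if not 0 < k < len(high):
        raise ValueError("k must be between 1 and len(points) - 1")
    nn_high = knn_indices(high, k)
    nn_low = knn_indices(low, k)
    shared = sum(len(a & b) for a, b in zip(nn_high, nn_low))
    return shared / (k * len(high))


# Toy data: two tight pairs of points that are far apart along the y axis.
high = [(0.0, 0.0), (1.0, 0.0), (0.0, 10.0), (1.0, 10.0)]
low_y = [(p[1],) for p in high]  # projection that keeps the pairs together
low_x = [(p[0],) for p in high]  # projection that mixes the pairs up

print(neighborhood_preservation(high, low_y, 1))  # 1.0
print(neighborhood_preservation(high, low_x, 1))  # 0.0
```

A score of 1.0 means every point keeps its nearest neighbors after projection, and 0.0 means none are kept. On real libraries the brute-force neighbor search used here would have to be replaced by an indexed or approximate search.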
Spectrochim Acta A Mol Biomol Spectrosc
February 2025
Disease-modifying therapies, including interferon-β (IFNβ), effectively counteract the inflammatory component in relapsing-remitting multiple sclerosis (RRMS), but this action, generally associated with severe side effects, does not prevent axonal/neuronal damage. Hence, axonal neuroprotection, which is pivotal for effective MS treatment, remains a difficult clinical challenge. Growing evidence suggests that Emapunil (AC-5216 or XBD173), a ligand of the mitochondrial translocator protein highly expressed in glial cells and neurons, is a promising candidate for neuroprotection.
The advent of high-performance virtual screening techniques now allows drug designers to explore ultra-large sets of candidate compounds in search of molecules predicted to have desired properties. However, the success of such an endeavor relies heavily on the pertinence (drug-likeness and, foremost, chemical feasibility) of these candidates; otherwise, virtual screening will return valueless "hits", following the garbage in/garbage out principle. The huge popularity of the judiciously enumerated Enamine REAL Space is clear proof of the strength of this Big Data trend in drug discovery.
Here, we present a new method for evaluating questions on chemical reactions in the context of remote education. This method can be used when binary grading is not sufficient because some tolerance may be acceptable. To determine a grade, the developed workflow uses a pairwise similarity assessment of the two reactions considered, each encoded as a single molecular graph with the help of the Condensed Graph of Reaction (CGR) approach.
The Bone-Marrow derived Dendritic Cell (BMDC) test is a promising assay for identifying sensitizing chemicals based on the 3Rs (Replace, Reduce, Refine) principle. This study expanded the BMDC benchmarking to various in vitro, in chemico, and in silico assays targeting different key events (KE) in the skin sensitization pathway, using common substance datasets. Additionally, a Quantitative Structure-Activity Relationship (QSAR) model was developed to predict the BMDC test outcomes for sensitizing or non-sensitizing chemicals.
Increasing antimicrobial resistance (AMR) represents a global healthcare threat. To decrease the spread of AMR and associated mortality, methods for rapid selection of optimal antibiotic treatment are urgently needed. Machine learning (ML) models based on genomic data to predict resistant phenotypes can serve as a fast screening tool prior to phenotypic testing.
The cutaneous absorption parameters of xenobiotics are crucial for the development of drugs and cosmetics, as well as for assessing environmental and occupational chemical risks. Despite the great variability in the design of experimental conditions due to uncertain international guidelines, datasets like HuskinDB have been created to report skin absorption endpoints. This review updates available skin permeability data by rigorously compiling research published between 2012 and 2021.
Kinetic aqueous or buffer solubility is an important parameter for measuring the suitability of compounds for high-throughput assays in early drug discovery, while thermodynamic solubility is reserved for later stages of drug discovery and development. Kinetic solubility is also considered to have low inter-laboratory reproducibility because of its sensitivity to protocol parameters [1]. Presumably, this is why little effort has been put into building QSPR models for kinetic, in comparison to thermodynamic, aqueous solubility.
The COVID-19 pandemic continues to pose a substantial threat to human lives and is likely to do so for years to come. Despite the availability of vaccines, searching for efficient small-molecule drugs that are widely available, including in low- and middle-income countries, is an ongoing challenge. In this work, we report the results of an open science community effort, the "Billion molecules against COVID-19 challenge", to identify small-molecule inhibitors against SARS-CoV-2 or relevant human receptors.
In chemical library analysis, it may be useful to describe libraries as individual items rather than collections of compounds. This is particularly true for ultra-large non-cherry-pickable compound mixtures, such as DNA-encoded libraries (DELs). In this sense, the chemical library space (CLS) is useful for the management of a portfolio of libraries, just like chemical space (CS) helps manage a portfolio of molecules.
This study introduces a new de novo design algorithm that combines the capabilities of a deep-learning algorithm for automated drug-like analogue design with a genetic algorithm for generating molecules with desired target-oriented properties. Specifically, the algorithm was applied to the angiotensin-converting enzyme 2 (ACE2) target, which is implicated in many pathological conditions, including COVID-19. Its ability to design promising candidates de novo for a specific target was assessed using two docking programs, PLANTS and GLIDE.
The development of DNA-encoded library (DEL) technology introduced new challenges for the analysis of chemical libraries. It is often useful to consider a chemical library as a stand-alone chemoinformatic object (represented both as a collection of independent molecules and as an individual entity), in particular when it is an inseparable mixture, like a DEL. Herein, we introduce the concept of chemical library space (CLS), in which resident items are individual chemical libraries.
Carbon capture and storage technologies are projected to increasingly contribute to cleaner energy transitions by significantly reducing CO2 emissions from fossil fuel-driven power and industrial plants. The industry-standard technology for CO2 capture is chemical absorption with aqueous alkanolamines, which are often mixed with an activator, piperazine, to increase the overall CO2 absorption rate. The inefficiency of the process, due to the parasitic energy required for thermal regeneration of the solvent, drives the search for new tertiary amines with better kinetics.
In order to analyze the Chimiothèque Nationale (CN), the French National Compound Library, in the context of screening and biologically relevant compounds, the library was compared with the ZINC in-stock collection and ChEMBL. This includes the study of chemical space coverage, physicochemical properties and Bemis-Murcko (BM) scaffold populations. More than 5 K CN-unique scaffolds (relative to the ZINC and ChEMBL collections) were identified.
In order to better formalize it, the notorious inverse-QSAR problem (finding structures of given QSAR-predicted properties) is considered in this paper as a two-step process including (i) finding "seed" descriptor vectors corresponding to user-constrained QSAR model output values and (ii) identifying the chemical structures best matching the "seed" vectors. The main development effort here was focused on the latter stage, proposing a new attention-based conditional variational autoencoder neural-network architecture based on recent developments in attention-based methods. The obtained results show that this workflow was capable of generating compounds predicted to display desired activity while being completely novel compared to the training database (ChEMBL).
We report a novel approach for grading chemical structure drawings for remote teaching, integrated into the Moodle platform. Typically, existing online platforms use a binary grading system, which often fails to give a nuanced evaluation of the answers given by the students. Therefore, such platforms are unevenly adapted to different disciplines.
Drug discovery today is inevitably intertwined with the use of large compound collections. Understanding their chemotype composition and physicochemical property profiles is of the highest importance for successful hit identification. Efficient polyfunctional tools allowing multifaceted analysis of constantly growing chemical libraries must be Big Data-compatible.
New models for ACE2 receptor binding, based on QSAR and docking algorithms, were developed using XRD structural data and ChEMBL 26 database hits as training sets. The selectivity of the potential ACE2-binding ligands towards Neprilysin (NEP) and ACE was evaluated. The Enamine screening collection (3.
Dynamic combinatorial libraries (DCLs) display adaptive behavior, enabled by the reversible generation of their molecular constituents from building blocks, in response to external effectors, e.g., protein receptors.
Molecular similarity is an impressively broad topic with many implications in several areas of chemistry. Its roots lie in the paradigm that 'similar molecules have similar properties'. For this reason, methods for determining molecular similarity find wide application in pharmaceutical companies, e.
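As an illustration of how molecular similarity is often quantified, the sketch below computes the Tanimoto (Jaccard) coefficient between two fingerprints, each represented as a set of feature identifiers. The abstract does not say which similarity measures it covers, so this is only one widely used example. The integer feature identifiers are hypothetical placeholders for real substructure keys or hashed fragments.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by all distinct features.

    Returns a value between 0.0 (nothing in common) and 1.0 (identical sets).
    Two empty fingerprints are scored 0.0 here, which is a convention chosen
    for this sketch rather than a universal rule.
    """
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical fingerprints: each integer stands for one structural feature.
mol_1 = {1, 2, 3}
mol_2 = {2, 3, 4}

print(tanimoto(mol_1, mol_2))  # 0.5 (2 shared features out of 4 distinct)
print(tanimoto(mol_1, mol_1))  # 1.0
```

In practice the feature sets would come from a cheminformatics toolkit rather than being written by hand, and the choice of fingerprint strongly affects which molecules appear similar.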
Screening of fragment libraries is a valuable approach to the drug discovery process. The quality of the library is one of the keys to success, and more particularly the design or choice of a library has to meet the specificities of the research program. In this study, we made an inventory of the commercial fragment libraries and we established a methodology which allows any library to be positioned in relation to the complete offer currently on the market, by addressing the following questions: does this chemical library look like another chemical library? What is the coverage of the current chemical space by this chemical library? What are the characteristic structural features of the fragments of this chemical library? We based our analysis on 2D and 3D chemical descriptors, framework class generation and the generative topographic map.
Neuromyelitis optica spectrum disorder (NMOSD) and multiple sclerosis (MS) are both autoimmune inflammatory and demyelinating diseases of the central nervous system. NMOSD is a highly disabling disease, and rapid introduction of the appropriate treatment at the acute phase is crucial to prevent sequelae. Specific criteria were established in 2015 and provide keys to distinguish NMOSD from MS.
DNA-Encoded Library (DEL) technology has emerged as an alternative method for the discovery of bioactive molecules in medicinal chemistry. It enables the simple synthesis and screening of compound libraries of enormous size. Although it is steadily gaining popularity, there are almost no reports of chemoinformatics analysis of DEL chemical space.
The ability to efficiently synthesize desired compounds can be a limiting factor for chemical space exploration in drug discovery. This ability is conditioned not only by the existence of well-studied synthetic protocols but also by the availability of corresponding reagents, so-called building blocks (BBs). In this work, we present a detailed analysis of the chemical space of 400 000 purchasable BBs.
The removal of CO2 from gases is an important industrial process in the transition to a low-carbon economy. The use of selective physical (co-)solvents is especially promising when the amount of CO2 is large, as it lowers the energy requirements for solvent regeneration. However, only a few physical solvents have found industrial application, and the design of new ones can pave the way to more efficient gas treatment techniques.