Exploring Chemical Space with Machine Learning.

Chimia (Aarau)

Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Bern, Freiestrasse 3, CH-3012 Bern.

Published: December 2019

Chemical space is a concept for organizing molecular diversity: it postulates that different molecules occupy different regions of a mathematical space, where the position of each molecule is defined by its properties. Our aim is to develop methods to explicitly explore chemical space in the area of drug discovery. Here we review our implementations of machine learning in this project, including our use of deep neural networks to enumerate the GDB13 database from a small sample set, to generate analogs of drugs and natural products after training with fragment-size molecules, and to predict the polypharmacology of molecules after training with known bioactive compounds from ChEMBL. We also discuss visualization methods for big data as a means to keep track of and learn from machine learning results. Computational tools discussed in this review are freely available at and .
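As a rough illustration of the idea that a molecule's position is defined by its properties, the sketch below places three molecules in a small descriptor space and finds each one's nearest neighbor. It is not taken from the review: the molecule names, the choice of descriptors (molecular weight, logP, hydrogen-bond donors), the numeric values, and the helper functions `standardize`, `distance`, and `nearest_neighbor` are all assumptions made for illustration.

```python
import math

# Hypothetical descriptor vectors: (molecular weight, logP, H-bond donors).
# The values are illustrative placeholders, not measured data.
molecules = {
    "mol_a": (180.0, 1.2, 1.0),
    "mol_b": (194.0, 1.5, 1.0),
    "mol_c": (420.0, 4.8, 3.0),
}


def standardize(points):
    """Scale each descriptor to zero mean and unit variance.

    Without scaling, a large-valued property such as molecular weight
    would dominate the distance between molecules.
    """
    names = list(points)
    columns = list(zip(*(points[n] for n in names)))
    means = [sum(col) / len(col) for col in columns]
    # Fall back to 1.0 when a descriptor is constant, to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return {
        n: tuple((x - m) / s for x, m, s in zip(points[n], means, stds))
        for n in names
    }


def distance(p, q):
    """Euclidean distance between two positions in the descriptor space."""
    return math.dist(p, q)


def nearest_neighbor(name, points):
    """Return the name of the molecule closest to `name`."""
    return min(
        (other for other in points if other != name),
        key=lambda other: distance(points[name], points[other]),
    )


space = standardize(molecules)
for name in space:
    print(name, "is closest to", nearest_neighbor(name, space))
```

In practice the descriptor vectors would come from a cheminformatics toolkit and have many more dimensions, but the same notion of position and distance applies.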


Source
http://dx.doi.org/10.2533/chimia.2019.1018


Similar Publications

Generalized Hartree-Fock (GHF) is a long-established electronic structure method that can lower the energy (compared to spin-restricted variants) by breaking physical wave function symmetries, namely and . After an exposition of GHF theory, we assess the use of GHF trial wave functions in phaseless auxiliary field quantum Monte Carlo (ph-AFQMC-G) calculations of strongly correlated molecular systems including symmetrically stretched hydrogen rings, carbon dioxide, and dioxygen. Imaginary time propagation is able to restore symmetry and yields energies of comparable or better accuracy than CCSD(T) with unrestricted HF and GHF references, and consistently smooth dissociation curves, a remarkable result given the relative scalability of ph-AFQMC-G to larger system sizes.


Introduction: Treatment of schistosomiasis has for many years relied on a single drug, praziquantel (PZQ). Immense efforts have been invested in the discovery of protein kinase (PK) inhibitors; however, given that the majority of PKs are still not targeted by an inhibitor with a useful level of selectivity, there is a compelling need to expand the chemical space available for synthesizing new, potent, and selective PK inhibitors. Small-molecule inhibitors targeting the ATP pocket of the catalytic domain of PKs have the potential to become drugs devoid of (major) side effects, particularly if they bind selectively.


FEgrow is an open-source software package for building congeneric series of compounds in protein binding pockets. For a given ligand core and receptor structure, it employs hybrid machine learning/molecular mechanics potential energy functions to optimise the bioactive conformers of supplied linkers and functional groups. Here, we introduce significant new functionality to automate, parallelise and accelerate the building and scoring of compound suggestions, such that it can be used for automated design.


Designing chemically novel and synthesizable ligands from the largest possible chemical space is a major issue in modern drug discovery to identify early hits that are easily amenable to medicinal chemistry optimization. Starting from the sole three-dimensional structure of a protein binding site, we herewith describe a fully automated active learning protocol to propose the commercial chemical reagents and one-step organic chemistry reactions necessary to enumerate target-specific primary hits from ultralarge chemical spaces. When applied in different scenarios (single transform and multiple transforms) addressing chemical spaces of various sizes (from 670 million to 4.


MultiChem: predicting chemical properties using multi-view graph attention network.

BioData Min

January 2025

Department of Computer Science, Hanyang University, Seoul, Republic of Korea.

Background: Understanding the molecular properties of chemical compounds is essential for identifying potential candidates or ensuring safety in drug discovery. However, exploring the vast chemical space is time-consuming and costly, necessitating the development of time-efficient and cost-effective computational methods. Recent advances in deep learning approaches have offered deeper insights into molecular structures.

