Deep generative models have shown the ability to devise both valid and novel chemistry, which could significantly accelerate the identification of bioactive compounds. Many current models, however, use molecular descriptors or ligand-based predictive methods to guide molecule generation towards a desirable property space. This restricts their application to relatively data-rich targets, neglecting those where too little data is available to train a predictor. Moreover, ligand-based approaches often bias molecule generation towards previously established chemical space, thereby limiting their ability to identify truly novel chemotypes. In this work, we assess the use of molecular docking via Glide, a structure-based approach, as a scoring function to guide the deep generative model REINVENT, and compare model performance and behaviour to those obtained with a ligand-based scoring function. Additionally, we modify the previously published MOSES benchmarking dataset to remove any induced bias towards non-protonatable groups. We also propose a new metric to measure dataset diversity, which is less confounded by the distribution of heavy atom count than the commonly used internal diversity metric. We found that when optimizing the docking score against DRD2, the model improves predicted ligand affinity beyond that of known DRD2 active molecules. In addition, generated molecules occupy complementary chemical and physicochemical space compared to the ligand-based approach, and novel physicochemical space compared to known DRD2 active molecules. Furthermore, the structure-based approach learns to generate molecules that satisfy crucial residue interactions, which is information only available when taking protein structure into account.
Overall, this work demonstrates the advantage of using molecular docking to guide de novo molecule generation over ligand-based predictors with respect to predicted affinity, novelty, and the ability to identify key interactions between ligand and protein target. Practically, this approach has applications in early hit generation campaigns to enrich a virtual library towards a particular target, and also in novelty-focused projects, where de novo molecule generation either has no prior ligand knowledge available or should not be biased by it.
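For orientation, the commonly used internal diversity metric referred to above is typically computed as one minus the mean pairwise Tanimoto similarity of molecular fingerprints. The sketch below illustrates that baseline only; it is not the new diversity metric proposed in this work, which the abstract does not define. The toy fingerprints (sets of integer "on-bit" indices), the function names, and the choice to average over distinct pairs are all assumptions made for illustration. In practice the fingerprints would come from a cheminformatics toolkit.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    if not a and not b:
        # Convention (assumption): two empty fingerprints are treated as identical.
        return 1.0
    return len(a & b) / len(a | b)


def internal_diversity(fingerprints) -> float:
    """One minus the mean Tanimoto similarity over all distinct pairs."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("need at least two fingerprints")
    mean_similarity = sum(tanimoto(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity


# Hypothetical toy fingerprints: each set holds the indices of bits that are on.
toy_fingerprints = [
    frozenset({1, 2, 3}),
    frozenset({2, 3, 4}),
    frozenset({7, 8}),
]

if __name__ == "__main__":
    print(f"internal diversity: {internal_diversity(toy_fingerprints):.3f}")
```

A value near 1 indicates a set of mutually dissimilar molecules, and a value near 0 indicates near-duplicates. Because fingerprint density tends to grow with molecule size, a metric of this form can be influenced by the heavy atom count distribution, which is the confounding the abstract refers to.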

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8117600
DOI: http://dx.doi.org/10.1186/s13321-021-00516-0
