J Chem Inf Model
January 2025
Three-dimensional (3D) molecular generation models employ deep neural networks to simultaneously generate both topological representation and molecular conformations. Due to their advantages in utilizing the structural and interaction information on targets, as well as their reduced reliance on existing bioactivity data, these models have attracted widespread attention. However, limited training and testing data sets and the unexpected biases inherent in single evaluation metrics pose a significant challenge in comparing these models in practical settings.
Proteolysis-targeting chimera (PROTAC) is an emerging therapeutic technology that leverages the ubiquitin-proteasome system to target protein degradation. Due to its event-driven mechanistic characteristics, PROTAC has the potential to regulate traditionally non-druggable targets. Recently, AI-aided drug design has accelerated the development of PROTAC drugs.
Deep learning-based molecular design has recently gained significant attention. While numerous DL-based generative models have been successfully developed for designing novel compounds, the majority of the generated molecules lack sufficiently novel scaffolds or high drug-like profiles. The aforementioned issues may not be fully captured by commonly used metrics for the assessment of molecular generative models, such as novelty, diversity, and quantitative estimation of the drug-likeness score.
Deep learning-based molecular generative models have garnered growing attention for their capability to generate molecules with novel structures and desired physicochemical properties. However, the evaluation of these models, particularly in a biological context, remains insufficient. To address the limitations of existing metrics and emulate practical application scenarios, we construct the RediscMol benchmark that comprises active molecules extracted from 5 kinase and 3 GPCR data sets.
Highly effective de novo design is a grand challenge of computer-aided drug discovery. Practical structure-specific three-dimensional molecule generation methods have started to emerge in recent years, but most approaches treat the target structure as a conditional input to bias the molecule generation and do not fully learn the detailed atomic interactions that govern the molecular conformation and stability of the binding complexes. The omission of these fine details leaves many models struggling to output reasonable molecules for a variety of therapeutic targets.
Supramolecular chemistry offers new insights in bioimaging, but specific tracking of enzymes in living cells via a supramolecular host-guest reporter pair remains challenging, largely due to the interference of the complex cellular environment with the binding between analytes and hosts. Here, by exploiting the principle of the supramolecular tandem assay (STA) and the classic host-guest reporter pair (p-sulfonatocalix[4]arene (SC4A) and lucigenin (LCG)), and by rationally designing an artificial peptide library to screen for sequences with high affinity for the target enzyme, we developed a "turn-on" fluorescent sensing system for intracellular imaging of histone deacetylase 1 (HDAC1), which is a potential therapeutic target for various diseases, including cancer, neurological, and cardiovascular diseases. Based on computational simulations and experimental validations, we verified that the peptide deacetylated by HDAC1 competed with LCG, freeing it from SC4A and causing a fluorescence increase.
J Cheminform
December 2022
Deep learning (DL) and machine learning have contributed significantly to basic biology research and drug discovery over the past few decades. Recent advances in DL-based generative models have led to superior developments in de novo drug design. However, data availability, deep data processing, and the lack of user-friendly DL tools and interfaces make it difficult to apply these DL techniques to drug design.
Proteolysis targeting chimeras (PROTACs), which harness the ubiquitin-proteasome system to selectively induce targeted protein degradation, represent an emerging therapeutic technology with the potential to modulate traditionally undruggable targets. Over the past few years, this technology has moved from academia to industry and more than 10 PROTACs have been advanced into clinical trials. However, designing potent PROTACs with desirable drug-like properties remains a great challenge.
DNA methyltransferase 3A (DNMT3A) has been regarded as a potential epigenetic target for the development of cancer therapeutics. A number of DNMT3A inhibitors have been reported, but most of them lack good potency, high selectivity and/or low cytotoxicity. It has been suggested that a non-conserved region around the target recognition domain (TRD) loop is implicated in the DNMT3A activity under the allosteric regulation of the ATRX-DNMT3-DNMT3L (ADD) domain, but the molecular mechanism by which the TRD loop regulates the DNMT3A activity needs to be elucidated.
Deep learning (DL)-based de novo molecular design has recently gained considerable traction. Many DL-based generative models have been successfully developed to design novel molecules, but most of them are ligand-centric and the role of the 3D geometries of target binding pockets in molecular generation has not been well-exploited. Here, we proposed a new 3D-based generative model called RELATION.
De novo drug design is the process of generating novel lead compounds with desirable pharmacological and physicochemical properties. The application of deep learning (DL) in de novo drug design has become a hot topic, and many DL-based approaches have been developed for molecular generation tasks. Generally, these approaches fall under four frameworks: recurrent neural networks; encoder-decoder; reinforcement learning; and generative adversarial networks.
Proteolysis-targeting chimeras (PROTACs), which selectively induce targeted protein degradation, represent an emerging drug discovery technology. Although numerous PROTACs have been reported, designing potent PROTACs remains a great challenge, to some extent due to insufficient structural data on Target-PROTAC-E3 ternary complexes. In this work, PROTAC-Model, an integrative computational method combining the FRODOCK-based protocol and RosettaDock-based refinement, was developed to predict PROTAC-mediated ternary complex structures and tested on 14 cases.
The molecular mechanics/generalized Born surface area (MM/GBSA) approach has been widely used in end-point binding free energy prediction in structure-based drug design (SBDD). However, in practice, it is often regarded as a disputed method, mostly because of its system dependence. Here, in combination with machine-learning optimization, we developed a novel version of MM/GBSA, named variable atomic dielectric MM/GBSA (VAD-MM/GBSA), by assigning variable dielectric constants directly to the protein/ligand atoms.
Virtual screening (VS) based on molecular docking has emerged as one of the mainstream technologies of drug discovery due to its low cost and high efficiency. However, the scoring functions (SFs) implemented in most docking programs are not always accurate enough, and improving their prediction accuracy is still a big challenge. Here, we propose an integrated platform called ASFP, a web server for the development of customized SFs for structure-based VS.
Machine-learning (ML)-based scoring functions (MLSFs) have gradually emerged as a promising alternative for protein-ligand binding affinity prediction and structure-based virtual screening. However, doubts remain about the benefits of this novel type of scoring function (SF). In this study, to benchmark the performance of target-specific MLSFs on a relatively unbiased dataset, the MLSFs trained from three representative protein-ligand interaction representations were assessed on the LIT-PCBA dataset, and the classical Glide SP SF and three types of ligand-based quantitative structure-activity relationship (QSAR) models were also utilized for comparison.
Nucleic Acids Res
January 2021
Inhibitors that form covalent bonds with their targets have traditionally been considered highly adventurous due to their potential off-target effects and toxicity concerns. However, with the clinical validation and approval of many covalent inhibitors during the past decade, the design and discovery of novel covalent inhibitors have attracted increasing attention. A large amount of scattered experimental data for covalent inhibitors has been reported, but a resource integrating the experimental information for covalent inhibitor discovery is still lacking.
Proteolysis-targeting chimeras (PROTACs), which selectively degrade targeted proteins by the ubiquitin-proteasome system, have emerged as a novel therapeutic technology with potential advantages over traditional inhibition strategies. In the past few years, this technology has achieved substantial progress and two PROTACs have been advanced into phase I clinical trials. However, this technology is still maturing and the design of PROTACs remains a great challenge.
J Chem Theory Comput
June 2020
A large number of protein-protein interactions (PPIs) are mediated by the interactions between proteins and the peptide segments of their binding partners, and therefore determination of protein-peptide interactions (PpIs) is crucial to elucidate important biological processes and design peptides or peptidomimetic drugs that can modulate PPIs. As a powerful computational tool, molecular docking has been widely utilized to predict the binding structures of protein-peptide complexes. However, although a number of docking programs are available, a systematic assessment of their performance for PpIs has not been reported.
In structure-based drug design (SBDD), the molecular mechanics generalized Born surface area (MM/GBSA) approach has been widely used in ranking the binding affinity of small molecule ligands. However, an accurate estimation of protein-ligand binding affinity still remains a challenge due to the intrinsic limitation of the standard generalized Born (GB) model used in MM/GBSA. In this study, we proposed and evaluated the MM/GBSA approach based on a variable dielectric generalized Born (VDGB) model using residue-type-based dielectric constants.
Background: The aberrant expression of HER2 is highly associated with tumour occurrence and metastasis; therefore, HER2 is extensively targeted for tumour immunotherapy. For example, trastuzumab and pertuzumab are FDA-approved monoclonal antibodies that target HER2-positive tumour cells. Despite their advances in clinical applications, emerging resistance to these two HER2-targeting antibodies has hindered their further application.
Enhanced sampling has been extensively used to capture the conformational transitions in protein folding, but it has attracted much less attention in studies of protein-protein recognition. In this study, we evaluated the impact of enhanced sampling methods and solute dielectric constants on the overall accuracy of the molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA) and molecular mechanics/generalized Born surface area (MM/GBSA) approaches for protein-protein binding free energy calculations. Here, two widely used enhanced sampling methods, aMD and GaMD, and conventional molecular dynamics (cMD) simulations with two AMBER force fields (ff03 and ff14SB) were used to sample the conformations for 21 protein-protein complexes.
Protein-protein interactions (PPIs) play an important role in the different functions of cells, but accurate prediction of the three-dimensional structures for PPIs is still a notoriously difficult task. In this study, HawkDock, a free and open-access web server, was developed to predict and analyze the structures of PPIs. In the HawkDock server, the ATTRACT docking algorithm, the HawkRank scoring function developed in our group and the MM/GBSA free energy decomposition analysis were seamlessly integrated into a multi-functional platform.
A significant number of protein-protein interactions (PPIs) are mediated through the interactions between proteins and peptide segments, and therefore determination of protein-peptide interactions (PpIs) is critical to gain an in-depth understanding of the PPI network and even design peptides or small molecules capable of modulating PPIs. Computational approaches, especially molecular docking, provide an efficient way to model PpIs, and a reliable scoring function that can recognize the correct binding conformations for protein-peptide complexes is one of the most important components in protein-peptide docking. The end-point binding free energy calculation methods, such as MM/GBSA and MM/PBSA, are theoretically more rigorous than most empirical and semi-empirical scoring functions designed for protein-peptide docking, but their performance in predicting binding affinities and binding poses for protein-peptide systems has not been systematically assessed.