Structure-based drug design depends on detailed knowledge of the three-dimensional (3D) structures of protein-ligand binding complexes, but accurate prediction of ligand-binding poses remains a major challenge for molecular docking, owing to the deficiencies of scoring functions (SFs) and the neglect of protein flexibility upon ligand binding. In this study, based on a cross-docking dataset purpose-built from the PDBbind database, we developed several XGBoost-trained classifiers to discriminate near-native binding poses from decoys, and systematically assessed their performance with and without cross-docked poses in the training and test sets. The results show that combining Extended Connectivity Interaction Features (ECIF), Vina energy terms and docking pose ranks as features achieves the best performance, as validated by random splitting or refined-core splitting and by testing on re-docked or cross-docked poses. In addition, although performance decreases significantly under threefold clustered cross-validation, including the Vina energy terms effectively safeguards the lower bound of model performance and thus improves generalization capability. Furthermore, our results highlight the importance of incorporating cross-docked poses into the training of SFs that are intended to have a wide application domain and high robustness for binding pose prediction. The source code and the newly developed cross-docking datasets are freely available at https://github.com/sc8668/ml_pose_prediction and https://zenodo.org/record/5525936 , respectively, under an open-source license. We believe that our study may provide valuable guidance for the development and assessment of new machine learning-based SFs (MLSFs) for the prediction of protein-ligand binding poses.
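The evaluation ideas in the abstract (labelling poses as near-native or decoy, checking whether the top-scored pose is near-native, and keeping whole clusters together in cross-validation) can be illustrated with a minimal Python sketch. This is not the authors' code: the 2.0 angstrom RMSD cutoff, the function names, the dictionary layout and the toy scores are all assumptions made for illustration, and the scores stand in for whatever a trained classifier would output.

```python
from collections import defaultdict

# Assumed cutoff (in angstroms) for calling a pose "near-native";
# the abstract does not state the threshold that was used.
RMSD_CUTOFF = 2.0


def label_poses(poses, cutoff=RMSD_CUTOFF):
    """Return 1 for near-native poses (RMSD <= cutoff) and 0 for decoys."""
    return [1 if p["rmsd"] <= cutoff else 0 for p in poses]


def top1_success_rate(poses, scores, cutoff=RMSD_CUTOFF):
    """Fraction of complexes whose highest-scored pose is near-native."""
    best = {}  # complex id -> (score, rmsd) of the best-scored pose so far
    for pose, score in zip(poses, scores):
        cid = pose["complex"]
        if cid not in best or score > best[cid][0]:
            best[cid] = (score, pose["rmsd"])
    hits = sum(1 for _, rmsd in best.values() if rmsd <= cutoff)
    return hits / len(best)


def clustered_folds(complex_to_cluster, n_folds=3):
    """Assign whole clusters to folds so no cluster is split across folds."""
    clusters = defaultdict(list)
    for cid, cluster in complex_to_cluster.items():
        clusters[cluster].append(cid)
    folds = [[] for _ in range(n_folds)]
    # Place the largest clusters first, each into the currently smallest fold.
    for members in sorted(clusters.values(), key=len, reverse=True):
        min(folds, key=len).extend(members)
    return folds


# Toy data: two hypothetical complexes with two docked poses each.
poses = [
    {"complex": "A", "rmsd": 0.8},
    {"complex": "A", "rmsd": 5.1},
    {"complex": "B", "rmsd": 3.4},
    {"complex": "B", "rmsd": 1.5},
]
scores = [0.9, 0.2, 0.7, 0.4]  # hypothetical classifier outputs

print(label_poses(poses))
print(top1_success_rate(poses, scores))
print(clustered_folds({"A": "c1", "B": "c1", "C": "c2", "D": "c3"}))
```

In this toy case the pose ranked first for complex B is a decoy, so only one of the two complexes counts as a success. Assigning folds by cluster rather than by individual complex is what makes a clustered split harder than a random one, since similar complexes cannot appear in both the training and the test fold.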
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8520186 | PMC |
| http://dx.doi.org/10.1186/s13321-021-00560-w | DOI Listing |
Sci Rep
May 2023
Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology (JUIT), Waknaghat, Solan, Himachal Pradesh, 173234, India.
In recent years, the outbreak of infectious disease caused by Zika virus (ZIKV) has posed a major threat to global public health, calling for the development of therapeutics to treat ZIKV disease. Several possible druggable targets involved in virus replication have been identified. In search of additional potential inhibitors, we screened 2895 FDA-approved compounds against Non-Structural Protein 5 (NS5) using in silico virtual screening methods.
J Med Chem
August 2022
Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.
The past few years have witnessed enormous progress toward applying machine learning approaches to the development of protein-ligand scoring functions. However, the robust performance and wide applicability of scoring functions remain a big challenge for increasing the success rate of docking-based virtual screening. Herein, a novel scoring function named RTMScore was developed by introducing a tailored residue-based graph representation strategy and several graph transformer layers for the learning of protein and ligand representations, followed by a mixture density network to obtain residue-atom distance likelihood potential.
J Cheminform
October 2021
Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, People's Republic of China.
J Chem Inf Model
September 2020
Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States.
One of the main challenges in drug discovery is predicting protein-ligand binding affinity. Recently, machine learning approaches have made substantial progress on this task. However, current methods of model evaluation are overly optimistic in measuring generalization to new targets, and there does not exist a standard data set of sufficient size to compare performance between models.
Comput Biol Chem
August 2017
Department of Pharmaceutical Chemistry, J.S.S. College of Pharmacy, Udhagamandalam, 643001, Tamil Nadu, India.
The binding modes of well-known MurD inhibitors have been studied using molecular docking and molecular dynamics (MD) simulations. The docking results for inhibitors 1-30 revealed a similar mode of interaction with Escherichia coli MurD. Further, residues Thr36, Arg37, His183, Lys319, Lys348, Thr321, Ser415 and Phe422 are found to be important for inhibitors and E.