Decision tree-based identification of important molecular fragments for protein-ligand binding.

Chem Biol Drug Des

Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, China.

Published: January 2024

Fragment-based drug design is an emerging technology in pharmaceutical research and development, and a key aspect of it is the identification and quantitative characterization of molecular fragments. This study presents a strategy for identifying important molecular fragments based on molecular fingerprints and decision tree algorithms and verifies its feasibility in predicting protein-ligand binding affinity. Specifically, the three-dimensional (3D) structures of protein-ligand complexes are encoded using extended-connectivity fingerprints (ECFP), and three decision tree-based models, namely Random Forest, XGBoost, and LightGBM, are used to quantify feature importance, thereby extracting important molecular fragments with high reliability. Few-shot learning reveals that the extracted molecular fragments contribute significantly and consistently to the binding affinity even with a small sample size. Although ECFP carries no location or distance information for molecular fragments, 3D visualization, in combination with the reverse ECFP process, shows that the majority of the extracted fragments are located at the binding interface of the protein and the ligand. This agreement with the distance constraints critical for binding affinity further supports the reliability of the strategy for identifying important molecular fragments.
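The idea of ranking fingerprint bits by how much they explain binding affinity can be illustrated with a minimal sketch. It is not the authors' pipeline: it does not compute ECFP or train Random Forest, XGBoost, or LightGBM. Instead it scores each bit with the variance reduction of a single split, which is the criterion regression trees typically use when choosing a split. The fingerprints, affinity values, and function names (`variance_reduction`, `rank_bits`) are all hypothetical and chosen only for illustration.

```python
from statistics import pvariance


def variance_reduction(bit_column, y):
    """Drop in variance of y when samples are split on one fingerprint bit."""
    off = [t for b, t in zip(bit_column, y) if b == 0]
    on = [t for b, t in zip(bit_column, y) if b == 1]
    if not off or not on:
        return 0.0  # a constant bit cannot split the samples
    n = len(y)
    child_var = (len(off) * pvariance(off) + len(on) * pvariance(on)) / n
    return pvariance(y) - child_var


def rank_bits(fingerprints, y):
    """Return bit indices sorted by decreasing score, plus the raw scores."""
    n_bits = len(fingerprints[0])
    scores = [
        variance_reduction([fp[j] for fp in fingerprints], y)
        for j in range(n_bits)
    ]
    order = sorted(range(n_bits), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (hypothetical): 6 complexes, 4-bit fingerprints, made-up affinities.
# Bit 0 is "on" only in the high-affinity complexes; bit 3 is never set.
toy_fps = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
]
toy_affinity = [8.1, 7.9, 5.2, 5.0, 8.0, 5.1]

order, scores = rank_bits(toy_fps, toy_affinity)
print("bits ranked by importance:", order)
```

In a real workflow each bit index would be mapped back to the substructure that set it (the reverse ECFP step described above), and the importance would come from a trained tree ensemble rather than a single split.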

Source
http://dx.doi.org/10.1111/cbdd.14427
