Data mining methods based on machine learning play an increasingly important role in drug design and discovery. In the current work, eight machine learning methods including decision trees, k-Nearest neighbor, support vector machines, random forests, extremely randomized trees, AdaBoost, gradient boosting trees, and XGBoost were evaluated comprehensively through a case study of ACC inhibitor data sets. Internal and external data sets were employed for cross-validation of the eight machine learning methods. Results showed that the extremely randomized trees model performed best and was adopted as the first step of virtual screening. Together with structure-based virtual screening in the second step, this combined strategy obtained desirable results. This work indicates that the combination of machine learning methods with traditional structure-based virtual screening can effectively strengthen the ability in finding potential hits from large compound database for a given target.

Download full-text PDF

Source
http://dx.doi.org/10.1111/cbdd.13494DOI Listing

Publication Analysis

Top Keywords

machine learning
20
learning methods
12
virtual screening
12
based machine
8
extremely randomized
8
randomized trees
8
data sets
8
structure-based virtual
8
machine
5
learning
5

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!