The modeling of nonlinear descriptor-target relationships is a topic of considerable interest in drug discovery. Here we continue our report on the use of the self-organizing map, a nonlinear, topology-preserving pattern recognition technique that shows considerable promise in modeling and decoding these relationships. Because simulated annealing is an efficient tool for solving optimization problems, we combined the supervised self-organizing map with simulated annealing to build high-quality, highly predictive quantitative structure-activity/property relationship models. This technique was applied to six data sets representing a variety of biological endpoints. Because a high statistical correlation in the training set does not guarantee a highly predictive model, the quality of every model was confirmed by withholding a portion of each data set for external validation. Finally, we introduce new cross-validation and dynamic partitioning techniques to address model overfitting and model assessment.
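The two generic ingredients named in the abstract, simulated annealing as an optimizer and a withheld external validation set, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the supervised self-organizing map is not reproduced here, and the function names (`split_external`, `anneal_subset`), the geometric cooling schedule, the single-flip move, and the `cost` callback are all assumptions made for illustration. In a real workflow `cost` might return, for example, a cross-validated error of some model trained only on the training compounds.

```python
import math
import random


def split_external(indices, frac=0.25, seed=0):
    """Withhold a fraction of the samples as an external validation set.

    Hypothetical helper: returns (training, external) index lists.
    """
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_ext = max(1, int(round(frac * len(shuffled))))
    return shuffled[n_ext:], shuffled[:n_ext]


def anneal_subset(n_descriptors, cost, steps=2000, t_start=1.0, t_end=1e-3, seed=0):
    """Simulated annealing over binary descriptor masks; lower cost is better.

    `cost` is a caller-supplied placeholder that scores a mask (list of bools).
    """
    rng = random.Random(seed)
    mask = [rng.random() < 0.5 for _ in range(n_descriptors)]
    current = cost(mask)
    best_mask, best = mask[:], current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        candidate = mask[:]
        j = rng.randrange(n_descriptors)
        candidate[j] = not candidate[j]  # flip one descriptor in or out
        c = cost(candidate)
        # Metropolis criterion: always accept improvements,
        # accept worse moves with probability exp(-(c - current) / t).
        if c <= current or rng.random() < math.exp((current - c) / t):
            mask, current = candidate, c
            if current < best:
                best_mask, best = mask[:], current
    return best_mask, best
```

The external indices returned by `split_external` are meant to stay untouched during annealing and to be scored only once, after the descriptor subset has been fixed, which mirrors the abstract's point that training-set correlation alone does not establish predictive quality.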
DOI: http://dx.doi.org/10.1021/ci0500841