Modeling aqueous solubility.

J Chem Inf Comput Sci

Computational Chemistry and Chemoinformatics, ArQule (UK) Limited, Science Park, Cambridge, UK.

Published: October 2003

This paper describes the development of an aqueous solubility model based on solubility data from the Syracuse database, the calculated octanol-water partition coefficient, and 51 2D molecular descriptors. Two statistical packages, SIMCA and Cubist, were used and their results compared. The Cubist model, which comprises a collection of rules, each with an associated multiple linear regression (MLR) model, gave better overall results on a test set of 640 compounds, with an overall squared correlation coefficient of 0.74 and an average absolute error of 0.68 log units. The training and independent test sets had similar distributions of structures in terms of the functionalities present: 60% neutral, 14% acidic, 8% phenolic, 11% monobasic, 4% polybasic, and 3% zwitterionic molecules. The sets were formed by random selection, with 2688 molecules (81%) in the training set and 640 molecules (19%) in the test set.
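
The random split and the two reported statistics can be illustrated with a minimal sketch. This is not the authors' code: the function names `random_split`, `squared_correlation`, and `mean_absolute_error` and the fixed seed are assumptions made for illustration, and only the set sizes (2688 and 640) come from the abstract. The squared correlation coefficient is taken here to be the squared Pearson correlation between observed and predicted values, which is one common reading of the term.

```python
import random


def random_split(n_total, n_test, seed=0):
    """Randomly partition indices 0..n_total-1 into (train, test) lists."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    return indices[n_test:], indices[:n_test]


def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def mean_absolute_error(observed, predicted):
    """Average absolute difference, in the units of the inputs (log units here)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Set sizes taken from the abstract: 2688 training and 640 test molecules.
train_idx, test_idx = random_split(2688 + 640, 640)
print(len(train_idx), len(test_idx))
```

In practice the two metric functions would be applied to the observed and predicted log solubilities of the 640 test compounds only, so that the reported figures reflect molecules the model did not see during training.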

Source
http://dx.doi.org/10.1021/ci020279y