Water solubility is an important characteristic of a chemical in many respects. However, experimental determination of this endpoint for all substances is impossible. In this study, quantitative structure-property relationships (QSPRs) for the negative logarithm of water solubility, -logS (mol L⁻¹), are built for five random splits of the data into a sub-training set (≈55%), a calibration set (≈25%), and a test set (≈20%). The simplified molecular input-line entry system (SMILES) is used to represent the molecular structure. Optimal SMILES-based descriptors are calculated by the Monte Carlo method using the CORAL software (http://www.insilico.eu/coral). Across the five splits, these one-variable models for water solubility are characterized by the following statistics (set sizes are given as ranges, the remaining values as averages): n(sub-training) = 725-763; n(calibration) = 312-343; n(test) = 231-261; r²(sub-training) = 0.9211 ± 0.0028; r²(calibration) = 0.9555 ± 0.0045; r²(test) = 0.9365 ± 0.0073; s(sub-training) = 0.561 ± 0.0086; s(calibration) = 0.453 ± 0.0209; s(test) = 0.520 ± 0.0205. Thus, the reproducibility of the statistical quality of the suggested models for water solubility is confirmed for five different splits.
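The splitting and scoring scheme described in the abstract can be illustrated with a minimal Python sketch. It does not reproduce CORAL or its Monte Carlo descriptor optimization; it only shows a random three-way split and how r² and s might be computed for a one-variable linear model. The function names (`random_split`, `fit_line`, `r2_and_s`) are hypothetical. Two further assumptions are made here: r² is the squared Pearson correlation between observed and predicted values, and s is the standard error of estimation with n - 2 degrees of freedom.

```python
import math
import random


def random_split(items, fractions=(0.55, 0.25, 0.20), seed=0):
    """Shuffle items and cut them into sub-training, calibration and test sets.

    The fractions mirror the approximate proportions quoted in the abstract;
    the test set receives whatever remains after the first two cuts.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_sub = round(fractions[0] * n)
    n_cal = round(fractions[1] * n)
    return (shuffled[:n_sub],
            shuffled[n_sub:n_sub + n_cal],
            shuffled[n_sub + n_cal:])


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_s(x, y, slope, intercept):
    """Return (r2, s) for the given line evaluated on the points (x, y).

    r2: squared Pearson correlation of observed vs. predicted (assumed definition).
    s:  sqrt(sum of squared residuals / (n - 2))       (assumed definition).
    """
    pred = [intercept + slope * xi for xi in x]
    n = len(y)
    mean_y = sum(y) / n
    mean_p = sum(pred) / n
    cov = sum((yi - mean_y) * (pi - mean_p) for yi, pi in zip(y, pred))
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    var_p = sum((pi - mean_p) ** 2 for pi in pred)
    r2 = cov ** 2 / (var_y * var_p)
    s = math.sqrt(sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / (n - 2))
    return r2, s
```

In use, the line would be fitted on the sub-training set only and then scored separately on the calibration and test sets; repeating this with five different `seed` values gives the kind of split-to-split spread that the abstract reports.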
DOI: http://dx.doi.org/10.1016/j.chemosphere.2012.07.035