Selecting a diverse set of solvents to be included in polymorph screening assignments can be a challenging task. As an aid to decision making, a database of 218 organic solvents with 24 property descriptors was explored and visualized using multivariate tools. The descriptors included, among others, log P, vapor pressure, hydrogen bond formation capabilities, polarity, number of pi-bonds and descriptors derived from molecular interaction field calculations (e.g., size/shape parameters and hydrophilic/hydrophobic regions). The data matrix was initially analyzed using principal component analysis (PCA). The PCA showed that the first two principal components (PCs) together explained 57% of the variance, although the third, fourth and fifth components also carried relevant information, revealing distinct clusters of solvents. Since five dimensions were not suitable for visual presentation, a nonlinear method, self-organizing maps (SOMs), was applied to the dataset. The constructed SOM displayed features of the clusters observed in the first three PCs, but in a more compelling way. Thus, the SOM was chosen as the visually most convenient way to display the diversity of the 218 solvents. In addition, it was demonstrated how safety aspects can be considered by labeling a large fraction of the solvents in the SOM with toxicological information.
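To illustrate how a self-organizing map places similar items in nearby grid cells, the following is a minimal sketch in plain Python. It is not the authors' implementation: the grid size, learning rate, neighborhood schedule and all function names are assumptions chosen for illustration, and the input is synthetic data (two artificial groups with three made-up descriptors), not the 218-solvent database.

```python
import math
import random


def standardize(rows):
    """Scale each descriptor column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def best_matching_unit(weights, vec):
    """Return the (row, col) cell whose weight vector is closest to vec."""
    return min(weights,
               key=lambda pos: sum((w - v) ** 2 for w, v in zip(weights[pos], vec)))


def train_som(data, n_rows=4, n_cols=4, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM (all hyperparameters are illustrative)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.uniform(-1, 1) for _ in range(dim)]
               for r in range(n_rows) for c in range(n_cols)}
    sigma0 = max(n_rows, n_cols) / 2
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1 - frac), 0.5)  # shrinking neighborhood radius
        for vec in rng.sample(data, len(data)):  # visit samples in random order
            br, bc = best_matching_unit(weights, vec)
            for (r, c), w in weights.items():
                # Gaussian neighborhood: cells near the winner move the most.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (vec[i] - w[i])
    return weights


# Synthetic stand-in data: two artificial groups, three hypothetical descriptors.
rng = random.Random(1)
group_a = [[rng.gauss(0, 0.3) for _ in range(3)] for _ in range(10)]
group_b = [[rng.gauss(5, 0.3) for _ in range(3)] for _ in range(10)]
data = standardize(group_a + group_b)

som = train_som(data)
cells = [best_matching_unit(som, v) for v in data]  # map cell for each sample
```

Each entry of `cells` is the grid position assigned to one sample; labeling those positions with sample names (or, as in the study, toxicological information) would give a map-style display. With real descriptor data, the synthetic `group_a` and `group_b` rows would be replaced by one row per solvent.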
Source: http://dx.doi.org/10.1002/jps.21153