Similarity, like beauty, is an intuitive concept based on personal perception and bias. In the realm of molecular similarity, each method is user-defined, based on the features deemed important. A method's efficacy depends on the set of descriptors used to define the intermolecular similarity of chemicals and on the mathematical function used to quantify similarity. Quantitative molecular similarity analysis (QMSA) methods, based on experimental data or computed molecular descriptors, have emerged as powerful tools for analog selection and property estimation. We have carried out a comparative study of similarity spaces derived from atom pairs and a large set of topological indices for two diverse sets of chemicals: (a) a set of 469 chemicals with vapor pressure data from the TSCA inventory, and (b) a set of 213 chemicals with lipophilicity data from the STARLIST inventory. These spaces were used for the k-nearest-neighbor (KNN) estimation of properties (K = 1-10, 15, 20, 25). The results for the QMSA models developed in this paper are also compared with model estimates derived from hierarchical QSARs.
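To make the KNN-based estimation concrete, the following is a minimal sketch of how a property could be estimated from the K most similar chemicals in a descriptor space. It is an illustration under stated assumptions, not the procedure used in the paper: Euclidean distance as the dissimilarity measure, an unweighted mean of the neighbors' property values, and the function names `knn_estimate` and `leave_one_out` are all hypothetical choices. The abstract does not specify the similarity function or the validation scheme.

```python
import math

# K values listed in the abstract: 1-10, 15, 20, 25.
K_VALUES = list(range(1, 11)) + [15, 20, 25]


def knn_estimate(query, descriptors, values, k):
    """Estimate a property for `query` as the mean over its k nearest neighbors.

    Assumption: Euclidean distance in descriptor space stands in for the
    (unspecified) similarity function; the mean is unweighted.
    """
    # Rank the reference chemicals by distance to the query.
    ranked = sorted(
        range(len(descriptors)),
        key=lambda i: math.dist(query, descriptors[i]),
    )
    nearest = ranked[:k]
    return sum(values[i] for i in nearest) / len(nearest)


def leave_one_out(descriptors, values, k):
    """Estimate each chemical's property from the remaining chemicals.

    Assumption: leave-one-out is used here only as a simple way to
    exercise the estimator on a data set with known property values.
    """
    estimates = []
    for i in range(len(descriptors)):
        rest_d = descriptors[:i] + descriptors[i + 1:]
        rest_v = values[:i] + values[i + 1:]
        estimates.append(knn_estimate(descriptors[i], rest_d, rest_v, k))
    return estimates


if __name__ == "__main__":
    # Toy one-dimensional descriptor space with made-up property values.
    toy_descriptors = [[0.0], [1.0], [2.0], [10.0]]
    toy_values = [1.0, 2.0, 3.0, 10.0]
    print(knn_estimate([0.9], toy_descriptors, toy_values, k=2))
```

In this sketch, comparing estimates across `K_VALUES` would show how the choice of K trades off locality against averaging; a different similarity measure (for example, one defined on atom-pair counts) could be swapped in by replacing the `math.dist` call.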
DOI: http://dx.doi.org/10.1016/s1093-3263(01)00104-8