The goal of this manuscript is to discuss important aspects of external validation of classification and category Quantitative Structure-Activity/Property/Toxicity Relationship (QSAR/QSPR/QSTR) models that, to the best of the authors' knowledge, have not been addressed in the literature. The statistical significance (in terms of p-value) and the accuracy of prediction (in terms of Correct Classification Rate (CCR)) for external validation set compounds are among the most important characteristics of the models. We assert that in most cases the models built for a classification or category response variable should be statistically significant and predictive for each class or category.
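As a rough illustration of checking each class separately, the sketch below computes CCR as the mean of the per-class accuracies and a one-sided binomial p-value for one class. The abstract does not say how the p-value is obtained, so the binomial test, the chance rate of 0.5, and the function names `per_class_counts`, `ccr`, and `binomial_p_value` are assumptions made for this example only.

```python
from math import comb


def per_class_counts(y_true, y_pred):
    """Return {class: (n_correct, n_total)} over the external set."""
    counts = {}
    for t, p in zip(y_true, y_pred):
        correct, total = counts.get(t, (0, 0))
        counts[t] = (correct + (t == p), total + 1)
    return counts


def ccr(y_true, y_pred):
    """Correct Classification Rate: mean of the per-class accuracies."""
    counts = per_class_counts(y_true, y_pred)
    return sum(c / n for c, n in counts.values()) / len(counts)


def binomial_p_value(n_correct, n_total, p_chance):
    """One-sided P(X >= n_correct) for X ~ Binomial(n_total, p_chance).

    Assumption: random assignment with probability p_chance is the null model.
    """
    return sum(
        comb(n_total, k) * p_chance**k * (1 - p_chance) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


# Hypothetical external set: 4 actives (1) and 6 inactives (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

print("CCR:", round(ccr(y_true, y_pred), 4))
for cls, (correct, total) in sorted(per_class_counts(y_true, y_pred).items()):
    p = binomial_p_value(correct, total, 0.5)
    print(f"class {cls}: {correct}/{total} correct, p = {p:.4f}")
```

In this toy case the overall CCR looks acceptable, yet neither class reaches significance on its own, which is the kind of situation a per-class check is meant to expose.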
Multiple approaches to quantitative structure-activity relationship (QSAR) modeling using various statistical or machine learning techniques and different types of chemical descriptors have been developed over the years. Oftentimes models are used in consensus to make more accurate predictions at the expense of model interpretation. We propose a simple, fast, and reliable method termed Multi-Descriptor Read Across (MuDRA) for developing both accurate and interpretable models.
The 5-hydroxytryptamine 1A (5-HT1A) serotonin receptor has been an attractive target for treating mood and anxiety disorders such as schizophrenia. We have developed binary classification quantitative structure-activity relationship (QSAR) models of 5-HT1A receptor binding activity using data retrieved from the PDSP Ki database. The prediction accuracy of these models was estimated by external 5-fold cross-validation as well as using an additional validation set comprising 66 structurally distinct compounds from the World of Molecular Bioactivity database.
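A minimal sketch of how an external 5-fold split could be set up is shown here, assuming that "external" means each fold is held out entirely from model building and used only for prediction. The abstract does not describe the splitting procedure, so the random shuffling, the seed, and the helper name `external_folds` are illustrative assumptions.

```python
import random


def external_folds(n_compounds, n_folds=5, seed=0):
    """Yield (modeling_indices, external_indices) for each fold.

    Assumption: compounds are shuffled once and dealt into n_folds groups;
    each group serves exactly once as the external set.
    """
    indices = list(range(n_compounds))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        external = folds[k]
        modeling = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield modeling, external


# Hypothetical data set of 23 compounds.
for modeling, external in external_folds(23):
    # A model would be built on `modeling` only and then applied to `external`.
    print(len(modeling), "modeling /", len(external), "external")
```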
We introduce a simple MODelability Index (MODI) that estimates the feasibility of obtaining predictive QSAR models (correct classification rate above 0.7) for a binary data set of bioactive compounds. MODI is defined as an activity class-weighted ratio of the number of nearest-neighbor pairs of compounds with the same activity class versus the total number of pairs.
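One way to read this definition is sketched below: for each class, take the fraction of its compounds whose first nearest neighbor belongs to the same class, then average those fractions over the classes. The Euclidean distance, the toy one-dimensional descriptors, and the function name `modi` are assumptions for illustration; the abstract does not specify the descriptor space or the distance metric.

```python
from math import dist


def modi(descriptors, labels):
    """Class-averaged fraction of compounds whose nearest neighbor shares their class."""
    classes = sorted(set(labels))
    same = {c: 0 for c in classes}
    total = {c: 0 for c in classes}
    for i, (x, c) in enumerate(zip(descriptors, labels)):
        # First nearest neighbor of compound i, excluding itself.
        nearest = min(
            (j for j in range(len(descriptors)) if j != i),
            key=lambda j: dist(x, descriptors[j]),
        )
        total[c] += 1
        same[c] += labels[nearest] == c
    return sum(same[c] / total[c] for c in classes) / len(classes)


# Hypothetical data: three inactives (0) and three actives (1).
descriptors = [(0.0,), (1.0,), (2.5,), (3.0,), (6.0,), (7.0,)]
labels = [0, 0, 0, 1, 1, 1]
print(round(modi(descriptors, labels), 4))
```

The two compounds at 2.5 and 3.0 are each other's nearest neighbors but belong to different classes, so each class scores 2/3 and the index is 2/3.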
Traditional read-across approaches typically rely on the chemical similarity principle to predict chemical toxicity; however, the accuracy of such predictions is often inadequate due to the underlying complex mechanisms of toxicity. Here, we report on the development of a hazard classification and visualization method that draws upon both chemical structural similarity and comparisons of biological responses to chemicals measured in multiple short-term assays ("biological" similarity). The Chemical-Biological Read-Across (CBRA) approach infers each compound's toxicity from both chemical and biological analogues whose similarities are determined by the Tanimoto coefficient.
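The Tanimoto coefficient mentioned here can be sketched in its general vector form, T(a, b) = a·b / (|a|² + |b|² − a·b), which reduces to the familiar shared-bits ratio for binary fingerprints. The second function shows one plausible way to pool chemical and biological analogues through a similarity-weighted average. That pooling rule, the neighbor lists, and both function names are assumptions for illustration and are not taken from the abstract.

```python
def tanimoto(a, b):
    """Tanimoto coefficient for two equal-length numeric vectors."""
    ab = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - ab
    return ab / denom if denom else 0.0


def similarity_weighted_score(chem_neighbors, bio_neighbors):
    """Pool (similarity, activity) pairs from both neighbor lists.

    Assumption: the prediction is the similarity-weighted mean activity
    of all chemical and biological analogues taken together.
    """
    pooled = chem_neighbors + bio_neighbors
    weight = sum(s for s, _ in pooled)
    return sum(s * a for s, a in pooled) / weight if weight else None


# Hypothetical binary fingerprints: 2 shared bits out of 3 set in either.
print(round(tanimoto([1, 1, 0, 1], [1, 0, 0, 1]), 4))

# Hypothetical analogues as (similarity, activity) with activity 1 = toxic.
chem = [(0.8, 1), (0.6, 0)]
bio = [(0.9, 1)]
print(round(similarity_weighted_score(chem, bio), 4))
```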