The goal of this manuscript is to discuss important aspects of the external validation of classification and category Quantitative Structure-Activity/Property/Toxicity Relationship (QSAR/QSPR/QSTR) models that, to the best of the authors' knowledge, are not addressed in publications. Statistical significance (in terms of p-value) and accuracy of prediction (in terms of Correct Classification Rate (CCR)) for external validation set compounds are among the most important characteristics of the models. We assert that in most cases the models built for a classification or category response variable should be statistically significant and predictive for each class or category.
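As a rough illustration of the per-class view argued for above, the sketch below computes per-class accuracy, the CCR as the unweighted mean of those accuracies, and a per-class p-value. The abstract does not say how the p-value is obtained; the one-sided binomial test against a 0.5 chance rate used here is an assumption made for illustration, as are the function names and the toy data.

```python
from math import comb


def correct_classification_rate(y_true, y_pred):
    """Return (CCR, per-class accuracy); CCR is the mean of per-class accuracies."""
    per_class = {}
    for c in sorted(set(y_true)):
        members = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in members if y_pred[i] == c)
        per_class[c] = hits / len(members)
    return sum(per_class.values()) / len(per_class), per_class


def binomial_p_value(hits, n, chance=0.5):
    """One-sided P(X >= hits) for X ~ Binomial(n, chance).

    Assumption: the source does not specify the test; this is one common choice.
    """
    return sum(comb(n, k) * chance**k * (1 - chance) ** (n - k)
               for k in range(hits, n + 1))


# Toy external set (hypothetical): 4 actives (1) and 6 inactives (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

ccr, per_class = correct_classification_rate(y_true, y_pred)
p_values = {
    c: binomial_p_value(
        sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c),
        y_true.count(c),
    )
    for c in per_class
}
```

With so few compounds, a class can be predicted with 75% accuracy and still fail a significance test, which is why the two quantities are reported separately for each class.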
Multiple approaches to quantitative structure-activity relationship (QSAR) modeling using various statistical or machine learning techniques and different types of chemical descriptors have been developed over the years. Oftentimes models are used in consensus to make more accurate predictions at the expense of model interpretation. We propose a simple, fast, and reliable method termed Multi-Descriptor Read Across (MuDRA) for developing both accurate and interpretable models.
The 5-hydroxytryptamine 1A (5-HT1A) serotonin receptor has been an attractive target for treating mood and anxiety disorders such as schizophrenia. We have developed binary classification quantitative structure-activity relationship (QSAR) models of 5-HT1A receptor binding activity using data retrieved from the PDSP Ki database. The prediction accuracy of these models was estimated by external 5-fold cross-validation as well as using an additional validation set comprising 66 structurally distinct compounds from the World of Molecular Bioactivity database.
We introduce a simple MODelability Index (MODI) that estimates the feasibility of obtaining predictive QSAR models (correct classification rate above 0.7) for a binary data set of bioactive compounds. MODI is defined as an activity class-weighted ratio of the number of nearest-neighbor pairs of compounds with the same activity class versus the total number of pairs.
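One way to read the MODI definition is as the mean, over activity classes, of the fraction of compounds whose first nearest neighbor belongs to the same class. The sketch below follows that reading. The Euclidean distance, the function name `modi`, and the toy descriptor values are assumptions made for illustration and are not taken from the abstract.

```python
import math


def modi(descriptors, labels):
    """Class-averaged fraction of compounds whose nearest neighbor shares their class."""
    n = len(descriptors)
    same, total = {}, {}
    for i in range(n):
        # Nearest neighbor of compound i, excluding itself (Euclidean: an assumption).
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(descriptors[i], descriptors[j]),
        )
        total[labels[i]] = total.get(labels[i], 0) + 1
        if labels[nearest] == labels[i]:
            same[labels[i]] = same.get(labels[i], 0) + 1
    return sum(same.get(c, 0) / total[c] for c in total) / len(total)


# Hypothetical two-descriptor data set with one "activity cliff" pair.
X = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 1.0), (0.3, 0.1)]
y = [0, 0, 0, 1, 1, 1]
score = modi(X, y)
```

In this toy set the third and sixth compounds are each other's nearest neighbor but belong to different classes, so each class scores 2/3 and the index falls just below the 0.7 level mentioned in the abstract.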
Traditional read-across approaches typically rely on the chemical similarity principle to predict chemical toxicity; however, the accuracy of such predictions is often inadequate due to the underlying complex mechanisms of toxicity. Here, we report on the development of a hazard classification and visualization method that draws upon both chemical structural similarity and comparisons of biological responses to chemicals measured in multiple short-term assays ("biological" similarity). The Chemical-Biological Read-Across (CBRA) approach infers each compound's toxicity from both chemical and biological analogues whose similarities are determined by the Tanimoto coefficient.
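The Tanimoto coefficient on binary feature sets is the size of the intersection divided by the size of the union. The sketch below shows that coefficient and a simple similarity-weighted average over pooled analogues. The pooling rule, the function names, and the numbers are assumptions made for illustration; the abstract does not give the exact CBRA formula, and biological profiles are often continuous rather than binary.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_weighted_activity(analogues):
    """Weighted mean of analogue activities; `analogues` holds (similarity, activity) pairs.

    Assumption: chemical and biological analogues are simply pooled in one list.
    """
    weight = sum(s for s, _ in analogues)
    return sum(s * act for s, act in analogues) / weight


# Hypothetical fingerprints given as sets of "on" bit indices.
chem_sim = tanimoto({1, 2, 3}, {2, 3, 4})     # 2 shared bits out of 4 distinct bits
bio_sim = tanimoto({10, 11}, {10, 11, 12, 13})  # 2 shared out of 4

# One toxic chemical analogue (1) and one non-toxic biological analogue (0).
pred = similarity_weighted_activity([(chem_sim, 1), (0.25, 0)])
```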
Quantitative structure-activity relationship (QSAR) models have been developed for a data set of 3133 compounds defined as either active or inactive against P. falciparum. Because the data set was strongly biased toward inactive compounds, different sampling approaches were employed to balance the ratio of actives versus inactives, and models were rigorously validated using both internal and external validation approaches.
Prior to using a quantitative structure activity relationship (QSAR) model for external predictions, its predictive power should be established and validated. In the absence of a true external data set, the best way to validate the predictive ability of a model is to perform its statistical external validation. In statistical external validation, the overall data set is divided into training and test sets.
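The division into training and test sets can be sketched as a seeded random split. This is only the simplest option and is an assumption here, since the abstract does not say how the division is made; the function name and the 20% test fraction are likewise hypothetical.

```python
import random


def split_train_test(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of `items` and return (training set, test set)."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical compound identifiers 0..9.
train, test = split_train_test(range(10), test_fraction=0.2, seed=42)
```

The test set is kept out of model building entirely and is used only once the model is final, which is what makes the validation "external".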
Remote loading of liposomes by trans-membrane gradients is used to achieve therapeutically efficacious intra-liposome concentrations of drugs. We have developed Quantitative Structure Property Relationship (QSPR) models of remote liposome loading for a data set including 60 drugs studied in 366 loading experiments internally or elsewhere. Both experimental conditions and computed chemical descriptors were employed as independent variables to predict the initial drug/lipid ratio (D/L) required to achieve high loading efficiency.
Some antipsychotic drugs are known to cause valvular heart disease by activating serotonin 5-HT(2B) receptors. We have developed and validated binary classification QSAR models capable of predicting potential 5-HT(2B) actives. The classification accuracies of the models built to discriminate 5-HT(2B) actives from the inactives were as high as 80% for the external test set.
Background: Accurate prediction of in vivo toxicity from in vitro testing is a challenging problem. Large public-private consortia have been formed with the goal of improving chemical safety assessment by the means of high-throughput screening.
Objective: A wealth of available biological data requires new computational approaches to link chemical structure, in vitro data, and potential adverse health effects.
Purpose: Development of externally predictive Quantitative Structure-Activity Relationship (QSAR) models for Blood-Brain Barrier (BBB) permeability.
Methods: Combinatorial QSAR analysis was carried out for a set of 159 compounds with known BBB permeability data. All six possible combinations of three collections of descriptors derived from two-dimensional representations of molecules as chemical graphs and two QSAR methodologies have been explored.
The Quantitative Structure-Activity Relationship (QSAR) approach has been applied to model binding affinity and receptor subtype selectivity of human 5HT1E and 5HT1F receptor-ligands. The experimental data were obtained from the PDSP Ki Database. Several descriptor types and data-mining approaches have been used in the context of combinatorial QSAR modeling.
The use of inaccurate scoring functions in docking algorithms may result in the selection of compounds with high predicted binding affinity that nevertheless are known experimentally not to bind to the target receptor. Such falsely predicted binders have been termed 'binding decoys'. We posed a question as to whether true binders and decoys could be distinguished based only on their structural chemical descriptors using approaches commonly used in ligand based drug design.
Quantitative Structure Activity Relationship (QSAR) modeling has been traditionally applied as an evaluative approach, i.e., with the focus on developing retrospective and explanatory models of existing data.
A novel automated lazy learning quantitative structure-activity relationship (ALL-QSAR) modeling approach has been developed on the basis of the lazy learning theory. The activity of a test compound is predicted from a locally weighted linear regression model using chemical descriptors and the biological activity of the training set compounds most chemically similar to this test compound. The weights with which training set compounds are included in the regression depend on the similarity of those compounds to a test compound.
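Locally weighted linear regression can be sketched for a single descriptor as a weighted least-squares line fitted afresh for each query compound. This is a simplified illustration and not the ALL-QSAR implementation: the Gaussian kernel, the `bandwidth` parameter, the single-descriptor restriction, and the toy data are all assumptions.

```python
import math


def predict_locally_weighted(x_train, y_train, x_query, bandwidth=1.0):
    """Predict activity at x_query from a weighted least-squares line.

    Weights decay with distance from the query (Gaussian kernel: an assumption),
    so the most similar training compounds dominate the fit.
    """
    w = [math.exp(-(((x - x_query) / bandwidth) ** 2)) for x in x_train]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x_train)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y_train)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x_train))
    sxy = sum(wi * (xi - mx) * (yi - my)
              for wi, xi, yi in zip(w, x_train, y_train))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x_query - mx)


# Hypothetical training data lying exactly on activity = 2 * descriptor + 1.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [1.0, 3.0, 5.0, 7.0, 9.0]
pred = predict_locally_weighted(x_train, y_train, x_query=2.5)
```

No global model is stored. The fit is repeated for every test compound, which is the "lazy" part of lazy learning.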
This paper reports the synthesis of a novel series of (+/-)-2-dimethylamino-5- and 6-phenyl-1,2,3,4-tetrahydronaphthalene derivatives (5- and 6-APTs), and corresponding affinity, functional activity, and molecular modeling studies with regard to drug design targeting the human histamine H1 receptor. The 5-APTs have 2- to 4-fold higher H1 receptor affinity than the endogenous agonist histamine. The chemical nature of a meta-substituent on the 5-APT pendant phenyl moiety does not significantly affect H1 affinity.
Quantitative structure-activity (property) relationship (QSAR/QSPR) models are typically generated with a single modeling technique using one type of molecular descriptor. Recently, we have begun to explore a combinatorial QSAR approach which employs various combinations of optimization methods and descriptor types and includes rigorous and consistent model validation (Kovatcheva, A.; Golbraikh, A.
Novel geometrical chemical descriptors have been derived on the basis of the computational geometry of protein-ligand interfaces and Pauling atomic electronegativities (EN). Delaunay tessellation has been applied to a diverse set of 517 X-ray characterized protein-ligand complexes yielding a unique collection of interfacial nearest neighbor atomic quadruplets for each complex. Each quadruplet composition was characterized by a single descriptor calculated as the sum of the EN values for the four participating atom types.
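The descriptor for one quadruplet composition is a plain sum of four electronegativities, which can be sketched as below. The Delaunay tessellation step is not reproduced here; the quadruplets are supplied by hand as hypothetical input. The table holds Pauling values for a few common elements only, and the function name and data layout are assumptions.

```python
# Pauling electronegativities for a few elements (subset chosen for illustration).
PAULING_EN = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44, "S": 2.58}


def quadruplet_descriptor(atom_types):
    """Sum of the EN values of the four atom types forming one quadruplet."""
    if len(atom_types) != 4:
        raise ValueError("a quadruplet must contain exactly four atom types")
    return sum(PAULING_EN[a] for a in atom_types)


# Hypothetical interfacial quadruplets, as a tessellation step might yield them.
quadruplets = [("C", "N", "O", "O"), ("C", "C", "C", "S")]
descriptors = [quadruplet_descriptor(q) for q in quadruplets]
```

Because the sum ignores atom order, any permutation of the same four atom types maps to the same descriptor value.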
We have developed quantitative structure-activity relationship (QSAR) models for 44 non-nucleoside HIV-1 reverse transcriptase inhibitors (NNRTIs) of the pyridinone derivative type. The k nearest neighbor (kNN) variable selection approach was used. This method utilizes multiple descriptors such as molecular connectivity indices, which are derived from two-dimensional molecular topology.
We have developed a drug discovery strategy that employs variable selection quantitative structure-activity relationship (QSAR) models for chemical database mining. The approach starts with the development of rigorously validated QSAR models obtained with the variable selection k nearest neighbor (kNN) method (or, in principle, with any other robust model-building technique). Model validation is based on several statistical criteria, including the randomization of the target property (Y-randomization), independent assessment of the training set model's predictive power using external test sets, and the establishment of the model's applicability domain.
A combinatorial quantitative structure-activity relationships (Combi-QSAR) approach has been developed and applied to a data set of 98 ambergris fragrance compounds with complex stereochemistry. The Combi-QSAR approach explores all possible combinations of different independent descriptor collections and various individual correlation methods to obtain statistically significant models with high internal (for the training set) and external (for the test set) accuracy. Seven different descriptor collections were generated with commercially available MOE, CoMFA, CoMMA, Dragon, VolSurf, and MolconnZ programs; we also included chirality topological descriptors recently developed in our laboratory (Golbraikh, A.
Quantitative Structure-Activity Relationship (QSAR) models are used increasingly to screen chemical databases and/or virtual chemical libraries for potentially bioactive molecules. These developments emphasize the importance of rigorous model validation to ensure that the models have acceptable predictive power. Using k nearest neighbors (kNN) variable selection QSAR method for the analysis of several datasets, we have demonstrated recently that the widely accepted leave-one-out (LOO) cross-validated R² (q²) is an inadequate characteristic to assess the predictive ability of the models [Golbraikh, A.
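For reference, the leave-one-out q² discussed above is commonly computed as 1 - PRESS/SS, where PRESS sums the squared errors of predictions made with each compound left out in turn and SS sums squared deviations from the mean activity. The sketch below uses a plain single-descriptor kNN regressor without variable selection as a stand-in model. The function names, the choice of k = 2, the use of the overall mean in SS, and the toy data are assumptions.

```python
def knn_predict(x_train, y_train, x_query, k=2):
    """Mean activity of the k training compounds closest to x_query."""
    order = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x_query))
    return sum(y_train[i] for i in order[:k]) / k


def loo_q2(x, y, k=2):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS."""
    press = 0.0
    for i in range(len(x)):
        # Leave compound i out, predict it from the rest.
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        press += (y[i] - knn_predict(x_rest, y_rest, x[i], k)) ** 2
    mean_y = sum(y) / len(y)
    ss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss


# Hypothetical single-descriptor data set.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0]
q2 = loo_q2(x, y, k=2)
```

Every compound scored here was also available when the model settings were chosen, so q² measures internal consistency and not performance on an independent external test set.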
Computational ADME (absorption, distribution, metabolism, and excretion) models may be used early in the drug discovery process in order to flag drug candidates with potentially problematic ADME profiles. We report the development, validation, and application of quantitative structure-property relationship (QSPR) models of metabolic turnover rate for compounds in human S9 homogenate. Biological data were obtained from uniform bioassays of 631 diverse chemicals proprietary to GlaxoSmithKline (GSK).
One of the most important characteristics of Quantitative Structure Activity Relationships (QSAR) models is their predictive power. The latter can be defined as the ability of a model to predict accurately the target property (e.g.
J Chem Inf Comput Sci
April 2003
Three-dimensional quantitative structure-activity relationship (3D-QSAR) models were developed for a series of 44 synthetic alpha-campholenic derivatives with sandalwood odor. These compounds have complex stereochemistry as they contain up to five chiral atoms. To address stereospecificity of odor intensity, a 3D-QSAR method was developed, which does not require spatial alignment of molecules.