A grid-based, alignment-independent 3D-SDAR (three-dimensional spectral data-activity relationship) approach based on simulated (13)C and (15)N NMR chemical shifts augmented with through-space interatomic distances was used to model the mutagenicity of 554 primary and 419 secondary aromatic amines. A robust modeling strategy supported by extensive validation including randomized training/hold-out test set pairs, validation sets, "blind" external test sets as well as experimental validation was applied to avoid over-parameterization and build Organization for Economic Cooperation and Development (OECD 2004) compliant models. Based on an experimental validation set of 23 chemicals tested in a two-strain Salmonella typhimurium Ames assay, 3D-SDAR was able to achieve performance comparable to 5-strain (Ames) predictions by Lhasa Limited's Derek and Sarah Nexus for the same set.
A dataset of 237 human Ether-à-go-go Related Gene (hERG) potassium channel inhibitors (180 of which were used for model building and validation, whereas 57 constituted the "true" external prediction set) collected from 22 literature sources was modeled by 3D-SDAR. To produce reliable and reproducible classification models for hERG blocking, the initial set of 180 chemicals was split into two subsets: a balanced modeling set consisting of 118 compounds and an unbalanced validation set comprised of 62 compounds. A PLS bagging-like algorithm written in Matlab was used to process the data and assign each compound to one of the two (hERG+ or hERG-) activity classes.
The estrogenic potential (expressed as a score composite of 18 high throughput screening bioassays) of 1528 compounds from the ToxCast database was modeled by a 3-dimensional spectral data activity relationship approach (3D-SDAR). Due to a lack of (17)O nuclear magnetic resonance (NMR) simulation software, the most informative carbon-carbon 3D-SDAR fingerprints were augmented with indicator variables representing oxygen atoms from carbonyl and carboxamide, ester, sulfonyl, nitro, aliphatic hydroxyl, and phenolic hydroxyl groups. To evaluate the true predictive performance of the authors' model the United States Environmental Protection Agency provided them with a blind test set consisting of 2008 compounds.
Invasion and metastasis are responsible for 90% of cancer-related mortality. Herein, we report on our quest for novel, clinically relevant inhibitors of local invasion, based on a broad screen of natural products in a phenotypic assay. Starting from micromolar chalcone hits, a predictive QSAR model for diaryl propenones was developed, and synthetic analogues with a 100-fold increase in potency were obtained.
Modified 3D-SDAR fingerprints combining (13)C and (15)N NMR chemical shifts augmented with inter-atomic distances were used to model the potential of chemicals to induce phospholipidosis (PLD). A curated dataset of 328 compounds (some of which were cationic amphiphilic drugs) was used to generate 3D-QSDAR models based on tessellations of the 3D-SDAR space with grids of different density. Composite PLS models averaging the aggregated predictions from 100 fully randomized individual models were generated.
A diverse set of 154 chemicals that included US Food and Drug Administration-regulated compounds tested for their aquatic toxicity in Daphnia magna were modeled by a 3-dimensional quantitative spectral data-activity relationship (3D-QSDAR). Two distinct algorithms, partial least squares (PLS) and Tanimoto similarity-based k-nearest neighbors (KNN), were used to process bin occupancy descriptor matrices obtained after tessellation of the 3D-QSDAR space into regularly sized bins. The performance of models utilizing bins ranging in size from 2 ppm × 2 ppm × 0.
Multiple validation techniques (Y-scrambling, complete training/test set randomization, determination of the dependence of R2test on the number of randomization cycles, etc.) aimed at improving the reliability of the modeling process were utilized and their effect on the statistical parameters of the models was evaluated. A consensus partial least squares (PLS)-similarity based k-nearest neighbors (KNN) model utilizing 3D-SDAR (three-dimensional spectral data-activity relationship) fingerprint descriptors for prediction of the log(1/EC50) values of a dataset of 94 aryl hydrocarbon receptor binders was developed.
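As an illustration of the Y-scrambling check named above, here is a minimal sketch. It is not the authors' procedure: a single-descriptor least-squares line stands in for the PLS and KNN models, and the names `r_squared`, `y_scramble`, `n_rounds`, and `seed` are hypothetical. The idea is that a model fitted to permuted responses should score far worse than the model fitted to the real ones.

```python
import random


def r_squared(x, y):
    """R^2 of an ordinary least-squares line y ~ a + b*x (one descriptor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # For a univariate linear fit, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


def y_scramble(x, y, n_rounds=200, seed=0):
    """Return the true R^2 and the R^2 values obtained after shuffling y."""
    rng = random.Random(seed)
    true_r2 = r_squared(x, y)
    scrambled = []
    for _ in range(n_rounds):
        y_perm = list(y)
        rng.shuffle(y_perm)  # break the descriptor/response pairing
        scrambled.append(r_squared(x, y_perm))
    return true_r2, scrambled
```

If the scrambled R^2 values cluster near the true value, the apparent fit is likely a chance correlation rather than a real structure-activity relationship.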
An improved three-dimensional quantitative spectral data-activity relationship (3D-QSDAR) methodology was used to build and validate models relating the activity of 130 estrogen receptor binders to specific structural features. In 3D-QSDAR, each compound is represented by a unique fingerprint constructed from (13)C chemical shift pairs and associated interatomic distances. Grids of different granularity can be used to partition the abstract fingerprint space into congruent "bins" for which the optimal size was previously unexplored.
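The fingerprint construction described above can be sketched as follows. This is a hedged illustration, not the published implementation: the function name `sdar_fingerprint`, the input layout, and the default bin sizes (`shift_bin` in ppm, `dist_bin` in angstroms) are assumptions chosen only for the example. Each pair of carbon atoms contributes one point (shift, shift, distance), and the grid turns those points into bin occupancy counts.

```python
from collections import Counter
from itertools import combinations
from math import dist, floor


def sdar_fingerprint(atoms, shift_bin=2.0, dist_bin=0.5):
    """Bin occupancy counts for one compound.

    atoms: list of (chemical_shift_ppm, (x, y, z)) tuples, one per carbon.
    shift_bin, dist_bin: illustrative bin edge lengths (assumed values).
    """
    counts = Counter()
    for (s1, p1), (s2, p2) in combinations(atoms, 2):
        lo, hi = sorted((s1, s2))  # order the shifts so each pair maps to one bin
        d = dist(p1, p2)           # through-space interatomic distance
        key = (floor(lo / shift_bin), floor(hi / shift_bin), floor(d / dist_bin))
        counts[key] += 1
    return counts
```

Changing `shift_bin` and `dist_bin` changes the granularity of the grid, which is the parameter whose optimal value the abstract describes as previously unexplored.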
Nicotinic acetylcholine receptors (nAChRs) have become targets for drug development in recent years. 3-(2,4-dimethoxybenzylidene)-anabaseine (DMXBA), which selectively stimulates the alpha7 nAChR, has been shown to alleviate some cognitive deficits associated with schizophrenia. In this paper we report an analysis of the interactions between 47 arylidene-anabaseines (including 45 benzylidene-anabaseines) and rat brain alpha7 and alpha4beta2 nicotinic acetylcholine receptors, using three different modeling techniques, namely 2D-QSAR, 3D-QSAR and molecular docking to the Aplysia californica acetylcholine binding protein (AChBP), a water soluble, homomeric nAChR surrogate receptor with a known crystal structure.
The photolysis half-lives of 70 polychlorinated dibenzo-p-dioxins and dibenzofurans are correlated with their molecular structures by a QSPR model (R(2) = 0.72) comprising three bond-energy-related descriptors. The photodegradation depends on the stability of the aromatic system and the C-O and C-C bond strengths.
The experimental EC(50) toxicities toward Daphnia magna for a series of 130 benzoic acids, benzaldehydes, phenylsulfonyl acetates, cycloalkane-carboxylates, benzanilides, and other esters were studied using the Best multilinear regression algorithm (BMLR) implemented in CODESSA. A modified quantitative structure-activity relationships (QSAR) procedure was applied guaranteeing the stability and reproducibility of the results. Separating the initial data set into training and test subsets generated three independent models with an average R(2) of .
The molecular structures of 83 diverse organic compounds are correlated by a quantitative structure-activity relationship (QSAR) to their minimum inhibitory concentrations (MIC expressed as log(1/MIC)), involving 6 descriptors with R(2)=0.788, F=47.140, s(2)=0.
Literature UV absorption intensities at 260 nm and 25 degrees C in water for a diverse set of 805 organic compounds, when analyzed by CODESSA Pro software using an initial pool of 800+ descriptors, provide a significant QSPR correlation (R(2) = 0.692). Concurrently, a neural networks approach was used to develop a corresponding nonlinear model.