The digitization of health records and cross-institutional data sharing are necessary preconditions for improving clinical research and patient care. The SMITH project unites several university hospitals and medical faculties to provide medical informatics solutions for health data integration and cross-institutional communication. In this paper, we focus on requirements elicitation and management for extracting clinical data from heterogeneous subsystems and for data integration based on eHealth standards such as HL7 FHIR and IHE profiles.
J Chem Inf Model
September 2006
Traditional quantitative structure-activity relationship (QSAR) models aim to capture global structure-activity trends present in a data set. In many situations, there may be groups of molecules which exhibit a specific set of features which relate to their activity or inactivity. Such a group of features can be said to represent a local structure-activity relationship.
J Chem Inf Model
September 2006
Libraries of chemical structures are used in a variety of cheminformatics tasks such as virtual screening and QSAR modeling and are generally characterized using molecular descriptors. When working with libraries it is useful to understand the distribution of compounds in the space defined by a set of descriptors. We present a simple approach to the analysis of the spatial distribution of the compounds in a library in general and outlier detection in particular based on counts of neighbors within a series of increasing radii.
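The neighbor-counting idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the radii are chosen, how the counts are normalized, or how outliers are flagged, so the function name `neighbor_count_profile`, the toy 2-D descriptor values, and the radii below are all assumptions made for illustration only.

```python
from math import dist

def neighbor_count_profile(points, radii):
    """For each point, count the other points lying within each radius."""
    profiles = []
    for i, p in enumerate(points):
        # Euclidean distances from point i to every other point in the library
        distances = [dist(p, q) for j, q in enumerate(points) if j != i]
        profiles.append([sum(d <= r for d in distances) for r in radii])
    return profiles

# Hypothetical 2-D descriptor values: four clustered compounds and one far away.
library = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
radii = [0.2, 1.0, 10.0]  # assumed series of increasing radii
profiles = neighbor_count_profile(library, radii)
```

In this toy library the isolated compound has no neighbors until the largest radius, while the clustered compounds pick up neighbors at the smallest one. A profile that stays at zero over many radii is the kind of signal such an approach could use to mark an outlier.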
Virtual screening (VS) has become a preferred tool to augment high-throughput screening(1) and determine new leads in the drug discovery process. The core of a VS informatics pipeline includes several data mining algorithms that work on huge databases of chemical compounds containing millions of molecular structures and their associated data. Thus, scaling traditional applications such as classification, partitioning, and outlier detection for huge chemical data sets without a significant loss in accuracy is very important.
In this work, we present a methodology to interpret the weights and biases of a computational neural network (CNN) quantitative structure-activity relationship model. The methodology allows one to understand how an input descriptor is correlated to the predicted output by the network. The method consists of two parts.
J Chem Inf Model
October 2005
We present a method to measure the relative importance of the descriptors present in a QSAR model developed with a computational neural network (CNN). The approach is based on a sensitivity analysis of the descriptors. We tested the method on three published data sets for which linear and CNN models were previously built.
Quantitative structure activity relationships (QSAR) are one of the well-developed areas in computational chemistry. In this field, many successful predictive models have been developed for various property, activity or toxicity predictions. However, the predictive power of models for new query compounds is often not well characterized.
Quinolone and quinoline are known to be liver carcinogens in rodents, and a number of their derivatives have been shown to exhibit mutagenicity in the Ames test, using Salmonella typhimurium strain TA 100 in the presence of S9. Both the carcinogenicity and the mutagenicity of quinolone and quinoline derivatives, as determined by SAS, can be attributed to their genotoxicity potential. This potential, which is measured by genotoxicity tests, is a good indication of carcinogenicity and mutagenicity because compounds that are positive in these tests have the potential to be human carcinogens and/or mutagens.
The determination of the validity of a QSAR model when applied to new compounds is an important concern in the field of QSAR and QSPR modeling. Various scoring techniques can be applied to specific types of models. We present a technique with which we can state whether a new compound will be well predicted by a previously built QSAR model.
A QSAR modeling study has been done with a set of 79 piperazinylquinazoline analogues which exhibit PDGFR inhibition. Linear regression and nonlinear computational neural network models were developed. The regression model was developed with a focus on interpretative ability using a PLS technique.
A Kohonen self-organizing map (SOM) is used to classify a data set consisting of dihydrofolate reductase inhibitors with the help of an external set of Dragon descriptors. The resultant classification is used to generate training, cross-validation (CV) and prediction sets for QSAR modeling using the ADAPT methodology. The results are compared to those of QSAR models generated using sets created by activity binning and a sphere exclusion method.
J Chem Inf Comput Sci
December 2005
This work presents the development of Quantitative Structure-Activity Relationship (QSAR) models to predict the biological activity of 179 artemisinin analogues. The structures of the molecules are represented by chemical descriptors that encode topological, geometric, and electronic structure features. Both linear (multiple linear regression) and nonlinear (computational neural network) models are developed to link the structures to their reported biological activity.
A new series of 25 whole-molecule molecular structure descriptors are proposed. The new descriptors are termed Hydrophobic Surface Area, or HSA descriptors, and are designed to capture information regarding the structural features responsible for hydrophobic and hydrophilic intermolecular interactions. The utility of the HSAs in capturing this type of information is demonstrated using two properties that have a known hydrophobic component.
Classification models were developed to provide accurate prediction of genotoxicity of 277 polycyclic aromatic compounds (PACs) directly from their molecular structures. Numerical descriptors encoding the topological, geometric, electronic, and polar surface area properties of the compounds were calculated to represent the structural information. Each compound's genotoxicity was represented with IMAX (maximal SOS induction factor) values measured by the SOS Chromotest in the presence and absence of S9 rat liver homogenate.
J Chem Inf Comput Sci
October 2004
A data set of 297 diverse organic compounds that cause varying degrees of chromosomal aberrations in Chinese hamster lung cells is examined. Responses of an assay are categorized as clastogenic (>10% aberrant cells) and nonclastogenic (<5% aberrant cells). Each of the compounds is represented by calculated structural descriptors that encode topological, geometric, electronic, and polar surface features.
HIV protease inhibitors are being used as frontline therapy in the treatment of HIV patients. Multi-drug-resistant HIV mutant strains are emerging with the initial aggressive multi-drug treatment of HIV patients. This necessitates continued search for novel inhibitors of viral replication.
We report several binary classification models that directly link the genetic toxicity of a series of 140 thiophene derivatives with information derived from the compounds' molecular structure. Genetic toxicity was measured using an SOS Chromotest. IMAX (maximal SOS induction factor) values were recorded for each of the 140 compounds both in the presence and in the absence of S9 rat liver homogenate.
Binary quantitative structure-activity relationship (QSAR) models are developed to classify a data set of 334 aromatic and secondary amine compounds as genotoxic or nongenotoxic based on information calculated solely from chemical structure. Genotoxic endpoints for each compound were determined using the SOS Chromotest in both the presence and absence of an S9 rat liver homogenate. Compounds were considered genotoxic if assay results indicated a positive genotoxicity hit for either the S9 inactivated or S9 activated assay.
J Chem Inf Comput Sci
October 2003
Loss of Protein Tyrosine Phosphatase 1B (PTP1B) activity is known to enhance insulin sensitivity and resistance to weight gain. Potent, orally active PTP1B inhibitors could therefore be pharmacological agents for the treatment of Type 2 diabetes and obesity. Classification models of PTP1B inhibitors are developed using a data set containing 128 compounds.
A data set of 348 urea-like compounds that inhibit the soluble epoxide hydrolase enzyme in mice and humans is examined. Compounds having IC50 values ranging from 0.06 to >500 µM (murine) and 0.
Classification models are generated to predict in vitro cytogenetic results for a diverse set of 383 organic compounds. Both k-nearest neighbor and support vector machine models are developed. They are based on calculated molecular structure descriptors.
A data set of 345 dihydrofolate reductase inhibitors was used to build QSAR models that correlate chemical structure and inhibition potency for three types of dihydrofolate reductase (DHFR): rat liver (rl), Pneumocystis carinii (pc), and Toxoplasma gondii (tg). Quantitative models were built using subsets of molecular structure descriptors being analyzed by computational neural networks. Neural network models were able to accurately predict log IC50 values for the three types of DHFR to within ±0.
J Chem Inf Comput Sci
February 2003
The Probabilistic Neural Network (PNN) and its close relative, the Generalized Regression Neural Network (GRNN), are presented as simple yet powerful neural network techniques for use in Quantitative Structure-Activity Relationship (QSAR) and Quantitative Structure-Property Relationship (QSPR) studies. The PNN methodology is applicable to classification problems, and the GRNN is applicable to continuous function mapping problems. The basic underlying theory behind these probability-based methods is presented along with two applications of the PNN/GRNN methodology.
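As a rough illustration of the two techniques named in this abstract, the sketch below uses the textbook Gaussian-kernel formulations: a GRNN prediction as a kernel-weighted average of training responses, and a PNN classification as the class with the largest average kernel density at the query. It is not the authors' implementation. The function names, the smoothing parameter `sigma`, and the toy one-descriptor data are assumptions made for illustration.

```python
from math import exp

def gaussian_kernel(x, query, sigma):
    """Gaussian weight based on squared Euclidean distance to the query."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
    return exp(-sq_dist / (2 * sigma ** 2))

def grnn_predict(train_x, train_y, query, sigma=1.0):
    """GRNN: kernel-weighted average of the training responses."""
    weights = [gaussian_kernel(x, query, sigma) for x in train_x]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)

def pnn_classify(train_x, train_labels, query, sigma=1.0):
    """PNN: pick the class with the largest mean kernel value at the query."""
    totals, counts = {}, {}
    for x, label in zip(train_x, train_labels):
        totals[label] = totals.get(label, 0.0) + gaussian_kernel(x, query, sigma)
        counts[label] = counts.get(label, 0) + 1
    return max(totals, key=lambda c: totals[c] / counts[c])

# Hypothetical one-descriptor training data, for illustration only.
regression_value = grnn_predict([(0.0,), (1.0,)], [0.0, 1.0], (0.5,))
predicted_class = pnn_classify(
    [(0.0,), (0.2,), (1.0,), (1.2,)],
    ["inactive", "inactive", "active", "active"],
    (0.1,),
    sigma=0.5,
)
```

Both functions have a single tunable parameter, `sigma`, and need no iterative training, which is consistent with the abstract's description of these methods as simple. A query very far from all training points can drive every kernel weight to zero, so a practical version would need to guard the division in `grnn_predict`.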
J Chem Inf Comput Sci
November 2002
The design and blood-brain barrier crossing of glycine/NMDA receptor antagonists are of significant interest in pharmaceutical research. The use of these antagonists in stroke or seizure reduction has been considered. Measuring the inhibitory concentrations, however, can be time-consuming and costly.
Quantitative structure-property relationships (QSPRs) have been developed to predict the ion mobility spectrometry (IMS) collision cross sections of singly protonated lysine-terminated peptides using information derived from topological molecular structure and various amino acid parameters. The primary amino acid sequence alone is sufficient to accurately predict the collision cross section. The models were built using multiple linear regression (MLR) and computational neural networks (CNNs).