Publications by authors named "Gordon Crippen"

BCS classification is a vital tool in the development of both generic and innovative drug products. The purpose of this work was to provisionally classify the world's top selling oral drugs according to the BCS, using in silico methods. Three different in silico methods were examined: the well-established group contribution (CLogP) and atom contribution (ALogP) methods, and a new method based solely on the molecular formula and element contribution (KLogP).
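As a rough illustration of what a formula-only estimate could look like, the sketch below counts elements in a molecular formula and forms a weighted sum. The additive form, the function names, and the `contributions` mapping are all assumptions made for illustration; the abstract does not give the functional form or fitted parameters of KLogP, so none are supplied here.

```python
import re

# Matches an element symbol (e.g. "C", "Cl") followed by an optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")

def element_counts(formula: str) -> dict:
    """Count atoms per element in a simple formula such as 'C8H10N4O2'.

    Parentheses, charges, and isotopes are not handled in this sketch.
    """
    counts = {}
    for symbol, number in _TOKEN.findall(formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts

def additive_estimate(formula: str, contributions: dict, intercept: float = 0.0) -> float:
    """Hypothetical additive estimate: intercept + sum(count * contribution).

    `contributions` must be supplied by the caller; no published values are
    assumed here, and elements missing from it raise KeyError.
    """
    counts = element_counts(formula)
    return intercept + sum(n * contributions[el] for el, n in counts.items())
```

The point of the sketch is only that such a descriptor needs nothing beyond the formula string, which is what makes it cheap to compute for large drug lists.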

Here we describe a new algorithm for automatically determining the mainchain sequential assignment of NMR spectra for proteins. Using only the customary triple resonance experiments, assignments can be quickly found for not only small proteins having rather complete data, but also for large proteins, even when only half the residues can be assigned. The result of the calculation is not the single best assignment according to some criterion, but rather a large number of satisfactory assignments that are summarized in such a way as to help the user identify portions of the sequence that are assigned with confidence, vs.

Predicting pKa.

J Chem Inf Model

September 2009

Among the most important physicochemical properties of small molecules and macromolecules are the dissociation constants for any weakly acidic or basic groups, generally expressed as the pK(a) of each group. These constants are a major factor in the pharmacokinetics of drugs and in the interactions of proteins with other molecules. For both the protein and small molecule cases, we survey the sources of experimental pK(a) values and then focus on current methods for predicting them.
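To make concrete why pK(a) matters for pharmacokinetics, the following minimal sketch applies the Henderson-Hasselbalch relation to a single monoprotic group and returns the fraction present in its charged form at a given pH. The function name and the single-site assumption are illustrative; this is not one of the prediction methods surveyed in the article.

```python
def fraction_ionized(pka: float, ph: float, acid: bool = True) -> float:
    """Fraction of a monoprotic group in its charged form at the given pH.

    Acid:  [A-]/[HA]  = 10 ** (pH - pKa)
    Base:  [BH+]/[B]  = 10 ** (pKa - pH)
    """
    exponent = (ph - pka) if acid else (pka - ph)
    ratio = 10.0 ** exponent          # charged / neutral concentration ratio
    return ratio / (1.0 + ratio)      # charged / (charged + neutral)
```

At pH equal to pK(a) the group is half ionized, and each pH unit away from pK(a) shifts the charged-to-neutral ratio by a factor of ten.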

A simple, easily calculated, nonparametric statistic is described that can detect the presence of a functional relationship in bivariate data. Given a sample of data points (x,y), the statistic's value is nearly 1 if y is a linear function of x with little noise; it is greater than 1 if y is a nonlinear function of x; and it is close to 2 if x and y are uniformly and independently distributed. The statistic can be used to rapidly screen through large data sets to identify the most functionally related variable pairs.

Realizing favorable absorption, distribution, metabolism, elimination, and toxicity profiles is a necessity due to the high attrition rate of lead compounds in drug development today. The ability to accurately predict bioavailability can help save time and money during the screening and optimization processes. As several robust programs already exist for predicting logP, we have turned our attention to the fast and robust prediction of pK(a) for small molecules.

Elimination of cytotoxic compounds in the early and later stages of drug discovery can help reduce the costs of research and development. By applying principal components analysis (PCA) to the screening data, we were able to show that approximately 89% of the total log GI50 variance is due to the nonspecific cytotoxic nature of substances. Furthermore, PCA led to the identification of groups of structurally unrelated substances showing very specific toxicity profiles, such as a set of 45 substances toxic only to the Leukemia_SR cancer cell line.

The NCI Developmental Therapeutics Program Human Tumor cell line data set is a publicly available database that contains cellular assay screening data for over 40 000 compounds tested in 60 human tumor cell lines. The database also contains microarray assay gene expression data for the cell lines, and so it provides an excellent information resource particularly for testing data mining methods that bridge chemical, biological, and genomic information. In this paper we describe a formal knowledge discovery approach to characterizing and data mining this set and report the results of some of our initial experiments in mining the set from a chemoinformatics perspective.

Purpose: Cheminformatics can be broadly defined to encompass any activity related to the application of information technology to the study of properties, effects and uses of chemical agents. One of the most important current challenges in cheminformatics is to allow researchers to search databases of biomedical knowledge, using chemical structures as input.

Materials And Methods: An important step towards this goal was the establishment of PubChem, an open, centralized database of small molecules accessible through the World Wide Web.

Recently, we developed a pairwise structural alignment algorithm using realistic structural and environmental information (SAUCE). In this paper, we first present an automatic hierarchical fold classification based on SAUCE alignments. This classification enables us to build a fold tree containing different levels of multiple structural profiles.

Motivation: Multiple STructural Alignment (MSTA) provides valuable information for solving problems such as fold recognition. The consistency-based approach tries to find conflict-free subsets of alignments from a pre-computed all-to-all Pairwise Alignment Library (PAL). If a large proportion of the alignments in the library conflict, consistency can be hard to achieve.

Rapid analysis of protein structure, interaction, and dynamics requires fast and automated assignments of 3D protein backbone triple-resonance NMR spectra. We introduce a new depth-first ordered tree search method of automated assignment, CASA, which uses hand-edited peak-pick lists of a flexible number of triple resonance experiments. The computer program was tested on 13 artificially simulated peak lists for proteins up to 723 residues, as well as on the experimental data for four proteins.

In the era of structural genomics, it is necessary to generate accurate structural alignments in order to build good templates for homology modeling. Although a great number of structural alignment algorithms have been developed, most of them ignore intermolecular interactions during the alignment procedure. Therefore, structures in different oligomeric states are barely distinguishable, and it is very challenging to find the correct alignment in coil regions.

Biphenyl hydrolase-like (BPHL) protein is a novel serine hydrolase which has been identified as human valacyclovirase (VACVase), catalyzing the hydrolytic activation of valine ester prodrugs of the antiviral drugs acyclovir and ganciclovir as well as other amino acid ester prodrugs of therapeutic nucleoside analogues. The broad specificity for nucleoside analogues as parent drugs suggests that BPHL may be particularly useful as a molecular target for prodrug activation. In order to develop an initial structural view of the specificity of BPHL, a homology model of BPHL based on the crystal structure of 2-hydroxy-6-oxo-7-methylocta-2,4-dienoate hydrolase was developed using the Molecular Operating Environment package (Chemical Computing Group, Montreal, Quebec), evaluated for its stereochemical quality and identification of free cysteines, and used in a molecular docking study.

Cluster distance geometry is a recent generalization of distance geometry whereby protein structures can be described at even lower levels of detail than one point per residue. With improvements in the clustering technique, protein conformations can be summarized in terms of alternative contact patterns between clusters, where each cluster contains four sequentially adjacent amino acid residues. A very simple potential function involving 210 adjustable parameters can be determined that favors the native contacts of 31 small, monomeric proteins over their respective sets of nonnative contacts.

The purpose of this study is to explore the use of classification and regression trees (CART) in predicting, in the dose-independent range, the fraction of dose absorbed in humans. Since the results from clinical formulations in humans were used for training the model, a hypothetical state of drug molecules already dissolved in the intestinal fluid was adopted. Therefore, the molecular attributes affecting dissolution were not considered in the model.

This is our second type of model for protein folding where the configurational parameters and the effective potential energy function are chosen in such a way that all conformations are described and the canonical partition function can be evaluated analytically. Structure is described in terms of distances between pairs of sequentially contiguous blocks of eight residues, and all possible conformations are grouped into 71 subsets in terms of bounds on these distances. The energy is taken to be a sum of pairwise interactions between such blocks.

We have initiated an entirely new approach to statistical mechanical models of strongly interacting systems where the configurational parameters and the potential energy function are both constructed so that the canonical partition function can be evaluated analytically. For a simplified model of proteins consisting of a single, fairly short polypeptide chain without cross-links, we can adjust the energy parameters to favor the experimentally determined native state of seven proteins having diverse types of folds. Then 497 test proteins are predicted to have stable native folds, even though they are also structurally diverse, and 480 of them have no significant sequence similarity to any of the training proteins.
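For readers unfamiliar with the quantity involved, the sketch below evaluates a canonical partition function numerically over a small, discrete set of state energies and returns the Boltzmann population of each state. This is only an illustration of the definition Z = sum_i exp(-E_i / kT); the work described above evaluates the partition function analytically for its own model, which is not reproduced here. The function name and the kcal/mol energy units are assumptions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def boltzmann_populations(energies, temperature=298.15):
    """Return the equilibrium population of each state.

    energies: state energies in kcal/mol.
    Energies are shifted by their minimum before exponentiating, which
    leaves the populations unchanged but avoids overflow.
    """
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)  # partition function relative to the lowest state
    return [w / z for w in weights]
```

A native state is "favored" in this sense when its population dominates the sum, which is what adjusting the energy parameters aims to achieve.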

Given atomic coordinates for a particular conformation of a molecule and some property value assigned to each atom, one can easily calculate a chirality function that distinguishes enantiomers, is zero for an achiral molecule, and is a continuous function of the coordinates and properties. This is useful as a quantitative measure of chirality for molecular modeling and structure-activity relations.
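The abstract does not give the form of the chirality function, so the sketch below substitutes a simpler, standard quantity that shares some of the stated behavior: the signed volume of four points. It changes sign under mirror reflection, is zero when the four points are coplanar, and varies continuously with the coordinates. It ignores atomic property values and handles only four points, so it should be read as an illustration of the idea rather than the function described above.

```python
def signed_volume(p0, p1, p2, p3):
    """Signed volume of the tetrahedron (p0, p1, p2, p3).

    Computed as a . (b x c) / 6 with a = p1 - p0, b = p2 - p0, c = p3 - p0.
    Each point is a sequence of three floats.
    """
    ax, ay, az = (p1[i] - p0[i] for i in range(3))
    bx, by, bz = (p2[i] - p0[i] for i in range(3))
    cx, cy, cz = (p3[i] - p0[i] for i in range(3))
    # Components of the cross product b x c
    nx = by * cz - bz * cy
    ny = bz * cx - bx * cz
    nz = bx * cy - by * cx
    return (ax * nx + ay * ny + az * nz) / 6.0
```

Reflecting all four points through a plane, for example by negating every z coordinate, flips the sign of the result while leaving all interpoint distances unchanged.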

Distance geometry has been a broadly useful tool for dealing with conformational calculations. Customarily each atom is represented as a point, constraints on the distances between some atoms are obtained from experimental or theoretical sources, and then a random sampling of conformations can be calculated that are consistent with the constraints. Although these methods can be applied to small proteins having on the order of 1000 atoms, for some purposes it is advantageous to view the problem at lower resolution.
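The point-per-atom formulation described above can be illustrated with a small consistency check: given trial coordinates and lower and upper bounds on selected interatomic distances, report which constraints are violated. This is a minimal sketch of the constraint test only, not of the sampling procedure; the function name, the `bounds` layout, and the tolerance are assumptions.

```python
import math

def constraint_violations(coords, bounds, tol=1e-9):
    """List distance constraints that a conformation fails to satisfy.

    coords: list of (x, y, z) tuples, one per atom.
    bounds: dict mapping an atom-index pair (i, j) to (lower, upper).
    Returns a list of (i, j, distance) for each violated constraint.
    """
    violated = []
    for (i, j), (lower, upper) in bounds.items():
        d = math.dist(coords[i], coords[j])  # Euclidean distance (Python 3.8+)
        if d < lower - tol or d > upper + tol:
            violated.append((i, j, d))
    return violated
```

A conformation is consistent with the constraints when the returned list is empty; a sampler would keep generating or refining coordinates until that holds.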

Adequate conformational searching of small molecules and inclusion of a chirality identifier are necessary features of any current technique for quantitative structure-activity relationships (QSAR). However, implementation of these features can be difficult and computationally expensive, and some techniques can still lead to insufficient treatment of molecular conformation. We select the standard systematic conformational search as the default search method for our recent 3D QSAR program, DAPPER, and develop a novel chirality metric for use in QSAR.

A novel set of molecular descriptors suitable for use in quantitative structure-activity relationships and related methods is described. These descriptors are a smooth and interpretable representation of atomic physicochemical property values and intramolecular atom pair distances. Distance atomic physicochemical parameter energy relationships (DAPPER), a novel quantitative structure-activity relationship (QSAR) method using these descriptors, is validated on standard datasets.

Background: We present a simple method to train a potential function for the protein folding problem which, even though trained using a small number of proteins, is able to place a significantly large number of native conformations near a local minimum. The training relies on generating decoys by energy minimization of the native conformations using the current potential and using a physically meaningful objective function (derivative of energy with respect to torsion angles at the native conformation) during the quadratic programming to place the native conformation near a local minimum.

Results: We also compare the performance of three different types of energy functions and find that while the pairwise energy function is trainable, a solvation energy function by itself is untrainable if decoys are generated by minimizing the current potential starting at the native conformation.
