113 results match your criteria: "Unilever Centre for Molecular Science Informatics[Affiliation]"
Chem Cent J
February 2008
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK.
Background: We have introduced a new Hybrid descriptor composed of the MACCS key descriptor, which encodes topological information, and Ballester and Richards' Ultrafast Shape Recognition (USR) descriptor. The latter is calculated from the moments of the distribution of the interatomic distances, and in this work we also include moments of higher order than those used in the original implementation.
Results: The performance of this Hybrid descriptor is assessed using Random Forest and a dataset of 116,476 molecules.
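As a rough illustration of the shape part of such a descriptor, the sketch below computes moments of atom-distance distributions measured from four reference points. The choice of reference points (centroid, atom closest to it, atom farthest from it, and atom farthest from that one) and the use of signed roots of central moments are assumptions made for this example, not details taken from the abstract; the function names are hypothetical.

```python
import math

def centroid(coords):
    """Geometric centre of a list of (x, y, z) atom positions."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def moments(values, order=3):
    """Mean, followed by signed k-th roots of the central moments up to `order`."""
    n = len(values)
    mean = sum(values) / n
    out = [mean]
    for k in range(2, order + 1):
        central = sum((v - mean) ** k for v in values) / n
        # The signed root keeps every entry in units of distance (an assumption).
        out.append(math.copysign(abs(central) ** (1.0 / k), central))
    return out

def usr_like_descriptor(coords, order=3):
    """Concatenate distance-distribution moments for four reference points."""
    ctd = centroid(coords)
    d_ctd = [math.dist(a, ctd) for a in coords]
    cst = coords[d_ctd.index(min(d_ctd))]   # atom closest to the centroid
    fct = coords[d_ctd.index(max(d_ctd))]   # atom farthest from the centroid
    d_fct = [math.dist(a, fct) for a in coords]
    ftf = coords[d_fct.index(max(d_fct))]   # atom farthest from fct
    descriptor = []
    for ref in (ctd, cst, fct, ftf):
        descriptor.extend(moments([math.dist(a, ref) for a in coords], order))
    return descriptor
```

Raising `order` above 3 is one simple way to include higher moments; the resulting vector has `4 * order` entries.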
J Chem Inf Model
February 2008
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom.
In the present paper we combine the Winnow algorithm and an advanced scheme for feature generation into a tool for multiclass classification. The Winnow algorithm, specifically designed in the late 1980s to work well with high-dimensional data, by design ignores most of the irrelevant features for the scoring of each single training/test case. To augment the pool of available molecular features we use the Winnow algorithm in conjunction with a process that creates additional features from a set of given ones.
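The multiplicative update at the heart of Winnow can be sketched for the simpler two-class case as follows. This is a minimal illustration, assuming binary features given as lists of active indices, a promotion factor of 2 and a threshold of half the feature count; it is not the multiclass tool or the feature-generation scheme described in the paper.

```python
def train_winnow(samples, n_features, alpha=2.0, epochs=10):
    """samples: list of (active_feature_indices, label) with label 0 or 1."""
    weights = [1.0] * n_features
    theta = n_features / 2  # threshold; a common choice, assumed here
    for _ in range(epochs):
        mistakes = 0
        for active, label in samples:
            # Only features present in this sample contribute to the score.
            score = sum(weights[i] for i in active)
            predicted = 1 if score > theta else 0
            if predicted == label:
                continue
            mistakes += 1
            # Promote on a missed positive, demote on a false positive.
            factor = alpha if label == 1 else 1.0 / alpha
            for i in active:
                weights[i] *= factor
        if mistakes == 0:
            break
    return weights, theta

def predict(weights, theta, active):
    """Classify one sample from its list of active feature indices."""
    return 1 if sum(weights[i] for i in active) > theta else 0
```

Because inactive features are never touched during scoring or updating, the cost per sample depends on the number of features present rather than on the full dimensionality.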
J Org Chem
February 2008
Unilever Centre for Molecular Science Informatics, University Chemical Laboratory, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK.
The boron-mediated aldol reactions of certain types of beta-alkoxy methyl ketone show remarkably high levels of stereoinduction with achiral aldehydes, leading preferentially to 1,5-anti related stereocenters. Given the low levels of asymmetric induction usually observed in acetate aldol reactions, this is of great synthetic utility and has been used successfully in the total synthesis of a number of polyketide natural products. We have investigated the effects of the alkoxy protecting group (OMe, OPMB, PMP acetal, tetrahydropyran, and OTBS) present in the boron enolate on the level and sense of remote 1,5-stereoinduction, using density functional theory calculations (B3LYP/6-31G**).
J Chem Inf Model
January 2008
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom.
This paper attempts to elucidate differences in QSPR models of aqueous solubility (Log S), melting point (Tm), and octanol-water partition coefficient (Log P), three properties of pharmaceutical interest. For all three properties, Support Vector Machine models using 2D and 3D descriptors calculated in the Molecular Operating Environment were the best models. Octanol-water partition coefficient was the easiest property to predict, as indicated by the RMSE of the external test set and the coefficient of determination (RMSE = 0.
J Org Chem
December 2007
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom.
Triazabicyclodecene (TBD) has recently been shown to be an effective organocatalyst for the ring-opening polymerization (ROP) of cyclic esters. Using DFT methods, we have studied possible mechanisms of this reaction. Our studies explain not only the narrow polydispersity index (PDI) observed in the ROP of six-membered ring lactones, but also the surprising failure of the ROP for the more reactive butyrolactone.
J Chem Inf Model
February 2008
Department of Chemistry, Unilever Centre for Molecular Science Informatics, Lensfield Road, Cambridge, U.K.
Saturated acyclic alkanes show steric strain if they are highly branched and, in extreme cases, fall apart rapidly at room temperature. Consequently, attempts to count the number of isomeric forms for a given molecular formula that neglect this physical consideration will inevitably overestimate the size of the available chemical space. Here we derive iterative equations to enumerate the number of isomers (both structural and optical are considered separately) for the alkane series that take into account the inherent instability of certain carbon skeletons.
Phys Chem Chem Phys
October 2007
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK.
In this work we seek insight into the mechanisms of ion-protein interactions in aqueous media at the molecular scale through fully atomistic molecular dynamics simulations. We present a systematic molecular simulation study of interactions of sodium and halide ions with a trialanine peptide in aqueous sodium halide solutions with different salt concentrations (0.20, 0.
Adv Drug Deliv Rev
December 2007
Unilever Centre for Molecular Science Informatics, University Chemical Laboratory, University of Cambridge, Lensfield Road, Cambridge CB1 9SB, United Kingdom.
Polyoxazolines of various architectures and chemical functionalities can be prepared in a living and therefore controlled manner via cationic ring-opening polymerisation. They have found widespread applications, ranging from coatings to pigment dispersants. Furthermore, several polyoxazolines are water-soluble or amphiphilic and relatively non-toxic, which makes them interesting as biomaterials.
Expert Opin Drug Metab Toxicol
August 2007
University of Cambridge, Unilever Centre for Molecular Science Informatics, Department of Chemistry, Lensfield Road, Cambridge, CB2 1EW, UK.
The abundance of different techniques and protocols available reflects the need for reliable in vitro methods to assess intestinal absorption of potentially bioactive compounds. Physicochemical assays try to pinpoint the molecular properties contributing to the absorption process. The end points of biologically based methods, such as cell cultures and excised tissues, account for all processes undergone by a molecule that traverses a 'living' biological membrane, a cell or tissue.
J Comput Aided Mol Des
June 2007
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, UK.
Synthetic chemistry is hard because some reasonable-looking molecules cannot be made, because there are errors in the chemical literature, because it is easy to miss reaction possibilities, and because even the shape of molecules is very difficult to determine. We propose an approach to the computational analysis of reactions that tries to circumvent these difficulties by restricting the analysis to simple rules for reactivity that can generate a large number of competing pathways. This huge ensemble is filtered using computational methods to pick out the most likely pathways and to suggest possible products.
Chem Commun (Camb)
June 2007
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, U.K.
DFT computed transition states quantitatively explain the surprising stereochemical outcome of unsubstituted enolborinates in diastereoselective and enantioselective boron aldol reactions.
J Mol Biol
May 2007
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Rd, Cambridge CB2 1EW, UK.
The concept of reaction similarity has been well studied in terms of the overall transformation associated with a reaction, but not in terms of mechanism. We present the first method to give a quantitative measure of the similarity of reactions based upon their explicit mechanisms. Two approaches are presented to measure the similarity between individual steps of mechanisms: a fingerprint-based approach that incorporates relevant information on each mechanistic step; and an approach based only on bond formation, cleavage and changes in order.
J Comput Aided Mol Des
May 2007
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, UK.
We investigate the classification performance of circular fingerprints in combination with the Naive Bayes Classifier (MP2D), Inductive Logic Programming (ILP) and Support Vector Inductive Logic Programming (SVILP) on a standard molecular benchmark dataset comprising 11 activity classes and about 102,000 structures. The Naive Bayes Classifier treats features independently while ILP combines structural fragments, and then creates new features with higher predictive power. SVILP is a very recently presented method which adds a support vector machine after common ILP procedures.
BMC Bioinformatics
January 2007
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Rd, Cambridge CB2 1EW, UK.
Background: The need for fast and accurate scoring functions has been driven by the increased use of in silico virtual screening twinned with high-throughput screening as a method to rapidly identify potential candidates in the early stages of drug development. We examine the ability of some of the most common scoring functions (GOLD, ChemScore, DOCK, PMF, BLEEP and Consensus) to discriminate correctly and efficiently between active and non-active compounds among a library of approximately 3,600 diverse decoy compounds in a virtual screening experiment against heat shock protein 90 (Hsp90).
Results: Firstly, we investigated two ranking methodologies, GOLDrank and BestScorerank.
J Chem Inf Model
May 2007
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK.
Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueous solubility more accurately than those created by PLS, SVM, and ANN and offered methods for automatic descriptor selection, an assessment of descriptor importance, and an in-parallel measure of predictive ability, all of which serve to recommend its use. The prediction of log molar solubility for an external test set of 330 molecules that are solid at 25 degrees C gave an r2 = 0.
J Chem Inf Model
February 2007
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, United Kingdom.
We have applied the k-nearest neighbor (kNN) modeling technique to the prediction of melting points. A data set of 4119 diverse organic molecules (data set 1) and an additional set of 277 drugs (data set 2) were used to compare performance in different regions of chemical space, and we investigated the influence of the number of nearest neighbors using different types of molecular descriptors. To compute the prediction on the basis of the melting temperatures of the nearest neighbors, we used four different methods (arithmetic and geometric average, inverse distance weighting, and exponential weighting), of which the exponential weighting scheme yielded the best results.
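The four ways of combining neighbor melting points named above can be sketched as follows. The exact functional forms used in the paper are not given in the abstract, so the weights `1/d` and `exp(-d)` below, the small constant guarding against zero distance, and the function name are assumptions for illustration only.

```python
import math

def knn_predict(neighbours, scheme="exponential"):
    """Combine the melting points of the k nearest neighbours.

    neighbours: list of (distance, melting_point_in_kelvin) pairs.
    """
    values = [tm for _, tm in neighbours]
    if scheme == "arithmetic":
        return sum(values) / len(values)
    if scheme == "geometric":
        # Geometric mean via logs; requires strictly positive temperatures.
        return math.exp(sum(math.log(v) for v in values) / len(values))
    if scheme == "inverse_distance":
        # Small constant avoids division by zero for identical descriptors.
        weights = [1.0 / (d + 1e-9) for d, _ in neighbours]
    elif scheme == "exponential":
        # Assumed form: weight decays exponentially with descriptor distance.
        weights = [math.exp(-d) for d, _ in neighbours]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)
```

Both weighted schemes pull the prediction toward the closest neighbors; the exponential form does so more strongly as distances grow.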
J Chem Inf Model
February 2007
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom.
Representative molecules from 10 classes of prohibited substances were taken from the World Anti-Doping Agency (WADA) list, augmented by molecules from corresponding activity classes found in the MDDR database. Together with some explicitly allowed compounds, these formed a set of 5245 molecules. Five types of fingerprints were calculated for these substances.
Org Lett
September 2006
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom.
We report theoretical studies into the remote 1,5-stereoinduction shown by certain types of beta-alkoxy methyl ketones in boron-mediated aldol reactions with achiral aldehydes. For a range of common alkoxy groups, our calculations are in excellent agreement with experimentally observed diastereoselectivities. In the aldol transition structures, a stabilizing hydrogen bond between the alkoxy oxygen and formyl proton leads to preferential formation of the 1,5-anti adduct, by minimizing steric interactions between the beta-alkyl group and one of the ligands on boron.
J Chem Inf Model
September 2006
Unilever Centre for Molecular Science Informatics, University Chemical Laboratory, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK.
A program to predict organic reactions, ROBIA, has been developed. It achieves reaction prediction on the basis of coded rules and molecular modeling calculations, generating possible transition states, intermediates, and products given the starting material and reaction conditions. The program generates all possible reaction pathways on the basis of the selected transformations within its database, then evaluates them and selects the most feasible ones.
IDrugs
March 2006
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK.
Circular fingerprints -- the representation of molecular structures by atom neighborhoods -- have been applied to a wide range of applications, such as similarity searching and the prediction of absorption, distribution, metabolism, excretion and toxicity properties. In recent years there has been a surge in applications resulting from the superior performance of circular fingerprints in comparative studies. This feature examines the nature of circular fingerprints as well as their applications, including virtual screening, metabolism prediction and the estimation of pKa constants.
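The idea of describing a molecule by its atom neighborhoods can be sketched on a plain molecular graph. The example below is a simplified illustration, assuming element symbols as atom types and a string encoding of each neighborhood layer; production fingerprints typically use richer atom typing and hashing, and the function names here are hypothetical.

```python
def atom_environments(elements, bonds, radius=2):
    """Return one feature string per atom describing its neighbourhood layers.

    elements: list of element symbols, one per heavy atom.
    bonds: list of (i, j) index pairs.
    """
    neighbours = {i: set() for i in range(len(elements))}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)
    features = set()
    for start in range(len(elements)):
        visited = {start}
        frontier = [start]
        layers = [elements[start]]
        for _ in range(radius):
            next_frontier = []
            for atom in frontier:
                for nb in neighbours[atom]:
                    if nb not in visited:
                        visited.add(nb)
                        next_frontier.append(nb)
            # Sorting makes the layer string independent of atom numbering.
            layers.append(",".join(sorted(elements[i] for i in next_frontier)))
            frontier = next_frontier
        features.add("|".join(layers))
    return features

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

Two molecules can then be compared by passing their feature sets to `tanimoto`, which is the usual basis for similarity searching with set-valued fingerprints.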
Annu Rep Comput Chem
November 2006
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK.
This chapter discusses recent developments in some of the areas that exploit the molecular similarity principle, describes novel approaches to capture molecular properties by the use of novel descriptors, focuses on a crucial aspect of computational models, namely their validity, and discusses additional ways to examine available data, such as those from high-throughput screening (HTS) campaigns, and to gain more knowledge from these data. The chapter also presents some recent applications of the methods discussed, focusing on the successes of virtual screening applications, database clustering and comparisons (such as drug- and in-house-likeness), and the recent large-scale validations of docking and scoring programs. While a great number of descriptors and modeling methods have been proposed to date, the recent trend toward proper model validation is very much appreciated.
Bioinformatics
December 2005
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
Summary: MACiE (mechanism, annotation and classification in enzymes) is a publicly available web-based database, held in CMLReact (an XML application), that aims to help our understanding of the evolution of enzyme catalytic mechanisms and also to create a classification system which reflects the actual chemical mechanism (catalytic steps) of an enzyme reaction, not only the overall reaction.
Availability: http://www-mitchell.ch.
J Chem Inf Model
December 2005
Unilever Centre for Molecular Science Informatics, Chemistry Department, University of Cambridge, Cambridge CB2 1EW, United Kingdom.
We have performed virtual screening using some very simple features, by employing the number of atoms per element as molecular descriptors but without regard to any structural information whatsoever. Surprisingly, these atom counts are able to outperform virtual-affinity-based fingerprints and Unity fingerprints in some activity classes. Although molecular weight and other biases were known in target-based virtual screening settings (docking), we report the effect of using very simple descriptors for ligand-based virtual screening, by using clearly defined biological targets and employing a large data set (>100,000 compounds) containing multiple (11) activity classes.
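A descriptor built only from per-element atom counts can be sketched as below. The formula parser, the Manhattan distance used for comparison and the function names are assumptions chosen for illustration; the abstract does not state how the counts were compared.

```python
import re
from collections import Counter

def atom_counts(formula):
    """Count atoms per element in a simple molecular formula such as 'C9H8O4'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def count_distance(a, b):
    """Manhattan distance between two atom-count descriptors (assumed metric)."""
    return sum(abs(a[e] - b[e]) for e in set(a) | set(b))
```

Because no structural information is used, isomers such as ethanol and dimethyl ether (both C2H6O) receive identical descriptors and a distance of zero.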
J Biomol Screen
October 2005
Unilever Centre for Molecular Science Informatics, Department of Chemistry, Lensfield Road, University of Cambridge, Cambridge CB2 1EW, United Kingdom.
A fragment-based similarity searching method, MOLPRINT 2D, was employed for virtual screening of Escherichia coli dihydrofolate reductase inhibitors. Using the original training set of 50,000 compounds, only marginal enrichment factors (between 1 and 3) could be achieved on the test library. The active structures contained in the training and test libraries represented different types of "chemistry", that is, different substructural features associated with activity.
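For readers unfamiliar with the metric, an enrichment factor is commonly defined as the hit rate in the top-ranked fraction of a screened library divided by the hit rate in the whole library. The sketch below uses that common definition; the abstract does not specify which fraction or variant was used, so the `fraction` parameter and function name are assumptions.

```python
def enrichment_factor(ranked_labels, fraction=0.01):
    """Enrichment factor for a ranked screening list.

    ranked_labels: list of 1 (active) or 0 (inactive), best-scored compound first.
    fraction: share of the library inspected (assumed parameter).
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    if n_actives == 0:
        raise ValueError("library contains no actives")
    n_selected = max(1, int(round(n_total * fraction)))
    hits = sum(ranked_labels[:n_selected])
    # Hit rate among the selected compounds relative to the overall hit rate.
    return (hits / n_selected) / (n_actives / n_total)
```

A value of 1 corresponds to random selection, so enrichment factors between 1 and 3 indicate only a modest improvement over picking compounds at random.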
Org Lett
August 2005
University Chemical Laboratory, Unilever Centre for Molecular Science Informatics, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
A reaction prediction program, ROBIA, has been developed. This interactive computer program predicts the products of organic reactions from the starting materials and the reaction conditions, on the basis of the selected transformations within its database. This mechanistic approach generates a large number of products, from which the most important are selected using filters and molecular modeling calculations.