The ability to search for a query molecule in massive molecular repositories is fundamental to chemoinformatics and drug discovery. Chemical fingerprints are commonly used to characterize the structure and properties of molecules. Some fingerprints, particularly unfolded fingerprints, are often extremely high-dimensional and sparse, with only a few features having a positive value.
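To make the sparsity point concrete, here is a minimal sketch of how such a fingerprint might be stored and queried. It assumes each fingerprint is kept as the set of feature indices with a positive value, and it uses Tanimoto (Jaccard) similarity with a plain linear scan. The abstract names neither a similarity measure nor a search method, so both are illustrative assumptions, and the names `tanimoto`, `top_k`, and `REPOSITORY` are hypothetical.

```python
import heapq
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a sparse fingerprint is the set of feature indices that
# have a positive value. Indices may be very large (unfolded fingerprints),
# but only the few "on" features are stored.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two sparse fingerprints."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def top_k(query: Fingerprint,
          repository: Dict[str, Fingerprint],
          k: int) -> List[Tuple[str, float]]:
    """Return the k most similar molecules by a simple linear scan."""
    scored = ((name, tanimoto(query, fp)) for name, fp in repository.items())
    return heapq.nlargest(k, scored, key=lambda item: item[1])


# Hypothetical toy repository; the feature indices are made up.
REPOSITORY: Dict[str, Fingerprint] = {
    "mol_a": frozenset({3, 42, 1_000_003}),
    "mol_b": frozenset({3, 7}),
    "mol_c": frozenset({99}),
}

QUERY: Fingerprint = frozenset({3, 42})
HITS = top_k(QUERY, REPOSITORY, k=2)
```

Storing only the positive indices keeps memory proportional to the number of set features rather than to the full dimensionality. A linear scan is the simplest baseline and would not scale to massive repositories without an index.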
The inherent conflict between noncovalent interactions and the large conformational entropy of the polypeptide chain forces folding reactions and their mechanisms to deviate significantly from chemical reactions. Accordingly, measures of structure in the transition state ensemble (TSE) are strongly influenced by the underlying distributions of microscopic folding pathways that are challenging to discern experimentally. Here, we present a detailed analysis of 150,000 folding transition paths of five proteins at three different thermodynamic conditions from an experimentally consistent statistical mechanical model.
Motivation: The ability to predict pathways for biosynthesis of metabolites is very important in metabolic engineering. It is possible to mine the repertoire of biochemical transformations from reaction databases, and apply the knowledge to predict reactions to synthesize new molecules. However, this usually involves a careful understanding of the mechanism and the knowledge of the exact bonds being created and broken.
How many structurally different microscopic routes are accessible to a protein molecule while folding? This has been a challenging question to address experimentally as single-molecule studies are constrained by the limited number of observed folding events while ensemble measurements, by definition, report only an average and not the distribution of the quantity under study. Atomistic simulations, on the other hand, are restricted by sampling and the inability to reproduce thermodynamic observables directly. We overcome these bottlenecks in the current work and provide a quantitative description of folding pathway heterogeneity by developing a comprehensive, scalable and yet experimentally consistent approach combining concepts from statistical mechanics, physical kinetics and graph theory.
We show that the phosphorylation of 4E-BP2 acts as a triggering event to shape its folding-function landscape, which is delicately balanced between conflicting favorable energetics and intrinsically unfavorable topological connectivity. We further provide the first evidence that the fitness landscapes of proteins at the threshold of disorder can differ considerably from those of ordered domains.
Identifying the overrepresented substructures from a set of molecules with similar activity is a common task in chemical informatics. Existing substructure miners are deterministic, requiring the activity of all mined molecules to be known with high confidence. In contrast, we introduce pGraphSig, a probabilistic structure miner, which effectively mines structures from noisy data, where many molecules are labeled with their probability of being active.
J Chem Inf Model
May 2011
We propose a novel method for pharmacophore analysis by examining the Joint Pharmacophore Space of chemical compounds, targets, and chemical/biological properties. The proposed approach is a notable deviation from existing techniques, which analyze compounds on a target-by-target basis with the aim of extracting and optimizing a specific pharmacophore. The underlying geometry of the pharmacophores is responsible for binding between compounds and targets as well as for properties of compounds such as blood-brain barrier permeability.
J Chem Inf Model
November 2009
The increased availability of large repositories of chemical compounds has created new challenges in designing efficient molecular querying and mining systems. Molecular classification is an important problem in drug development where libraries of chemical compounds are screened and molecules with the highest probability of success against a given target are selected. We have developed a technique called GraphSig to mine significantly over-represented molecular substructures in a given class of molecules.