Objectives: Smoking is considered a major risk factor for the onset and progression of periodontitis. In clinical studies, specific indicators have been used to characterize the smoking status of the patient, such as the number of cigarettes consumed (NCC), pack-years (PY), or the Fagerström Test for Nicotine Dependence (FTND). However, the available literature lacks data on the relationship between cotinine gingival intoxication and smoking indicators.
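As an illustration of one of the indicators named above, pack-years is conventionally computed as packs smoked per day multiplied by years of smoking. The sketch below assumes the usual convention of 20 cigarettes per pack; the function name and parameters are hypothetical and are not taken from the study.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float,
               cigarettes_per_pack: int = 20) -> float:
    """Return cumulative smoking exposure in pack-years.

    Assumes the conventional definition: (packs per day) x (years smoked),
    with 20 cigarettes per pack unless stated otherwise.
    """
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    if cigarettes_per_pack <= 0:
        raise ValueError("cigarettes_per_pack must be positive")
    packs_per_day = cigarettes_per_day / cigarettes_per_pack
    return packs_per_day * years_smoked


if __name__ == "__main__":
    # One pack a day for 10 years and half a pack a day for 30 years
    print(pack_years(20, 10))
    print(pack_years(10, 30))
```

Under this definition, a patient smoking half a pack a day for 30 years has a higher cumulative exposure than one smoking a full pack a day for 10 years, which is why PY and NCC can rank patients differently.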
We present a thorough analysis of the relation between amino acid sequence and local three-dimensional structure in proteins. A library of overlapping local structural prototypes was built using an unsupervised clustering approach called "hybrid protein model" (HPM). The HPM carries out a multiple structural alignment of local folds from a non-redundant protein structure databank encoded into a structural alphabet composed of 16 protein blocks (PBs).
The description of protein 3D structures can be performed through a library of 3D fragments, named a structural alphabet. Our structural alphabet is composed of 16 small protein fragments of 5 C alpha in length, called protein blocks (PBs). It allows an efficient approximation of the 3D protein structures and a correct prediction of the local structure.
Protein Peeling 2 (PP2) is a web server for the automatic identification of protein units (PUs) given the 3D coordinates of a protein. PUs are an intermediate level of protein structure description between protein domains and secondary structures. It is a new tool to better understand and analyze the organization of protein structures.
Motivation: Molecular evolution, which is classically assessed by comparison of individual proteins or genes between species, can now be studied by comparing co-expressed functional groups of genes. This approach, which better reflects the functional constraints on the evolution of organisms, can exploit the large amount of data generated by genome-wide expression analyses. However, it requires new methodologies to represent the data in a more accessible way for cross-species comparisons.
We developed a novel approach for predicting local protein structure from sequence. It relies on the Hybrid Protein Model (HPM), an unsupervised clustering method we previously developed. This model learns three-dimensional protein fragments encoded into a structural alphabet of 16 protein blocks (PBs).
Motivation: The aim of this study is to propose a new method to identify the small compact units that compose protein three-dimensional structures. These fragments, called 'protein units (PU)', are a new level of description to better understand and analyze the organization of protein structures. The method only works from the contact probability matrix, i.
Three-dimensional protein structures can be described with a library of 3D fragments that define a structural alphabet. We have previously proposed such an alphabet, composed of 16 patterns of five consecutive amino acids, called Protein Blocks (PBs). These PBs have been used to describe protein backbones and to predict local structures from protein sequences.
A statistical analysis of the PDB structures has led us to define a new set of small 3D structural prototypes called Protein Blocks (PBs). This structural alphabet includes 16 PBs, each defined by the (phi, psi) dihedral angles of 5 consecutive residues. The amino acid distributions observed in sequence windows encompassing these PBs are used to predict, by a Bayesian approach, the local 3D structure of proteins from their sequences alone.
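The Bayesian prediction step described above can be written out as a short sketch. The notation is an assumption made for illustration and is not taken from the paper: $S = (a_{-w}, \dots, a_{+w})$ is a sequence window of half-width $w$ centred on the residue of interest, $PB_k$ is one of the 16 protein blocks, and positions in the window are treated as conditionally independent given the block.

```latex
% Bayes' rule applied to a sequence window S and a protein block PB_k
P(PB_k \mid S) = \frac{P(S \mid PB_k)\, P(PB_k)}{P(S)}

% Assumed simplification: window positions are independent given the block,
% so the likelihood factorises over position-specific amino acid distributions
P(S \mid PB_k) \approx \prod_{j=-w}^{+w} P(a_j \mid PB_k,\ \text{position } j)

% Prediction: choose the block with the highest posterior probability
\widehat{PB} = \arg\max_{k \in \{1,\dots,16\}} P(PB_k \mid S)
```

Because $P(S)$ does not depend on $k$, the arg max can be taken over the numerator alone. The position-specific terms correspond to the amino acid distributions observed in the sequence windows around each PB.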
This study investigates the GM genetic relationships of 82 human populations, among which 10 represent original data, within and among the main broad geographic areas of the world. Different approaches are used: multidimensional scaling analysis and test for isolation by distance, to assess the correlation between genetic variation and spatial distributions; analysis of variance, to investigate the genetic structure at different hierarchical levels of population subdivision; genetic similarity map (geographic map distorted by available genetic information), to identify regions of high and low genetic variation; and minimal spanning network, to point out possible migration routes across continental areas. The results show that the GM polymorphism is characterized by one of the highest amounts of genetic variation observed so far among populations of different continents (Fct=0.
Background: Microarray technologies produce large amounts of data. Hierarchical clustering is commonly used to identify clusters of co-expressed genes. However, microarray datasets often contain missing values (MVs), which are a major drawback for the use of clustering methods.
Bioinformatics
February 2003
Motivation: Our aim is to develop a process that automatically defines a repertory of contiguous 3D protein structure fragments and can be used in homology modeling. We present here improvements to the method we introduced previously: the 'hybrid protein model' (de Brevern and Hazout, Theor. Chem. Acc.
Protein Blocks (PBs) comprise a structural alphabet of 16 protein fragments, each 5 Calpha long. They make it possible to approximate and correctly predict local protein three-dimensional (3D) structures. We have selected the 72 most frequent sequences of five PBs, which we call Structural Words (SWs).
The aim of this paper is to present a new approach, called 'Hybrid Chromosome Model' (HXM), which allows both the extraction of regions of similarity between two sequences and the compartmentalization of a set of DNA sequences. The principle of the method consists in compacting a set of sequences (split into fragments of fixed length) into a 'hybrid chromosome', which results from the stacking of all the sequence fragments. We have illustrated our approach on the 32 subtelomeres of Saccharomyces cerevisiae.
Motivation: Locating the regions of similarity in a genome requires the availability of appropriate tools such as 'Accelerated Search for SImilar Regions in Chromosomes' (ASSIRC; Vincens et al., Bioinformatics, 14, 715-725, 1998). The aim of this paper is to present different strategies for improving this program by distributing the operations and data to multiple processing units and to assess the efficiency of the different implementations in terms of running time as a function of the number of processing units.
Knowledge of the disulfide bonding state of the cysteines of proteins is of major interest in designing numerous molecular biology experiments and in predicting their three-dimensional structure. Previous methods using the information gained from aligned sets of sequences have achieved success rates of up to 82% in predicting the oxidation state of cysteines. In the present study, we assess the relative efficiency of different descriptors in predicting the cysteine disulfide bonding states.
MOSAIC is a set of tools for the segmentation of multiple aligned DNA sequences into homogeneous zones. The segmentation is based on the distribution of mutational events along the alignment. As an example, the analysis of one repeated sequence belonging to the subtelomeric regions of the yeast genome is presented.
By using an unsupervised cluster analyzer, we have identified a local structural alphabet composed of 16 folding patterns of five consecutive C(alpha) ("protein blocks"). The dependence that exists between successive blocks is explicitly taken into account. A Bayesian approach based on the relation protein block-amino acid propensity is used for prediction and leads to a success rate close to 35%.
The hidden Markov model (HMM) was used to identify recurrent short 3D structural building blocks (SBBs) describing protein backbones, independently of any a priori knowledge. Polypeptide chains are decomposed into a series of short segments defined by their inter-alpha-carbon distances. Basically, the model takes into account the sequentiality of the observed segments and assumes that each one corresponds to one of several possible SBBs.
PredAcc is a tool for predicting the solvent accessibility of protein residues from the sequence at different relative accessibility levels (0-55%). The prediction rate varies between 70.7% (for 25% relative accessibility) and 85.
Comput Appl Biosci
October 1997
Motivation: The approaches usually used for building large genetic maps consist of dividing the marker set into linkage groups and providing local orders that can be tested by multi-point linkage analysis. To deal with the limitations of these approaches, a strategy taking the marker set into account globally is defined.
Results: The paper presents a new approach called 'Bi-Dimensional Scaling Map' (BDS-Map) for inferring marker orders and distances in genetic maps, based on the use of an additional dimension, orthogonal to the map, into which markers are projected.
When analyzed by origin, the frequency of the G542X cystic fibrosis (CF) mutation (the second most common CF mutation in Europe after ΔF508) varies between population groups in Europe. We show here that the frequency of G542X varies among different towns or regions of origin, being lower in northeastern Europeans than in southwestern Europeans. The G542X mutation mapping that we have defined by a multiple regression of G542X frequencies covers 28 countries (53 geographic points) and is based on data from 50 laboratories.
We have studied the effect of backbone inaccuracy on the efficiency of protein side chain conformation prediction using rotamer libraries. The backbones were generated by randomly perturbing the crystallographic conformation of 12 proteins and exhibit C alpha r.m.
The apolipoprotein E gene (APOE) is located on chromosome 19. The three most common APOE alleles account for most of the corresponding peptide chain variations in most human populations. APOE*3 is the most common allele, coding for the product E3; APOE*2 codes for an Arg-158-->Cys substitution (E2), and APOE*4 codes for a Cys-112-->Arg product (E4).
We have collected 76 parent-offspring (CAG)n values in 60 French Huntington's disease (HD) pedigrees. The analysis of intergenerational alterations in CAG repeat length shows that there is a correlation between repeat instability and parental repeat length. Paternally inherited cases are characterized by a preferential trend towards an increase in range of repeat sizes in offspring of HD patients.