Background: In systems biology, there is an acute need for integrative approaches in heterogeneous network mining in order to exploit the continuous flux of genomic data. Simultaneous analysis of the metabolic pathways and genomic context of a given species leads to the identification of patterns consisting of reaction chains catalyzed by products of neighboring genes. Similar patterns across several species can reveal their mode of conservation throughout the tree of life.
Background: Enzymes belonging to mechanistically diverse superfamilies often display similar catalytic mechanisms. We previously observed such an association in the case of the cyclic amidohydrolase superfamily whose members play a role in related steps of purine and pyrimidine metabolic pathways. To establish a possible link between enzyme homology and chemical similarity, we investigated further the neighbouring steps in the respective pathways.
Dihydroorotases are universal proteins catalyzing the third step of pyrimidine biosynthesis. These zinc metalloenzymes belong to the superfamily of cyclic amidohydrolases, comprising also other enzymes that are involved in degradation of either purines (allantoinases), pyrimidines (dihydropyrimidinases) or hydantoins (hydantoinases). The evolutionary relationships between these mechanistically related enzymes were estimated after designing a method to build an accurate multiple sequence alignment.
Background: More and more completely sequenced fungal genomes are becoming available and many more sequencing projects are in progress. This deluge of data should improve our knowledge of the various primary and secondary metabolisms of Fungi, including their synthesis of useful compounds such as antibiotics or toxic molecules such as mycotoxins. Functional annotation of many fungal genomes is imperfect, especially of genes encoding enzymes, so we need dedicated tools to analyze their metabolic pathways in depth.
Contrary to a widespread opinion, horizontal gene transfer (HGT) between distantly related microorganisms (such as Bacteria and Archaea) has not been demonstrated to occur on a large scale. Except for transfer of mobile elements between closely related organisms, most alleged HGT events reflect phylogenetic discrepancies that can be explained by a variety of artefacts or by the differential loss of paralogous gene copies either originally present in the Last Universal Common Ancestor (LUCA) to the three Domains (a sophisticated, genetically redundant and promiscuous community of protoeukaryotes), or created by duplications having occurred at later times. Besides, (i) there is no experimental evidence for the facile acquisition of foreign DNA between distant taxa and (ii) important biological constraints operate on the phenotypic success of genetic exchange at several levels, including protein-protein interactions involved in metabolic channelling; stable integration and expression of foreign DNA is, therefore, expected to require strong selection.
A complete tree with roots, trunk and crown remains an appropriate model to represent all steps of life's development, from the emergence of a unique genetic code up to the last universal common ancestor and its further radiation. Catalytic closure of a mixture of prebiotic polymers is a heuristic alternative to the RNA world. Conjectures about emergence of life in an infinite multiverse should not confuse probability with possibility.
BMC Bioinformatics
December 2008
Background: It has been repeatedly observed that gene order is rapidly lost in prokaryotic genomes. However, persistent synteny blocks are found when comparing more or less distant species. These genes that remain consistently adjacent are appealing candidates for the study of genome evolution and a more accurate definition of their functional role.
Background: Curated databases of completely sequenced genomes have been designed independently at the NCBI (RefSeq) and EBI (Genome Reviews) to cope with non-standard annotation found in the version of the sequenced genome that has been published by databanks GenBank/EMBL/DDBJ. These curation attempts were expected to review the annotations and to improve their pertinence when using them to annotate newly released genome sequences by homology to previously annotated genomes. However, we observed that such an uncoordinated effort has two unwanted consequences.
Comp Funct Genomics
June 2010
Pertinent evolutionary studies are based on a correct use of homology terms such as paralogues, metalogues and orthologues. Such crucial concepts have been applied to intragenomic and intergenomic analyses. A further requisite is a proper definition of what is a structural segment of homology.
Background: Since the reclassification of all life forms in three Domains (Archaea, Bacteria, Eukarya), the identity of their alleged forerunner (Last Universal Common Ancestor or LUCA) has been the subject of extensive controversies: progenote or already complex organism, prokaryote or protoeukaryote, thermophile or mesophile, product of a protracted progression from simple replicators to complex cells or born in the cradle of "catalytically closed" entities? We present a critical survey of the topic and suggest a scenario.
Results: LUCA does not appear to have been a simple, primitive, hyperthermophilic prokaryote but rather a complex community of protoeukaryotes with an RNA genome, adapted to a broad range of moderate temperatures, genetically redundant, morphologically and metabolically diverse. LUCA's genetic redundancy predicts loss of paralogous gene copies in divergent lineages to be a significant source of phylogenetic anomalies, i.
Motivation: We have to cope with both a deluge of new genome sequences and a huge amount of data produced by high-throughput approaches used to exploit these genomic features. Crossing and comparing such heterogeneous and disparate data will help improve functional annotation of genomes. This requires designing elaborate integration systems such as warehouses for storing and querying these data.
Background: Comparison of completely sequenced microbial genomes has revealed how fluid these genomes are. Detecting synteny blocks requires reliable methods for determining the orthologs among the whole set of homologs detected by exhaustive comparisons between each pair of completely sequenced genomes. This is a complex and difficult problem in the field of comparative genomics but will help to better understand the way prokaryotic genomes are evolving.
The incredible development of comparative genomics during the last decade has required a correct use of the concept of homology that was previously utilized only by evolutionary biologists. Unfortunately, this concept has often been misunderstood and thus misused when exploited outside its evolutionary context. This review returns to the correct definition of homology and explains how this definition has been progressively refined in order to adapt it to the various new kinds of analysis of gene properties and of their products that appear with the progress of comparative genomics.
Major aspects of the pathway of de novo arginine biosynthesis via acetylated intermediates in microorganisms must be revised in light of recent enzymatic and genomic investigations. The enzyme N-acetylglutamate synthase (NAGS), which used to be considered responsible for the first committed step of the pathway, is present in a limited number of bacterial phyla only and is absent from Archaea. In many Bacteria, shorter proteins related to the Gcn5-related N-acetyltransferase family appear to acetylate l-glutamate; some are clearly similar to the C-terminal, acetyl-coenzyme A (CoA) binding domain of classical NAGS, while others are more distantly related.
Background: Despite the current availability of several hundreds of thousands of amino acid sequences, more than 36% of the enzyme activities (EC numbers) defined by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) are not associated with any amino acid sequence in major public databases. This wide gap separating knowledge of biochemical function and sequence information is found for nearly all classes of enzymes. Thus, there is an urgent need to explore these sequence-less EC numbers, in order to progressively close this gap.
Despite the immense progress of genomics, and the current availability of several hundreds of thousands of amino acid sequences, >39% of well-defined enzyme activities (as represented by enzyme commission, EC, numbers) are not associated with any sequence. There is an urgent need to explore the 1525 orphan enzymes (enzymes having EC numbers without an associated sequence) to bridge the wide gap that separates knowledge of biochemical function and sequence information. Strikingly, orphan enzymes can even be found among enzymatic activities successfully used as drug targets.
Despite the current availability of several hundreds of thousands of amino acid sequences, more than 39% of the well-defined enzyme activities (EC numbers) are not associated with any sequence in major public databases. This wide gap separating knowledge of biochemical function and sequence information is found in nearly all classes of enzymes. Thus, there is an urgent need to explore the 1525 orphan enzymes (EC numbers without associated sequences), in order to progressively bridge this unwanted gap.
Background: The N-acetylation of L-glutamate is regarded as a universal metabolic strategy to commit glutamate towards arginine biosynthesis. Until recently, this reaction was thought to be catalyzed by either of two enzymes: (i) the classical N-acetylglutamate synthase (NAGS, gene argA) first characterized in Escherichia coli and Pseudomonas aeruginosa several decades ago and also present in vertebrates, or (ii) the bifunctional version of ornithine acetyltransferase (OAT, gene argJ) present in Bacteria, Archaea and many Eukaryotes. This paper focuses on a new and surprising aspect of glutamate acetylation.
Background: Annotating genomes remains a hazardous task. Mistakes or gaps in such a complex process may occur when relevant knowledge is ignored, whether lost, forgotten or overlooked. This paper exemplifies an approach which could help to resuscitate such meaningful data.
Many evolutionary scenarios describing the history of proteins are based solely on phylogenetic studies. We have designed a new approach that allows ascertainment of such questionable scenarios by taking into account quaternary structures: we used aspartate carbamoyltransferase (ATCase) as a case study. Prokaryotic ATCases correspond to different classes of quaternary structures according to the mode of association of the catalytic PyrB subunit with other polypeptides, either the PyrI regulatory subunit (class B) or a dihydroorotase (class A), which may be active (PyrC, subclass A1) or inactive (PyrC', subclass A2).
No gene coding for an adenine deaminase has been described in eukaryotes. However, physiological and genetic evidence indicates that adenine deaminases are present in the ascomycetes. We have cloned and characterised the genes coding for the adenine deaminases of Aspergillus nidulans, Saccharomyces cerevisiae and Schizosaccharomyces pombe.
Protein homology is often limited to long structural segments that we have previously called modules. We describe here a suite of programs used to catalog the whole set of modules present in microbial proteomes. First, the Darwin AllAll program detects homologous segments using thresholds for evolutionary distance and alignment length, and another program classifies these modules.
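The thresholding step mentioned above can be illustrated with a minimal sketch. It is not the Darwin AllAll program or its interface: the `SegmentHit` record, the `filter_hits` helper, the distance units and both threshold values are assumptions chosen only to show how a pairwise hit might be retained when it passes both an evolutionary-distance cutoff and an alignment-length cutoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentHit:
    """One pairwise alignment between segments of two proteins (hypothetical record)."""
    query: str
    target: str
    distance: float  # evolutionary distance estimate; units are an assumption
    length: int      # aligned length in residues


def filter_hits(hits, max_distance, min_length):
    """Keep only hits that are close enough AND long enough."""
    return [
        h for h in hits
        if h.distance <= max_distance and h.length >= min_length
    ]


# Illustrative thresholds, not values taken from the source.
MAX_DISTANCE = 250.0
MIN_LENGTH = 80

hits = [
    SegmentHit("protA", "protB", 120.0, 210),  # passes both thresholds
    SegmentHit("protA", "protC", 310.0, 190),  # too distant
    SegmentHit("protB", "protD", 90.0, 45),    # too short
]

kept = filter_hits(hits, MAX_DISTANCE, MIN_LENGTH)
for h in kept:
    print(h.query, h.target, h.distance, h.length)
```

Requiring both conditions together reflects the idea that a module is a long homologous segment: a short match is rejected even when the estimated distance is small.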