A revised annotation and comparative analysis of Helicobacter pylori genomes.

Nucleic Acids Res

Unité de Pathogénie Bactérienne des Muqueuses, Institut Pasteur, Paris, France.

Published: March 2003

AI Article Synopsis

  • A large volume of genomic data is being produced, creating a need for organized and comparative databases for biologists, leading to the development of the PyloriGene database for Helicobacter pylori strains.
  • The database integrates sequencing information from two strains and reconciles differences in their functional annotations through a revised classification system that harmonizes how coding sequences (CDS) are assigned to functional categories.
  • Comparison of the two genomes revealed size variations between orthologous coding sequences, most of them due to natural polymorphisms, with 113 differences traced to different start codon assignments. Separately, the revised annotation reassigned the functions of 108 CDS (approximately 7% of all CDS).

Article Abstract

Huge amounts of genomic information are currently being generated. Therefore, biologists require structured, exhaustive and comparative databases. The PyloriGene database (http://genolist.pasteur.fr/PyloriGene) was developed to respond to these needs, by integrating and connecting the information generated during the sequencing of two distinct strains of Helicobacter pylori. This led to the need for a general annotation consensus, as the physical and functional annotations of the two strains differed significantly in some cases. A revised functional classification system was created to accommodate the existing data and to make it possible to classify coding sequences (CDS) into several functional categories to harmonize CDS classification. The annotation of the two complete genomes was revised in the light of new data, allowing us to reduce the percentage of hypothetical proteins from approximately 40 to 33%. This resulted in the reassignment of functions for 108 CDS (approximately 7% of all CDS). Interestingly, the functions of only approximately 13% of CDS (222 out of 1658 CDS) were annotated as a result of work done directly on H. pylori genes. Finally, comparison of the two published genomes revealed a significant amount of size variation between corresponding (orthologous) CDS. Most of these size variations were due to natural polymorphisms, although other sources of variation were identified, such as pseudogenes, new genes potentially regulated by a slipped-strand mispairing mechanism, or frame-shifts. Of these differences, 113 were due to different start codon assignments, a common problem when constructing physical annotations.
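The proportions quoted in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts given above (108, 222 and 1658 CDS). It assumes that the 1658 total is also the denominator for the 108 reassigned CDS, which the abstract does not state explicitly; the variable and function names are illustrative only.

```python
# Counts quoted in the abstract.
TOTAL_CDS = 1658             # total CDS given alongside the 222 figure
REASSIGNED_CDS = 108         # CDS whose function was reassigned
DIRECTLY_STUDIED_CDS = 222   # CDS annotated from work done directly on H. pylori genes


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Assumption: 1658 is the denominator for the reassigned CDS as well.
reassigned_pct = percent(REASSIGNED_CDS, TOTAL_CDS)
directly_studied_pct = percent(DIRECTLY_STUDIED_CDS, TOTAL_CDS)

print(f"Reassigned: {reassigned_pct:.1f}% of CDS")
print(f"Annotated from direct H. pylori work: {directly_studied_pct:.1f}% of CDS")
```

Under that assumption the two values come out near 6.5% and 13.4%, in line with the "approximately 7%" and "approximately 13%" reported in the abstract.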

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC152854
DOI: http://dx.doi.org/10.1093/nar/gkg250
