Resolving the structural features of genomic islands: a machine learning approach.

Genome Res

The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.

Published: February 2008

Large inserts of horizontally acquired DNA that contain functionally related genes with limited phylogenetic distribution are often referred to as genomic islands (GIs), and structural definitions of these islands, based on common features, have been proposed. Although a large number of mobile elements fall well within the GI definition, there are several concerns about the structural consensus for GIs: The current GI definition was put forward 10 yr ago when only 12 complete bacterial genomes were available, a large number of GIs deviate from that definition, and in silico predictions assuming a full/partial GI structural model bias the sampling of the GI structural space toward "well-structured" GIs. In this study, the structural features of genomic regions are sampled by a hypothesis-free, bottom-up search, and these are exploited in a machine learning approach with the aim of explicitly quantifying and modeling the contribution of each feature to the GI structure. Performing a whole-genome-based comparative analysis between 37 strains of three different genera and 12 outgroup genomes, 668 genomic regions were sampled and used to train structural GI models. The data show that, overall, GIs from the three different genera fall into distinct, genus-specific structural families. However, decreasing the taxa resolution, by studying GI structures across different genus boundaries, provides models that converge on a fairly similar GI structure, further suggesting that GIs can be seen as a superfamily of mobile elements, with core and variable structural features, rather than a well-defined family.
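The idea of quantifying each feature's contribution to the GI structure can be illustrated with a small sketch. The abstract does not name the learning method or list the features, so everything here is an assumption made for illustration: the model is plain logistic regression used as a stand-in, the feature names (`gc_deviation`, `near_trna`, `has_integrase`, `noise`) are hypothetical, and the regions are synthetic toy data, not the 668 regions from the study. The fitted weight of each feature serves as a rough measure of how much it contributes to separating GI-like from non-GI-like regions.

```python
import math
import random

# Hypothetical feature names; the abstract does not list the features used.
FEATURES = ["gc_deviation", "near_trna", "has_integrase", "noise"]


def make_toy_regions(n, seed=0):
    """Generate synthetic (features, label) pairs. This is NOT real genomic data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        is_gi = rng.random() < 0.5
        gc = rng.gauss(1.0 if is_gi else 0.0, 1.0)           # compositional shift
        trna = 1.0 if rng.random() < (0.7 if is_gi else 0.2) else 0.0
        integrase = 1.0 if rng.random() < (0.6 if is_gi else 0.1) else 0.0
        noise = rng.gauss(0.0, 1.0)                          # uninformative control
        rows.append(([gc, trna, integrase, noise], 1 if is_gi else 0))
    return rows


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, epochs=500, lr=0.5):
    """Fit logistic regression by batch gradient descent; return (weights, bias)."""
    w = [0.0] * len(FEATURES)
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in rows:
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            err = p - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def feature_contributions(rows):
    """Map each feature name to its fitted weight."""
    w, _ = train_logistic(rows)
    return dict(zip(FEATURES, w))


if __name__ == "__main__":
    contributions = feature_contributions(make_toy_regions(400))
    # Print features from largest to smallest absolute weight.
    for name, weight in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
        print(f"{name:15s} {weight:+.3f}")
```

Comparing weights in this way is only meaningful when the features are on similar scales, which the toy data is built to satisfy. Fitting the same kind of model separately per genus and then on pooled data would mirror the genus-specific versus cross-genus comparison described above.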

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2203631
DOI: http://dx.doi.org/10.1101/gr.7004508