The de novo assembly of next-generation sequencing data is a daunting task made more difficult by the presence of genomic repeats or transposable elements, resulting in an increasing number of genomes designated as completed draft assemblies. We created and assembled idealized sequence data sets for Cupriavidus metallidurans CH34, Caulobacter sp. K31, Gramella forsetii KT0803, Rhodobacter sphaeroides 2.4.1, and Bordetella bronchiseptica RB50. In addition to confirming the role of transposable elements in interrupting the assemblies, we found an association between the most fragmented regions and known or predicted genomic islands in these strains. Assembly quality was more strongly related to putative genomic island content than to any other factor examined. We believe this association indicates that draft assemblies are limiting our ability to understand the genomic context of important bacterial adaptations and that the increased effort required for finishing genomes can provide a wealth of information for future studies.
DOI: http://dx.doi.org/10.1016/j.ygeno.2012.06.009