Inexpensive de novo genome sequencing, particularly in organisms with small genomes, is now possible using several new sequencing technologies. Some of these technologies, such as Illumina's Solexa sequencing, produce high genomic coverage by generating a very large number of short reads (approximately 30 bp). While prior work shows that partial assembly can be performed by k-mer extension in error-free reads, this algorithm is unsuccessful with the sequencing error rates found in practice. We present VCAKE (Verified Consensus Assembly by K-mer Extension), a modification of simple k-mer extension that overcomes sequencing errors by exploiting high coverage depth. Though it is a simple modification of a previous approach, we show significant improvements in assembly results on simulated and experimental datasets that include error.
Availability: http://152.2.15.114/~labweb/VCAKE
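The idea of extending a k-mer only when the covering reads agree can be illustrated with a minimal sketch. This is not the VCAKE implementation: the function name `extend_seed`, the parameters `min_votes` and `min_ratio`, and their default values are assumptions chosen for illustration. The sketch extends a seed one base at a time to the right, letting every read that contains the current k-mer suffix vote on the next base, and stops when the vote is too weak or too divided.

```python
from collections import Counter


def extend_seed(seed, reads, k=8, min_votes=2, min_ratio=0.6, max_len=1000):
    """Greedily extend `seed` to the right by majority vote among reads.

    Hypothetical illustration only; thresholds are assumed, not taken
    from the VCAKE paper.
    """
    contig = seed
    while len(contig) < max_len:  # guard against looping forever on repeats
        suffix = contig[-k:]
        votes = Counter()
        for read in reads:
            # Each occurrence of the suffix that is followed by a base votes.
            start = read.find(suffix)
            while start != -1:
                nxt = start + len(suffix)
                if nxt < len(read):
                    votes[read[nxt]] += 1
                start = read.find(suffix, start + 1)
        if not votes:
            break  # no read extends the current suffix
        base, count = votes.most_common(1)[0]
        total = sum(votes.values())
        # Stop if support is too thin or the reads disagree too much.
        if count < min_votes or count / total < min_ratio:
            break
        contig += base
    return contig


if __name__ == "__main__":
    genome = "ATGGCGTACCTTGAGACTCGATTAGCCATGTCAAGGTCCA"
    # Error-free 12 bp reads tiled at every position of the toy genome.
    reads = [genome[i:i + 12] for i in range(len(genome) - 11)]
    print(extend_seed(genome[:8], reads, k=8, min_votes=1))
```

With deep coverage, a single erroneous read is outvoted by the correct reads covering the same position, whereas a stricter `min_ratio` makes the extension stop at the first disagreement.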
DOI: http://dx.doi.org/10.1093/bioinformatics/btm451
Microbiome
October 2024
Quantitative and Computational Biology, Max-Planck Institute for Multidisciplinary Sciences, Göttingen, Germany.
Background: Metagenomics is a powerful approach to study environmental and human-associated microbial communities and, in particular, the role of viruses in shaping them. Viral genomes are challenging to assemble from metagenomic samples due to their genomic diversity caused by high mutation rates. In the standard de Bruijn graph assemblers, this genomic diversity leads to complex k-mer assembly graphs with a plethora of loops and bulges that are challenging to resolve into strains or haplotypes because variants more than the k-mer size apart cannot be phased.
Proc (IEEE Int Conf Healthc Inform)
June 2024
dept. of CISE, University of Florida, Gainesville, FL (USA).
A problem extension of the longest common substring (LCS) between two texts is the enumeration of all LCSs given a minimum length (ALCS- ), along with their positions in each text. In bioinformatics, an efficient solution to the ALCS- for very long texts -genomes or metagenomes- can provide useful insights to discover genetic signatures responsible for biological mechanisms. The ALCS- problem has two additional requirements compared to the LCS problem: one is the minimum length , and the other is that all common strings longer than must be reported.
G3 (Bethesda)
September 2024
Department of Horticulture, Washington State University, Pullman, WA, 99164, USA.
Genome sequencing for agriculturally important Rosaceous crops has made rapid progress both in completeness and annotation quality. Whole genome sequence and annotation gives breeders, researchers, and growers information about cultivar specific traits such as fruit quality and disease resistance, and informs strategies to enhance postharvest storage. Here we present a haplotype-phased, chromosomal level genome of Malus domestica, 'WA 38', a new apple cultivar released to market in 2017 as Cosmic Crisp®.
Genome Res
July 2023
Department of Mathematics, University of Toronto, Toronto, Ontario M5S 2E4, Canada.
Seed-chain-extend with k-mer seeds is a powerful heuristic technique for sequence alignment used by modern sequence aligners. Although effective in practice for both runtime and accuracy, theoretical guarantees on the resulting alignment do not exist for seed-chain-extend. In this work, we give the first rigorous bounds for the efficacy of seed-chain-extend with k-mers. Assume we are given a random nucleotide sequence of length ∼ that is indexed (or seeded) and a mutated substring of length ∼ ≤ with mutation rate θ < 0.
Plants (Basel)
December 2022
Jiangsu Academy of Forestry, 109 Danyang Road, Dongshanqiao, Nanjing 211153, China.
Sims. is an evergreen arbor species with high ornamental and medicinal value that is widely distributed in China. However, there is a lack of molecular and genomic data for this plant, which severely restricts the development of its relevant research.