High-quality genome assemblies are crucial to many biological studies, and utilizing long sequencing reads can help achieve higher assembly contiguity. While long reads can resolve complex and repetitive regions of a genome, their relatively high associated error rates are still a major limitation. Long reads generally produce draft genome assemblies with lower base quality, which must be corrected with a genome polishing step. Hybrid genome polishing solutions can greatly improve the quality of long-read genome assemblies by utilizing more accurate short reads to validate bases and correct errors. Currently available hybrid polishing methods rely on read alignments, and are therefore memory-intensive and do not scale well to large genomes. Here we describe ntEdit+Sealer, an alignment-free, k-mer-based genome finishing protocol that employs memory-efficient Bloom filters. The protocol includes ntEdit for correcting base errors and small indels, and for marking potentially problematic regions, then Sealer for filling both assembly gaps and problematic regions flagged by ntEdit. ntEdit+Sealer produces highly accurate, error-corrected genome assemblies, and is available as a Makefile pipeline from https://github.com/bcgsc/ntedit_sealer_protocol.
© 2022 The Authors. Current Protocols published by Wiley Periodicals LLC.
Basic Protocol: Automated long-read genome finishing with short reads
Support Protocol: Selecting optimal values for k-mer lengths (k) and Bloom filter size (b)
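The alignment-free idea in the abstract can be illustrated with a toy sketch: k-mers from accurate short reads are stored in a Bloom filter, and positions in a draft assembly whose k-mers are absent from the filter are treated as candidate errors. This is not the ntEdit or Sealer implementation; the class and function names below are hypothetical, a general-purpose hash from the Python standard library stands in for whatever hashing the real tools use, and reverse-complement handling is omitted for brevity. The last function applies the standard Bloom filter false-positive approximation, which is the kind of trade-off involved when choosing a filter size.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter for strings (illustrative only, not ntEdit's code)."""

    def __init__(self, size_bits, num_hashes):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting BLAKE2b with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def kmers(seq, k):
    """Yield every length-k substring of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_read_filter(reads, k, size_bits=1 << 16, num_hashes=3):
    """Insert all k-mers from the (assumed accurate) short reads."""
    bf = BloomFilter(size_bits, num_hashes)
    for read in reads:
        for kmer in kmers(read, k):
            bf.add(kmer)
    return bf


def unsupported_kmer_starts(draft, bf, k):
    """Start positions of draft k-mers that the read filter does not contain."""
    return [i for i, kmer in enumerate(kmers(draft, k)) if kmer not in bf]


def expected_false_positive_rate(n_items, size_bits, num_hashes):
    """Standard approximation: (1 - exp(-h * n / m)) ** h."""
    return (1.0 - math.exp(-num_hashes * n_items / size_bits)) ** num_hashes


if __name__ == "__main__":
    reads = ["ACGTACGTGA", "GTACGTGACC"]      # made-up short reads
    bf = build_read_filter(reads, k=5)
    draft = "ACGTACCTGACC"                    # made-up draft with one substitution
    print(unsupported_kmer_starts(draft, bf, k=5))
    print(expected_false_positive_rate(8, 1 << 16, 3))
```

A run of consecutive unsupported k-mers brackets the suspect base; a real polisher would then test candidate edits against the same filter. A larger filter lowers the false-positive rate at the cost of memory, while the choice of k trades specificity against sensitivity to read errors.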
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9196995 | PMC |
| http://dx.doi.org/10.1002/cpz1.442 | DOI Listing |
BMC Genom Data
January 2025
Key Laboratory of State Forestry and Grassland Administration Conservation and Utilization of Warm Temperate Zone Forest and Grass Germplasm Resources, Shandong Provincial Center of Forest and Grass Germplasm Resources, Ji'nan, 250103, Shandong, China.
Objectives: Toona sinensis, commonly known as Chinese toon, is a perennial woody plant with significant economic and ecological importance. This study employed whole-genome resequencing of 180 T. sinensis samples collected from Shandong to analyze genetic variation and diversity, ultimately identifying 18,231 high-quality SNPs after rigorous quality control and linkage disequilibrium pruning.
BMC Plant Biol
January 2025
Institute of Tropical Horticulture Research, Hainan Academy of Agricultural Sciences, Haikou, 571100, China.
Background: Tea-oil Camellia within the genus Camellia is renowned for its premium Camellia oil, often described as "Oriental olive oil". So far, only one partial mitochondrial genome of Tea-oil Camellia has been published (none from the main Tea-oil Camellia cultivars), and comparative mitochondrial genomic studies of Camellia remain limited.
Results: In this study, we first reconstructed the entire mitochondrial genome of C.
BMC Genomics
January 2025
Department of Food, Bioprocessing, & Nutrition Sciences, North Carolina State University, Raleigh, NC, USA.
Background: The advent of next-generation sequencing technologies has enabled a surge in the number of whole-genome sequences in public databases and advanced our understanding of the composition and evolution of bacterial genomes. Besides model organisms and pathogens, some attention has been dedicated to industrial bacteria, notably members of the Lactobacillaceae family that are commonly studied and formulated as probiotic bacteria. Of particular interest is Lactobacillus acidophilus NCFM, an extensively studied strain that has been widely commercialized for decades and is being used for the delivery of vaccines and therapeutics.
Nat Genet
January 2025
Center for Genomics, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou, China.
Modern sugarcane, a highly allo-autopolyploid organism, has a very complex genome. In the present study, the karyotype and genome architecture of modern sugarcane were investigated, resulting in a genome assembly of 97 chromosomes (8.84 Gb).
Nat Commun
January 2025
Institute of Molecular Physiology, Shenzhen Bay Laboratory, Shenzhen, 518132, China.
The nucleosome is the basic structural unit of the genome. During processes like DNA replication and gene transcription, the conformation of nucleosomes undergoes dynamic changes, including DNA unwrapping and rewrapping, as well as histone disassembly and assembly. However, the wrapping characteristics of nucleosomes across the entire genome, including their region-specificity and their correlation with higher-order chromatin organization, remain to be studied.