Background: Sequence-binning techniques enable the recovery of an increasing number of genomes from complex microbial metagenomes and typically require prior metagenome assembly, incurring the computational cost and drawbacks of the latter, e.g., biases against low-abundance genomes and inability to conveniently assemble multi-terabyte datasets.
Results: We present here a scalable pre-assembly binning scheme (i.e., operating on unassembled short reads) enabling latent genome recovery by leveraging sparse dictionary learning and elastic-net regularization, and its use to recover hundreds of metagenome-assembled genomes, including very low-abundance genomes, from a joint analysis of microbiomes from the LifeLines DEEP population cohort (n = 1,135, >10^10 reads).
Conclusion: We showed that sparse coding techniques can be leveraged to carry out read-level binning at large scale and that, despite lower genome reconstruction yields compared to assembly-based approaches, bin-first strategies can complement the more widely used assembly-first protocols by targeting distinct genome segregation profiles. Read enrichment levels across 6 orders of magnitude in relative abundance were observed, indicating that the method has the power to recover genomes consistently segregating at low levels.
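As a rough illustration of the sparse-coding idea named in the abstract, the sketch below codes each feature's abundance profile across samples as a sparse combination of dictionary atoms under an elastic-net penalty, then assigns the feature to its dominant atom. It is a toy under stated assumptions, not the authors' pipeline: the dictionary `D` is given rather than learned, the solver is plain proximal gradient descent (ISTA), and the function names, penalty weights, and simulated data are all hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def elastic_net_code(x, D, l1=0.05, l2=0.01, n_iter=500):
    """Minimize 0.5*||x - D a||^2 + l1*||a||_1 + 0.5*l2*||a||^2 with ISTA.

    x: abundance profile of one feature across samples (n_samples,)
    D: dictionary with one column per latent profile (n_samples, n_atoms)
    """
    a = np.zeros(D.shape[1])
    # Lipschitz constant of the smooth part (least squares plus ridge term).
    lipschitz = np.linalg.norm(D, 2) ** 2 + l2
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x) + l2 * a
        a = soft_threshold(a - grad / lipschitz, l1 / lipschitz)
    return a

def assign_bins(X, D, **kwargs):
    """Code every column of X and bin it by its largest-magnitude coefficient."""
    codes = np.array([elastic_net_code(x, D, **kwargs) for x in X.T])
    return codes, np.argmax(np.abs(codes), axis=1)

# Simulated data (hypothetical): 3 latent abundance profiles over 20 samples,
# and 30 features that each follow one profile at a random scale plus noise.
rng = np.random.default_rng(0)
n_samples, n_atoms, n_features = 20, 3, 30
D = rng.random((n_samples, n_atoms))
D /= np.linalg.norm(D, axis=0)  # unit-norm atoms
true_bin = rng.integers(0, n_atoms, size=n_features)
scale = rng.uniform(1.0, 5.0, size=n_features)
X = D[:, true_bin] * scale + 0.01 * rng.standard_normal((n_samples, n_features))

codes, bins = assign_bins(X, D)
print("fraction of features assigned to their source profile:",
      (bins == true_bin).mean())
```

In a full dictionary-learning setting the atoms would be estimated jointly with the codes; fixing `D` here keeps the example short and isolates the elastic-net coding step.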
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7099633 | PMC |
http://dx.doi.org/10.1093/gigascience/giaa028 | DOI Listing |
Plant Divers
November 2024
CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, PR China.
Genome skimming has dramatically extended DNA barcoding from short DNA fragments to next-generation barcodes in plants. However, conserved DNA barcoding markers, including complete plastid genome and nuclear ribosomal DNA (nrDNA) sequences, are inadequate for accurate species identification. Skmer, a recently proposed approach that estimates genetic distances among species based on unassembled genome skims, has been suggested to effectively improve the species discrimination rate.
The naked mole-rat (NMR) is a eusocial subterranean rodent with a highly unusual set of physiological traits that has attracted great interest amongst the scientific community. However, the genetic basis of most of these traits has not been elucidated. To facilitate our understanding of the molecular mechanisms underlying NMR physiology and behaviour, we generated a long-read chromosomal-level genome assembly of the NMR.
BMC Genomics
November 2024
College of Life Sciences, Shaanxi Normal University, Xi'an, China.
Background: In evolutionary biology, identifying and quantifying inter-lineage genome size variation and elucidating the underlying causes of that variation have long been goals. Repetitive elements (REs) have been proposed and confirmed as being among the most important contributors to genome size variation. However, the evolutionary implications of genome size variation and RE dynamics are not well understood.
mSystems
November 2024
Institute of Medical Microbiology and Hospital Hygiene, Heinrich Heine University, Düsseldorf, Germany.
BMC Genomics
January 2024
Faculty of Biology, Technische Universität Dresden, D-01069, Dresden, Germany.
Background: Despite the many cheap and fast ways to generate genomic data, accurate genome assembly is still a problem, with repeats in particular being vastly underrepresented and often misassembled. As low-coverage short reads are already sufficient to represent the repeat landscape of any given genome, many read-clustering algorithms have been proposed that provide repeat identification and classification. But how can trustworthy, reliable and representative repeat consensuses be derived from unassembled genomes?
Results: Here, we combine methods from repeat identification and genome assembly to derive these robust consensuses.