dna2bit: high performance genomic distance estimation software for microbial genome analysis.

Front Microbiol

Ministry of Education Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Sciences, Fudan University, Shanghai, China.

Published: December 2024

dna2bit is an ultra-fast software tool engineered specifically for microbial genome analysis, and it is particularly suited to calculating genome distances within metagenome and single amplified genome datasets. Unlike existing software such as Mash and Dashing, dna2bit uses a feature hashing technique and Hamming distance to improve speed and memory utilization without sacrificing the accuracy of average nucleotide identity calculations. It has promising applications in average nucleotide identity approximation, metagenomic sequence clustering, and homology querying. By significantly boosting computational efficiency on large datasets, including single amplified genomes, dna2bit facilitates a better understanding of the population heterogeneity and comparative genomics of microorganisms. dna2bit is available at https://github.com/lijuzeng/dna2bit.
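
The abstract names two ingredients, feature hashing and Hamming distance, without giving details. The sketch below is only an illustration of how the two can be combined in general and should not be read as dna2bit's actual implementation. The function names, the k-mer length `k`, the sketch width `n_bits`, the use of BLAKE2b as the hash function, and the sign-based binarization are all assumptions made for this example.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping k-mer of an upper-cased sequence."""
    seq = seq.upper()
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def sketch(seq, k=8, n_bits=256):
    """Signed feature hashing of k-mers, binarized into an n_bits-wide integer.

    Hypothetical parameters: k and n_bits are illustrative defaults only.
    """
    counters = [0] * n_bits
    for kmer in kmers(seq, k):
        digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        index = (h >> 1) % n_bits      # bucket chosen by the upper hash bits
        sign = 1 if h & 1 else -1      # lowest hash bit decides +1 or -1
        counters[index] += sign
    bits = 0
    for i, count in enumerate(counters):
        if count > 0:                  # keep only the sign of each counter
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of bit positions in which two sketches differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    genome_a = "ACGTTGCAAGGCTTAACCGGTTAGCTAGCTAGGATCCGATCGATCGGCTA" * 4
    genome_b = genome_a[:100] + "T" + genome_a[101:]  # one substitution
    print(hamming(sketch(genome_a), sketch(genome_b)))
```

In a scheme like this each genome is reduced to a fixed-width bit string, so comparing two genomes costs one XOR and a bit count regardless of genome length. Converting the resulting Hamming distance into an average nucleotide identity estimate would need a calibration step that this sketch does not attempt.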


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11701053
DOI: http://dx.doi.org/10.3389/fmicb.2024.1521181

