Motivation: The k-mer frequency in whole genome sequences provides researchers with an insightful perspective on genomic complexity, comparative genomics, metagenomics and phylogeny. Current k-mer counting tools are typically slow, and they require large amounts of memory and disk space for assembled genome analysis.
Results: We propose a novel and ultra-fast k-mer counting algorithm, KCOSS, which counts k-mers mainly for assembled genomes using a segmented Bloom filter, a lock-free queue, a lock-free thread pool and a cuckoo hash table. We optimize running time and memory consumption by recycling memory blocks, merging multiple consecutive first-occurrence k-mers into a C-read, and writing sets of C-reads to disk asynchronously. KCOSS was compared with Jellyfish2, CHTKC and KMC3 on seven assembled genomes and three sequencing datasets in terms of running time, memory consumption, and hard disk occupation. The experimental results show that KCOSS counts k-mers with less memory and disk space while having a shorter running time on assembled genomes. KCOSS can be used to calculate the k-mer frequency not only for assembled genomes but also for sequencing data.
Availability and Implementation: The KCOSS software is implemented in C++. It is freely available on GitHub: https://github.com/kcoss-2021/KCOSS.
Supplementary Information: Supplementary data are available at Bioinformatics online.
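The general idea of placing a Bloom filter in front of a hash table, mentioned in the Results, can be illustrated with a minimal single-threaded sketch. This is not the KCOSS implementation: the names `BloomFilter` and `count_repeated_kmers` are hypothetical, a plain (unsegmented) Bloom filter stands in for the segmented one, a Python `dict` stands in for the cuckoo hash table, and the lock-free queue, thread pool and C-read merging are omitted. K-mers seen only once stay in the Bloom filter alone, so only repeated k-mers occupy table memory.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter (assumption: stands in for a segmented one)."""

    def __init__(self, num_bits=1 << 20, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.blake2b(item.encode(), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        """Insert item; return True if it was (probably) already present."""
        seen = True
        for pos in self._positions(item):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                seen = False
                self.bits[byte] |= 1 << bit
        return seen


def count_repeated_kmers(sequence, k, bloom=None):
    """Return counts for k-mers that occur at least twice.

    First occurrences are recorded only in the Bloom filter. A Bloom
    filter false positive can inflate a count by one, so the result
    is approximate in general.
    """
    if bloom is None:
        bloom = BloomFilter()
    repeated = {}  # stand-in for a cuckoo hash table
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if bloom.add(kmer):
            # Second or later occurrence: the first one counted as 1.
            repeated[kmer] = repeated.get(kmer, 1) + 1
    return repeated


if __name__ == "__main__":
    print(count_repeated_kmers("ACGTACGTAC", 4))
```

In assembled genomes most k-mers occur exactly once, which is why keeping first occurrences out of the table is expected to save memory; the trade-off is that singleton k-mers are not stored explicitly in this sketch.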
DOI: http://dx.doi.org/10.1093/bioinformatics/btab797
BMC Biol
January 2025
Faculty of Sciences, Centre for Environmental Sciences, Research Group Zoology: Biodiversity and Toxicology, UHasselt - Hasselt University, Diepenbeek, Belgium.
Background: Stress responses are key to the survival of parasites and, consequently, also to the evolutionary success of these organisms. Despite this importance, our understanding of the evolution of molecular pathways dealing with environmental stressors in parasitic animals remains limited. Here, we tested the link between adaptive evolution of parasite stress response genes and their ecological diversity and species richness.
Virology
January 2025
Department of Poultry Diseases, Faculty of Veterinary Medicine, University of Warmia and Mazury in Olsztyn, Poland.
Adenoviruses are a concern for pigeon breeders due to their impact on animal health. They have been studied for nearly five decades and are among the most studied viruses in pigeons. However, very few complete genomic sequences of pigeon-infecting adenoviruses are available, and the pathogenic effect of these viruses on pigeons has yet to be thoroughly explored.
Int J Biol Macromol
January 2025
State Key Laboratory of North China Crop Improvement and Regulation, Hebei Agricultural University, Baoding 071000, China; Key Laboratory of Vegetable Germplasm Innovation and Utilization of Hebei, Ministry of Education of China-Hebei Province Joint Innovation Center for Efficient Green Vegetable Industry, College of Horticulture, Hebei Agricultural University, Baoding 071000, China; Division of Plant Sciences, Research School of Biology, Australian National University, Canberra, ACT 2601, Australia. Electronic address:
Fusarium oxysporum f. sp. lycopersici (Fol), the causal agent of tomato wilt disease, is a soil-borne, vascular-colonizing fungal pathogen that severely impacts tomato production in most growing regions worldwide.
Sci Data
January 2025
Key Laboratory of Ecological Safety and Sustainable Development in Arid Lands, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi, 830011, China.
Argali stands as the largest species among wild sheep in Central and East Asia, with a concerning rate of decline estimated at 30%. The intraspecific taxonomy of argali remains contentious due to limited genomic data and unclear geographic separation. In this study, we constructed a chromosome-level genome assembly and annotation for the Tibetan argali (O.
Sci Data
January 2025
Laboratory of Aquatic Genomics, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, 518057, China.
The three-spotted seahorse (Hippocampus trimaculatus) is a unique fish with important economic and medicinal value, and its total chromosome number is potentially quite different from that of other seahorse species. Herein, we constructed a chromosome-level genome assembly for this special seahorse by integrating MGI short-read, PacBio HiFi long-read and Hi-C sequencing techniques. A 416.