CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning.

Nat Methods

Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology, Translational Research Institute, Woolloongabba, Queensland, Australia.

Published: August 2023

Advances in sequencing technologies and bioinformatics tools have dramatically increased the recovery rate of microbial genomes from metagenomic data. Assessing the quality of metagenome-assembled genomes (MAGs) is a critical step before downstream analysis. Here, we present CheckM2, an improved method of predicting genome quality of MAGs using machine learning. Using synthetic and experimental data, we demonstrate that CheckM2 outperforms existing tools in both accuracy and computational speed. In addition, CheckM2's database can be rapidly updated with new high-quality reference genomes, including taxa represented only by a single genome. We also show that CheckM2 accurately predicts genome quality for MAGs from novel lineages, even for those with reduced genome size (for example, Patescibacteria and the DPANN superphylum). CheckM2 provides accurate genome quality predictions across bacterial and archaeal lineages, giving increased confidence when inferring biological conclusions from MAGs.

Source
http://dx.doi.org/10.1038/s41592-023-01940-w

Publication Analysis

Top Keywords

genome quality: 16
machine learning: 8
quality mags: 8
genome: 6
checkm2: 5
quality: 5
checkm2 rapid: 4
rapid scalable: 4
scalable accurate: 4
accurate tool: 4

Similar Publications

Development of a mitochondrial mini-barcode and its application in metabarcoding for identification of leech in traditional Chinese medicine.

Sci Rep

January 2025

National Key Laboratory of Lead Druggability Research, Shanghai Institute of Pharmaceutical Industry, State Institute of Pharmaceutical Industry, 201203, Shanghai, People's Republic of China.

In Traditional Chinese Medicine (TCM), the medicinal leech is vital for treatments to promote blood circulation and eliminate blood stasis. However, the prevalence of counterfeit leech products in the market undermines the quality and efficacy of these remedies. Traditional DNA barcoding techniques, such as the COI barcode, have been limited in their application due to amplification challenges.

Chromosome-level genome assembly, annotation, and population genomic resource of argali (Ovis ammon).

Sci Data

January 2025

Key Laboratory of Ecological Safety and Sustainable Development in Arid Lands, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi, 830011, China.

Argali stands as the largest species among wild sheep in Central and East Asia, with a concerning rate of decline estimated at 30%. The intraspecific taxonomy of argali remains contentious due to limited genomic data and unclear geographic separation. In this study, we constructed a chromosome-level genome assembly and annotation for the Tibetan argali (O.

Chromosome-scale genome assembly of three-spotted seahorse (Hippocampus trimaculatus) with a unique karyotype.

Sci Data

January 2025

Laboratory of Aquatic Genomics, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, 518057, China.

The three-spotted seahorse (Hippocampus trimaculatus) is a unique fish of considerable economic and medicinal value, and its total chromosome number may differ substantially from that of other seahorse species. Herein, we constructed a chromosome-level genome assembly for this seahorse by integrating MGI short-read, PacBio HiFi long-read and Hi-C sequencing techniques. A 416.

The Southern Ground Hornbill (SGH; Bucorvus leadbeateri) is one of the largest hornbill species worldwide, known for its complex social structures and breeding behaviours. This bird has attracted great interest because of its declining population and its disappearance from historic ranges in southern Africa. Although it is the focus of numerous conservation efforts, in which research plays an integral part, substantial gaps remain in knowledge of the species' molecular biology.

Objectives: To explore the presence or absence of virulence genes and the phylogeny of a multi-decade UK collection of clinical and reference Fusobacterium necrophorum isolates.

Methods: Three hundred and eighty-five F. necrophorum strains (1982-2019) were recovered from storage (-80°C).
