skDER: microbial genome dereplication approaches for comparative and metagenomic applications.

bioRxiv

Department of Medical Microbiology and Immunology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, USA.

Published: November 2023

AI Article Synopsis

  • skDER pairs skani's efficient estimation of Average Nucleotide Identity (ANI) among thousands of microbial genomes with two low-memory dereplication methods.
  • The first is a dynamic algorithm that selects a concise set of representative genomes, suited to tracking strains across microbiome samples; the second is a greedy method that keeps a larger set of representatives, reducing redundancy and computational cost for comparative genomics.
  • Additionally, skDER allows users to automatically download genomes of specific species or genera from the Genome Taxonomy Database and provides precomputed representative genomes for commonly studied bacterial groups.

Article Abstract

skDER (https://github.com/raufs/skDER) combines recent advances in efficiently estimating average nucleotide identity (ANI) between thousands of microbial genomes, made by skani, with two low-memory methods for genomic dereplication. The first method implements a dynamic algorithm to determine a concise set of representative genomes. This approach is well suited for selecting reference genomes against which metagenomic reads are aligned to track strain presence across related microbiome samples, because fewer representative genomes should alleviate the concern that reads belonging to the same strain get falsely partitioned across closely related genomes. The second method, which uses a greedy approach, is better suited for comparative genomics, where users might be overwhelmed by the high number of genomes available for certain taxa and aim to reduce redundancy and, therefore, the computational requirements of downstream analyses. This method selects a larger number of representative genomes to comprehensively sample the pangenome space of the taxon of interest. To further aid comparative genomics studies, skDER also features an option to automatically download genomes classified as a particular species or genus in the Genome Taxonomy Database, and we provide precomputed representative genomes for commonly studied bacterial taxa.
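The contrast between a concise and a larger representative set can be illustrated with a minimal Python sketch. It is not taken from the skDER source code: the function names, the per-genome quality score (for example, assembly N50), the 99% ANI cutoff, and the selection rules are all assumptions chosen for illustration. In the greedy variant a genome is compared only against representatives already selected; in the concise variant it is dropped if it is similar to any better-scoring genome, whether or not that genome was itself kept.

```python
def greedy_representatives(scores, ani, threshold=99.0):
    """Keep a genome unless it is similar to an already selected representative.

    scores: dict genome -> hypothetical quality score (higher is better).
    ani: dict frozenset({a, b}) -> ANI in percent; missing pairs count as dissimilar.
    """
    reps = []
    # Visit genomes from best to worst score; ties are broken by name.
    for genome in sorted(scores, key=lambda g: (-scores[g], g)):
        if all(ani.get(frozenset((genome, r)), 0.0) < threshold for r in reps):
            reps.append(genome)
    return reps


def concise_representatives(scores, ani, threshold=99.0):
    """Keep a genome only if it is dissimilar to every better-scoring genome."""
    order = sorted(scores, key=lambda g: (-scores[g], g))
    reps = []
    for i, genome in enumerate(order):
        better = order[:i]  # includes genomes that were themselves dropped
        if all(ani.get(frozenset((genome, b)), 0.0) < threshold for b in better):
            reps.append(genome)
    return reps


# Toy data (made up): A~B and B~C exceed the cutoff, A~C does not; D is unrelated.
scores = {"A": 3.0, "B": 2.0, "C": 1.0, "D": 0.5}
ani = {
    frozenset(("A", "B")): 99.5,
    frozenset(("B", "C")): 99.4,
    frozenset(("A", "C")): 98.7,
}

greedy = greedy_representatives(scores, ani)
concise = concise_representatives(scores, ani)
print(greedy)   # ['A', 'C', 'D']
print(concise)  # ['A', 'D']
```

On this toy input the greedy variant retains C because C is not similar to the selected representative A, whereas the concise variant drops C because it is similar to the better-scoring B. This mirrors the trade-off described in the abstract: fewer representatives for read alignment, more for sampling the pangenome.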


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10690176
DOI: http://dx.doi.org/10.1101/2023.09.27.559801


