Abstract: Thousands of complete genome sequences are now available for strains of a single species, enabling pangenome analytics to advance to a new level of sophistication. We collected 2,377 publicly available complete genomes of Escherichia coli for detailed pangenome analysis. The core and accessory genomes consisted of 2,398 and 5,182 genes, respectively. We developed a machine learning approach to define the accessory genes characterizing the major phylogroups of E. coli plus Shigella: A, B1, B2, C, D, E, F, G, and Shigella. The analysis resulted in a detailed structure of the genetic basis of the phylogroups' differential traits. This pangenome structure was largely consistent with a housekeeping-gene-based MLST distribution, sequence-based Mash distance, and the Clermont quadruplex classification. The rare genome (genes found in <6.8% of all strains) consisted of 163,619 genes, about 79% of which represented variations of 315 underlying transposon elements. This analysis generated a mathematical definition of the genetic basis for a species.
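The core, accessory, and rare partitions described above can be illustrated with a frequency cut on a strain-by-gene presence/absence matrix. The following is a minimal sketch, not the study's method: the function name `partition_pangenome` and the `core_min` threshold of 0.99 are assumptions made for illustration, and only the 6.8% rare-genome cutoff comes from the abstract.

```python
import numpy as np

def partition_pangenome(presence, rare_max=0.068, core_min=0.99):
    """Split genes into core, accessory, and rare sets by strain frequency.

    presence : strains x genes array of 0/1 (1 = gene present in strain).
    rare_max : genes found in fewer than this fraction of strains are rare
               (6.8% is the cutoff quoted in the abstract).
    core_min : genes found in at least this fraction of strains are core
               (assumed value; the abstract does not state a core cutoff).
    """
    presence = np.asarray(presence, dtype=bool)
    # Fraction of strains carrying each gene (one value per column).
    freq = presence.mean(axis=0)
    core = freq >= core_min
    rare = freq < rare_max
    accessory = ~(core | rare)
    return {
        "core": np.flatnonzero(core),
        "accessory": np.flatnonzero(accessory),
        "rare": np.flatnonzero(rare),
    }

# Toy matrix: 20 strains, 3 genes.
toy = np.zeros((20, 3), dtype=int)
toy[:, 0] = 1    # gene 0 present in every strain
toy[:10, 1] = 1  # gene 1 present in half of the strains
toy[0, 2] = 1    # gene 2 present in a single strain (5%)

groups = partition_pangenome(toy)
```

With this toy matrix, gene 0 falls in the core set, gene 1 in the accessory set, and gene 2 in the rare set. A real analysis would start from gene clusters called across all genomes, and the cutoffs would need to be justified from the gene frequency distribution rather than fixed in advance.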

Importance: The comprehensive analysis of the Escherichia coli pangenome presented in this study marks a significant advance in understanding bacterial genetic diversity. By employing machine learning techniques to analyze 2,377 complete genomes, the study provides a detailed mapping of core, accessory, and rare genes. This approach reveals the genetic basis for differential traits across phylogroups, offering insights into pathogenicity, antibiotic resistance, and evolutionary adaptations. The findings enhance the potential for genome-based diagnostics and pave the way for future studies aimed at achieving a global genetic definition of bacterial phylogeny.

Source
http://dx.doi.org/10.1128/msphere.00532-24

Similar Publications

Introduction: Accurate genotyping of Killer cell Immunoglobulin-like Receptor (KIR) genes plays a pivotal role in enhancing our understanding of innate immune responses, disease correlations, and the advancement of personalized medicine. However, due to the high variability of the KIR region and high level of sequence similarity among different KIR genes, the generic genotyping workflows are unable to accurately infer copy numbers and complete genotypes of individual KIR genes from next-generation sequencing data. Thus, specialized genotyping tools are needed to genotype this complex region.

Comparative genomic analysis of an emerging Pseudomonadaceae member, .

Microbiol Spectr

August 2024

Department of Microbiology, University of Tennessee, Knoxville, Tennessee, USA.

The organism, recently classified within the Pseudomonadaceae family, has been detected in diverse sources such as human tissues, animal guts, industrial fermenters, and decomposition environments, suggesting a diverse ecological role. However, a large knowledge gap exists in how it functions. This comparative genomic analysis reveals adaptations indicative of habitat specificity among strains and genomic similarity to known opportunistic pathogens.

Accurate genotyping of Killer cell Immunoglobulin-like Receptor (KIR) genes plays a pivotal role in enhancing our understanding of innate immune responses, disease correlations, and the advancement of personalized medicine. However, due to the high variability of the KIR region and high level of sequence similarity among different KIR genes, the currently available genotyping methods are unable to accurately infer copy numbers, genotypes and haplotypes of individual KIR genes from next-generation sequencing data. Here we introduce Geny, a new computational tool for precise genotyping of KIR genes.

Salt marshes are known for their significant carbon storage capacity, and sulfur cycling is closely linked with ecosystem-scale carbon cycling in these environments. Sulfate reducers are key for the decomposition of organic matter, and sulfur oxidizers remove toxic sulfide, supporting the productivity of marsh plants. To date, the complexity of coastal environments, the heterogeneity of the rhizosphere, high microbial diversity, and the uncultured majority have hindered our understanding of the genomic diversity of sulfur-cycling microbes in salt marshes.
