AI Article Synopsis

  • The updated EzBioCloud platform supports the classification and identification of Bacteria and Archaea through a taxonomic hierarchy built from quality-controlled 16S rRNA gene and genome sequences.
  • The new version adds a strategy for placing underrepresented or less studied lineages and retains strict screening of whole-genome assemblies from the NCBI Assembly Database to remove low-quality sequence data.
  • EzBioCloud now covers 109,835 species, placed using 16S rRNA gene similarity, average nucleotide identity (ANI), and, where neither applies, core gene ANI (cgANI).

Article Abstract

As DNA sequencing technologies continue to evolve, genome sequence data have become more integral to the classification and identification of Bacteria and Archaea. Six years after introducing EzBioCloud, an integrated platform representing the taxonomic hierarchy of Bacteria and Archaea through quality-controlled 16S rRNA gene and genome sequences, we present an updated version that further refines and expands its capabilities. The update responds to the growing need for accurate taxonomic information now that species definitions increasingly rely on genome sequence comparisons. We also incorporated an advanced strategy for addressing underrepresented or less studied lineages, which improves the comprehensiveness and accuracy of the database. Our rigorous quality control protocols remain in place: whole-genome assemblies from the NCBI Assembly Database undergo stringent screening to remove low-quality sequence data. The assemblies that pass are then run through our enhanced identification pipeline, which starts with a 16S rRNA gene similarity search and then calculates the average nucleotide identity (ANI). For genome sequences that lack a 16S rRNA sequence and have no closely related genomic representative for ANI calculation, we apply a different ANI approach based on bacterial core genes to improve taxonomic placement (core gene ANI, cgANI). Because of the increase in genome sequences available in NCBI and our newly introduced cgANI method, EzBioCloud now encompasses a total of 109 835 species. Of these, 21 964 have validly published names, 47 896 are candidate species identified either through 16S rRNA sequence similarity (phylotypes) or through whole-genome ANI (genomospecies), and the remaining 39 975 were positioned in the taxonomic tree by cgANI (species clusters). The EzBioCloud database is accessible at www.ezbiocloud.net/db.
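
The routing described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the EzBioCloud pipeline itself: the `Assembly` class, its field names, the `placement_route` function, and the route labels are all hypothetical, and the abstract does not say how a genome with only one of the two signals is handled, so the sketch simply sends every such case down the standard route. The species counts are taken directly from the abstract and summed as a consistency check.

```python
from dataclasses import dataclass


@dataclass
class Assembly:
    """Hypothetical record for one NCBI assembly (field names are assumptions)."""
    accession: str
    passed_qc: bool          # survived the low-quality screening step
    has_16s: bool            # a 16S rRNA gene sequence was recovered
    has_close_genome: bool   # a closely related genome exists for ANI


def placement_route(assembly: Assembly) -> str:
    """Return which placement route the abstract's description implies."""
    if not assembly.passed_qc:
        # Low-quality assemblies are removed before identification.
        return "rejected"
    if not assembly.has_16s and not assembly.has_close_genome:
        # No 16S rRNA sequence and no close representative: core gene ANI.
        return "cgANI"
    # Otherwise: 16S rRNA similarity search followed by ANI calculation.
    # Assumption: partial cases also take this route.
    return "16S+ANI"


# Species counts as reported in the abstract.
SPECIES_COUNTS = {
    "validly published names": 21_964,
    "phylotypes and genomospecies": 47_896,
    "cgANI species clusters": 39_975,
}
TOTAL_SPECIES = sum(SPECIES_COUNTS.values())

if __name__ == "__main__":
    example = Assembly("EXAMPLE_0001", passed_qc=True,
                       has_16s=False, has_close_genome=False)
    print(placement_route(example))
    print(TOTAL_SPECIES)
```

The three categories add up to the reported total of 109 835 species, which confirms that the breakdown in the abstract is exhaustive.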

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11261700
DOI: http://dx.doi.org/10.1099/ijsem.0.006421
