Bacterial genomes vary substantially in gene content and sequence identity. Pangenome analyses explore this diversity by classifying genes into core and accessory clusters of orthologous groups (COGs). However, strict sequence identity cutoffs can misclassify divergent alleles of the same gene as different genes, inflating accessory gene counts. CLARC (Connected Linkage and Alignment Redefinition of COGs; https://github.com/IndraGonz/CLARC) improves pangenome analyses by condensing accessory COGs using functional annotation and linkage information. This consolidates orthologous groups into more practical units of selection. Applied to more than 8,000 genomes, CLARC reduced accessory gene estimates by more than 30% and improved evolutionary predictions based on accessory gene frequencies. By refining COG definitions, CLARC offers insight into bacterial evolution and supports genetic studies across diverse populations.
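
The idea of condensing accessory COGs can be illustrated with a minimal sketch. It is not CLARC's implementation: the frequency thresholds (5% and 95%), the rule that two COGs are merged when they share a functional annotation and never occur in the same genome, and all names and toy data below are assumptions made for illustration only.

```python
def accessory_cogs(presence, n_genomes, lo=0.05, hi=0.95):
    """Return COGs whose population frequency lies in [lo, hi].

    The thresholds are assumed values for this sketch, not CLARC's.
    """
    return {c for c, g in presence.items() if lo <= len(g) / n_genomes <= hi}


def condense(presence, annotation, n_genomes):
    """Greedily merge accessory COGs that look like alleles of one gene.

    Assumed merge rule: same functional annotation AND no genome carries
    both (mutually exclusive presence, a crude stand-in for linkage).
    """
    groups = []
    for cog in sorted(accessory_cogs(presence, n_genomes)):
        func = annotation.get(cog)
        for group in groups:
            if (func is not None
                    and group["function"] == func
                    and group["genomes"].isdisjoint(presence[cog])):
                group["members"].append(cog)
                group["genomes"] |= presence[cog]
                break
        else:
            # No compatible group found: start a new one (copy the set
            # so the input dictionary is never mutated).
            groups.append({"members": [cog],
                           "genomes": set(presence[cog]),
                           "function": func})
    return groups


# Hypothetical toy data: 10 genomes, COG -> set of genome indices.
N_GENOMES = 10
presence = {
    "cogA": {0, 1, 2, 3},
    "cogB": {4, 5, 6},
    "cogC": {0, 4, 7},          # co-occurs with cogA and cogB
    "cogD": set(range(10)),     # present everywhere: core, not accessory
    "cogE": {8, 9},
}
annotation = {"cogA": "transporter", "cogB": "transporter",
              "cogC": "transporter", "cogD": "polymerase"}  # cogE unannotated

before = accessory_cogs(presence, N_GENOMES)
after = condense(presence, annotation, N_GENOMES)
print(len(before), "accessory COGs before,", len(after), "after condensing")
```

In this toy case, cogA and cogB are merged because they share an annotation and never co-occur, while cogC stays separate because it appears alongside both. A strict identity cutoff would have counted all three as distinct accessory genes.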

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11702680
DOI: http://dx.doi.org/10.1101/2024.12.18.629228
