Routine and systematic use of bacterial whole-genome sequencing (WGS) is enhancing the accuracy and resolution of epidemiological investigations carried out by public health laboratories and regulatory agencies. Large volumes of publicly available WGS data can be used to study pathogenic populations at scale. Recently, a freely available computational platform called ProkEvo was published to enable reproducible, automated, and scalable hierarchical population genomic analyses using bacterial WGS data. This implementation of ProkEvo demonstrated the importance of combining standard genotypic mapping of populations with mining of accessory genomic content for ecological inference. In particular, the work highlighted here used ProkEvo-derived outputs for population-scale hierarchical analyses in the R programming language. The main objective was to provide a practical guide for microbiologists, ecologists, and epidemiologists by showing how to: i) use a phylogeny-guided mapping of hierarchical genotypes; ii) assess frequency distributions of genotypes as a proxy for ecological fitness; iii) determine kinship relationships and genetic diversity using specific genotypic classifications; and iv) map lineage-differentiating accessory loci. To enhance reproducibility and portability, R Markdown files were used to demonstrate the entire analytical approach. The example dataset contained genomic data from 2,365 isolates of the zoonotic foodborne pathogen Salmonella Newport. Phylogeny-anchored mapping of hierarchical genotypes (Serovar -> BAPS1 -> ST -> cgMLST) revealed the population genetic structure, highlighting sequence types (STs) as the keystone differentiating genotype. Among the three most dominant lineages, ST5 and ST118 shared a more recent common ancestor with each other than either did with the highly clonal ST45 phylotype. ST-based differences were further highlighted by the distribution of accessory antimicrobial resistance (AMR) loci. Lastly, a phylogeny-anchored visualization was used to combine hierarchical genotypes and AMR content to reveal the kinship structure and lineage-specific genomic signatures. Combined, this analytical approach provides guidelines for conducting heuristic bacterial population genomic analyses using pan-genomic information.
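Objectives (ii) and (iii) can be illustrated with a minimal sketch. The published workflow uses R Markdown; the Python below is not taken from it. The isolate records are hypothetical, and Simpson's index (1 minus the sum of squared relative frequencies) is assumed here as one common diversity measure, since the abstract does not say which metric was used.

```python
from collections import Counter, defaultdict

# Hypothetical isolate records as (ST, cgMLST variant) pairs.
# These values are illustrative only and are not from the study dataset.
isolates = [
    ("ST118", "cg1"), ("ST118", "cg2"), ("ST118", "cg3"), ("ST118", "cg4"),
    ("ST45", "cg10"), ("ST45", "cg10"), ("ST45", "cg10"), ("ST45", "cg11"),
    ("ST5", "cg20"), ("ST5", "cg21"),
]


def relative_frequencies(labels):
    """Return the proportion of observations carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def simpson_diversity(labels):
    """Simpson's index: 1 - sum(p_i^2); 0 means a single variant (fully clonal)."""
    return 1.0 - sum(p * p for p in relative_frequencies(labels).values())


# (ii) Frequency distribution of STs across the whole population.
st_freq = relative_frequencies(st for st, _ in isolates)

# (iii) Diversity of cgMLST variants nested within each ST.
variants_by_st = defaultdict(list)
for st, variant in isolates:
    variants_by_st[st].append(variant)
diversity_by_st = {st: simpson_diversity(v) for st, v in variants_by_st.items()}

# Report STs from most to least frequent.
for st in sorted(st_freq, key=st_freq.get, reverse=True):
    print(f"{st}: frequency={st_freq[st]:.2f}, "
          f"cgMLST diversity={diversity_by_st[st]:.3f}")
```

In this toy input, the ST with one dominant cgMLST variant gets the lowest diversity value, which is the pattern expected of a highly clonal lineage.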
DOI: http://dx.doi.org/10.3791/63115