AI Article Synopsis

  • Genomic analyses of pathogenic fungi often lack accuracy controls, making it vital to establish reliable methods.
  • A comparison of 14 variant calling pipelines across 35 fungal isolates showed high overall agreement in SNP detection; the differences that did emerge were traced to read trimming strategies, SNP calling methods and parameters, and downstream filtering criteria.
  • The research produced two truth datasets to enhance future benchmarking of variant calling practices, enabling more consistent results in tracking fungal outbreaks globally.

Article Abstract

Genomic analyses are widely applied to epidemiological, population genetic and experimental studies of pathogenic fungi. A wide range of methods are employed to carry out these analyses, typically without including controls that gauge the accuracy of variant prediction. The importance of tracking outbreaks at a global scale has raised the urgency of establishing high-accuracy pipelines that generate consistent results between research groups. To evaluate currently employed methods for whole-genome variant detection and elaborate best practices for fungal pathogens, we compared how 14 independent variant calling pipelines performed across 35 isolates from 4 distinct clades and evaluated the performance of variant calling, single-nucleotide polymorphism (SNP) counts and phylogenetic inference results. Although these pipelines used different variant callers and filtering criteria, we found high overall agreement of SNPs from each pipeline. This concordance correlated with site quality, as SNPs discovered by only a few pipelines tended to show lower mapping quality scores and depth of coverage than those recovered by all pipelines. We observed that the major differences between pipelines were due to variation in read trimming strategies, SNP calling methods and parameters, and downstream filtration criteria. We calculated specificity and sensitivity for each pipeline by aligning three isolates with chromosome-level assemblies and found that the GATK-based pipelines were well balanced between these metrics. Selection of trimming methods had a greater impact on SAMtools-based pipelines than on those using GATK. Phylogenetic trees inferred by each pipeline showed high consistency at the clade level, but there was more variability between isolates from a single outbreak, with pipelines that used more stringent cutoffs having lower resolution.
This project generated two truth datasets useful for routine benchmarking of variant calling: a consensus VCF of genotypes discovered by 10 or more pipelines across these 35 diverse isolates, and variants for 2 samples identified from whole-genome alignments. This study provides a foundation for evaluating SNP calling pipelines and developing best practices for future fungal genomic studies.
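
The consensus rule and the accuracy metrics described in the abstract can be illustrated with a small Python sketch. It is not the authors' code: the site representation (chromosome, position, alternate allele), the function names, the toy call sets and the `total_sites` parameter (the number of callable positions, needed to count true negatives) are all assumptions made for illustration. The abstract's threshold of 10 or more pipelines is used as the default `min_support`.

```python
from collections import Counter


def consensus_sites(pipeline_calls, min_support=10):
    """Return the sites reported by at least `min_support` pipelines.

    `pipeline_calls` maps a pipeline name to a set of hashable sites,
    here (chromosome, position, alternate allele) tuples (an assumption).
    """
    counts = Counter()
    for calls in pipeline_calls.values():
        counts.update(set(calls))  # count each site once per pipeline
    return {site for site, n in counts.items() if n >= min_support}


def sensitivity_specificity(called, truth, total_sites):
    """Compare one pipeline's calls against a truth set.

    `total_sites` is the assumed number of callable positions; every
    position that is neither called nor in the truth set is a true negative.
    """
    tp = len(called & truth)   # called and true
    fp = len(called - truth)   # called but not true
    fn = len(truth - called)   # true but missed
    tn = total_sites - tp - fp - fn
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data: three hypothetical pipelines, so the support threshold is 2.
pipeline_calls = {
    "pipeline_a": {("chr1", 100, "A"), ("chr1", 250, "T"), ("chr2", 40, "G")},
    "pipeline_b": {("chr1", 100, "A"), ("chr1", 250, "T")},
    "pipeline_c": {("chr1", 100, "A"), ("chr2", 40, "G"), ("chr2", 90, "C")},
}
consensus = consensus_sites(pipeline_calls, min_support=2)
sens, spec = sensitivity_specificity(
    pipeline_calls["pipeline_c"], consensus, total_sites=1000
)
print(sorted(consensus))
print(f"sensitivity={sens:.3f} specificity={spec:.4f}")
```

In practice the per-pipeline sets would be parsed from VCF files, and the truth set for the sensitivity and specificity calculation would come from whole-genome alignments rather than from the consensus itself, as in the toy example.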

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10210944
DOI: http://dx.doi.org/10.1099/mgen.0.000979

Publication Analysis

Top Keywords

variant calling: 12
pipelines: 11
best practices: 8
calling pipelines: 8
discovered pipelines: 8
snp calling: 8
variant: 7
calling: 5
comparing genomic: 4
genomic variant: 4

Similar Publications

The increasingly widespread application of next-generation sequencing (NGS) in clinical diagnostics and epidemiological research has generated a demand for robust, fast, automated, and user-friendly bioinformatics workflows. To guide the choice of tools for the assembly of full-length viral genomes from NGS datasets, we assessed the performance and applicability of four open-source bioinformatics pipelines (shiver, for which we created a user-friendly Dockerized version referred to as dshiver; SmaltAlign; viral-ngs; and V-pipe) using both simulated and real-world HIV-1 paired-end short-read datasets and default settings. All four pipelines produced consensus genome assemblies with high quality metrics (genome fraction recovery, mismatch and indel rates, variant calling F1 scores) when the reference sequence used for assembly had high similarity to the analyzed sample.

: is a globally emerging pathogen with widespread antimicrobial resistance driven by multiple mechanisms, such as altered expression of efflux pumps like AdeABC, placing it as a priority for research. Driven by the lack of new treatments, alternative approaches are being explored to combat its infections, among which efficacy-enhancing adjuvants can be found. This study presents and characterizes MV6, a synthetic cyclic peptide that boosts aminoglycoside efficacy.

Somatic mutations in individual cells lead to genomic mosaicism, contributing to the intricate regulatory landscape of genetic disorders and cancers. To evaluate and refine the detection of somatic mosaicism across different technologies with personalized donor-specific assembly (DSA), we obtained tissue from the dorsolateral prefrontal cortex (DLPFC) of a post-mortem neurotypical 31-year-old individual. We sequenced bulk DLPFC tissue using Oxford Nanopore Technologies (∼60X), NovaSeq (∼30X), and linked-read sequencing (∼28X).

Background: Variants in the mitochondrial genome (mtDNA) cause a diverse collection of mitochondrial diseases and have extensive phenotypic overlap with Mendelian diseases encoded on the nuclear genome. The mtDNA is often not specifically evaluated in patients with suspected Mendelian disease, resulting in overlooked diagnostic variants.

Methods: Using dedicated pipelines to address the technical challenges posed by the mtDNA - circular genome, variant heteroplasmy, and nuclear misalignment - single nucleotide variants, small indels, and large mtDNA deletions were called from exome and genome sequencing data, in addition to RNA-sequencing when available.

In studies of individuals of primarily European genetic ancestry, common and low-frequency variants and rare coding variants have been found to be associated with the risk of bipolar disorder (BD) and schizophrenia (SZ). However, less is known for individuals of other genetic ancestries or the role of rare non-coding variants in BD and SZ risk. We performed whole genome sequencing of African American individuals: 1,598 with BD, 3,295 with SZ, and 2,651 unaffected controls (InPSYght study).
