Legionnaires' disease, a severe lung infection caused by the bacterium Legionella pneumophila, occurs as single cases or in outbreaks that are actively tracked by public health departments. To determine the point source of an outbreak, clinical isolates need to be compared to environmental samples to find matching isolates. One confounding factor is the genome plasticity of L. pneumophila, which makes an exact sequence comparison by whole-genome sequencing (WGS) challenging. Here, we present a WGS analysis pipeline, LegioCluster, that is designed to circumvent this problem by automatically selecting the best matching reference genome prior to mapping and variant calling. This approach reduces the number of false-positive variant calls, maximizes the fraction of all genomes that are being compared, and naturally clusters the isolates according to their reference strain. Isolates that are too distant from any genome in the database are added to the list of candidate references, thereby creating a new cluster. Short insertions or deletions are considered in addition to single-nucleotide polymorphisms for increased discriminatory power. This manuscript describes the use of this automated and "locked down" bioinformatic pipeline deployed at the New York State Department of Health's Wadsworth Center for investigating relatedness between clinical and environmental isolates. A similar pipeline has not been widely available to support these critically important public health investigations.
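The reference-selection step described above can be illustrated with a minimal sketch. It is not LegioCluster's code: the k-mer Jaccard distance, the function names, the threshold `max_distance=0.5`, and the toy sequences are all assumptions chosen for illustration, and the abstract does not state which distance measure or cutoff the pipeline uses. The sketch only shows the logic of picking the closest candidate reference and promoting an isolate to a new reference (and hence a new cluster) when nothing in the database is close enough.

```python
from typing import Dict, Tuple


def kmer_set(seq: str, k: int = 4) -> frozenset:
    """Return the set of k-mers in a sequence (toy stand-in for a genome sketch)."""
    return frozenset(seq[i:i + k] for i in range(len(seq) - k + 1))


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """1 minus the Jaccard similarity of two k-mer sets; 0.0 means identical sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def assign_reference(
    name: str,
    seq: str,
    references: Dict[str, str],
    max_distance: float = 0.5,  # hypothetical cutoff, not taken from the paper
    k: int = 4,
) -> Tuple[str, bool]:
    """Pick the closest reference, or register the isolate as a new reference.

    Returns (reference_name, created_new_cluster). The `references` dict is
    updated in place when the isolate founds a new cluster.
    """
    query = kmer_set(seq, k)
    best_name, best_dist = None, float("inf")
    for ref_name, ref_seq in references.items():
        dist = jaccard_distance(query, kmer_set(ref_seq, k))
        if dist < best_dist:
            best_name, best_dist = ref_name, dist

    # Too distant from every candidate (or empty database): found a new cluster.
    if best_name is None or best_dist > max_distance:
        references[name] = seq
        return name, True

    # Otherwise the isolate would be mapped against its closest reference.
    return best_name, False


# Toy usage: one close isolate and one unrelated isolate.
refs = {"ref_A": "ACGTACGTACGTTTGACC"}
close_hit = assign_reference("iso_1", "ACGTACGTACGTTTGACA", refs)
distant_hit = assign_reference("iso_2", "GGGGCCCCGGGGCCCCAAAA", refs)
```

In this toy run `iso_1` is assigned to `ref_A`, while `iso_2` shares no k-mers with it and is added to `refs` as a new candidate reference. Mapping each isolate against its closest reference is the point of the design: fewer spurious differences arise than when every isolate is forced onto a single, possibly distant, reference.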
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8111141 | PMC |
http://dx.doi.org/10.1128/JCM.00967-20 | DOI Listing |