Evaluation of Long-Read Sequencing Simulators to Assess Real-World Applications for Food Safety.

Katrina L Counihan, Siddhartha Kanrar, Shannon Tilman, Andrew Gehring

Foods

Eastern Regional Research Center, United States Department of Agriculture, Agricultural Research Service, Wyndmoor, PA 19038, USA.

Published: December 2023

Shiga toxin-producing E. coli (STEC) are major contributors to severe foodborne illness in the U.S., and current identification methods take several days.
Long-read, whole genome sequencing could speed up pathogen detection, but method development is costly, so NanoSim-H software was used to simulate Oxford Nanopore reads and guide experimental design.
The simulations showed that at least 2,500 reads are needed to identify the STEC genes of interest, suggesting that sample enrichment is necessary for long-read sequencing, which would still be faster than traditional methods.

Shiga toxin-producing Escherichia coli (STEC) and Listeria monocytogenes are routinely responsible for severe foodborne illnesses in the United States. Current identification methods used by the U.S. Food Safety and Inspection Service require at least four days to identify STEC and six days for L. monocytogenes. Adoption of long-read, whole genome sequencing for food safety testing could significantly reduce the time needed for identification, but method development costs are high. Therefore, the goal of this project was to use NanoSim-H software to simulate Oxford Nanopore sequencing reads to assess the feasibility of sequencing-based foodborne pathogen detection and guide experimental design. Sequencing reads were simulated for STEC, L. monocytogenes, and a 1:1 combination of STEC and Bos taurus genomes using NanoSim-H. At least 2,500 simulated reads were needed to identify the seven genes of interest targeted in STEC, and at least 500 reads were needed to detect the gene targeted in L. monocytogenes. Genome coverage of 30x was estimated at 21,521 and 11,802 reads for STEC and L. monocytogenes, respectively. Approximately 5-6% of reads simulated from both bacteria did not align with their respective reference genomes because of the errors introduced during simulation. For the STEC and B. taurus 1:1 genome mixture, all genes of interest were detected with 1,000,000 reads, but less than 1x coverage was obtained. The results suggested that sample enrichment would be necessary to detect foodborne pathogens with long-read sequencing, but that this approach would still take less time than current methods. Additionally, simulation data will be useful for reducing the time and expense associated with laboratory experimentation.
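The read counts quoted in the abstract can be related to each other with simple coverage arithmetic. The sketch below is illustrative only and is not the authors' method: it assumes that mean coverage scales linearly with the number of reads (coverage ≈ reads × mean read length / genome size), and the function names are hypothetical. Only the read counts 21,521, 11,802, 2,500, and 500 and the 30x figure come from the abstract; no genome sizes or read lengths are given there, so the second function takes them as caller-supplied assumptions.

```python
import math


def coverage_at(n_reads, ref_reads, ref_coverage=30.0):
    """Scale a known (reads, coverage) pair linearly to another read count.

    Assumption: the read-length distribution is unchanged, so mean coverage
    is proportional to the number of reads.
    """
    if n_reads < 0 or ref_reads <= 0:
        raise ValueError("read counts must be positive")
    return ref_coverage * n_reads / ref_reads


def reads_for_coverage(genome_size_bp, mean_read_length_bp, target_coverage):
    """Estimate reads needed for a target mean coverage.

    Uses coverage = reads * mean_read_length / genome_size. Genome size and
    mean read length are NOT stated in the abstract; the caller supplies them.
    """
    if genome_size_bp <= 0 or mean_read_length_bp <= 0 or target_coverage < 0:
        raise ValueError("inputs must be positive")
    return math.ceil(target_coverage * genome_size_bp / mean_read_length_bp)


if __name__ == "__main__":
    # Read counts from the abstract: 30x at 21,521 reads (STEC) and
    # 30x at 11,802 reads (the second organism).
    stec_cov = coverage_at(2500, ref_reads=21521)
    second_cov = coverage_at(500, ref_reads=11802)
    print(f"STEC coverage at 2,500 reads: {stec_cov:.2f}x")
    print(f"Second organism coverage at 500 reads: {second_cov:.2f}x")
```

Under this linear assumption, the minimum read counts reported for gene detection correspond to roughly 3.5x coverage for STEC and roughly 1.3x for the second organism, far below the 30x figure. That is consistent with gene detection needing much less data than full-depth genome coverage.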

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10778541
DOI: http://dx.doi.org/10.3390/foods13010016

