Genomic sequencing of clinical samples to identify emerging variants of SARS-CoV-2 has been a key public health tool for curbing the spread of the virus. As a result, an unprecedented number of SARS-CoV-2 genomes were sequenced during the COVID-19 pandemic, which allowed for rapid identification of genetic variants, enabling the timely design and testing of therapies and deployment of new vaccine formulations to combat the new variants. However, despite the technological advances of deep sequencing, the analysis of the raw sequence data generated globally is neither standardized nor consistent, leading to vastly disparate sequences that may impact identification of variants.
During the COVID-19 pandemic, SARS-CoV-2 surveillance efforts integrated genome sequencing of clinical samples to identify emergent viral variants and to support rapid experimental examination of genome-informed vaccine and therapeutic designs. Given the broad range of methods applied to generate new viral genomes, it is critical that consensus and variant calling tools yield consistent results across disparate pipelines. Here we examine the impact of sequencing technologies (Illumina and Oxford Nanopore) and 7 different downstream bioinformatic protocols on SARS-CoV-2 variant calling as part of the NIH Accelerating COVID-19 Therapeutic Interventions and Vaccines (ACTIV) Tracking Resistance and Coronavirus Evolution (TRACE) initiative, a public-private partnership established to address the COVID-19 outbreak.
Despite unprecedented global sequencing and surveillance of SARS-CoV-2, timely identification of the emergence and spread of novel variants of concern (VoCs) remains a challenge. Several million raw genome sequencing runs are now publicly available. We sought to survey these datasets for intrahost variation to study emerging mutations of concern.
Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of coronavirus disease 2019 (COVID-19), has spread globally and is being surveilled with an international genome sequencing effort. Surveillance consists of sample acquisition, library preparation, and whole genome sequencing. This has necessitated a classification scheme detailing Variants of Concern (VOC) and Variants of Interest (VOI), and the rapid expansion of bioinformatics tools for sequence analysis.
Summary: Genomics has become an essential technology for surveilling emerging infectious disease outbreaks. A range of technologies and strategies for pathogen genome enrichment and sequencing are being used by laboratories worldwide, together with different, and sometimes ad hoc, analytical procedures for generating genome sequences. A fully integrated analytical process for raw sequence to consensus genome determination, suited to outbreaks such as the ongoing COVID-19 pandemic, is critical to provide a solid genomic basis for epidemiological analyses and well-informed decision making.
To date, the potential impact of viral communities on biogeochemical cycles in soil has largely been inferred from correlational evidence, such as virus-driven changes in microbial abundances, viral auxiliary metabolic genes, and links with soil physicochemical properties. To more directly test the impact of soil viruses on carbon cycling during plant litter decomposition, we added concentrated viral community suspensions to complex litter decomposer communities in 40-day microcosm experiments. Microbial communities from two New Mexico alpine soils, Pajarito (PJ) and Santa Fe (SF), were inoculated onto grass litter on sand, and three treatments were applied in triplicate to each set of microcosms: addition of buffer (no added virus), live virus (+virus), or killed-virus (+killed-virus) fractions extracted from the same soil.
Non-typhoidal Salmonella (NTS) is a major global health concern that often causes bloodstream infections in areas of the world affected by malnutrition and comorbidities such as HIV and malaria. Developing a strategy to control the emergence and spread of highly invasive and antimicrobial-resistant NTS isolates requires a comprehensive analysis of epidemiological factors and molecular pathogenesis. Here, we characterize 11 NTS isolates that caused bloodstream infections in pediatric patients in Siaya, Kenya from 2003 to 2010.
Viruses represent important test cases for data federation due to their genome size and the rapid increase in sequence data in publicly available databases. However, some consequences of previously decentralized (unfederated) data are lack of consensus or comparisons between feature annotations. Unifying or displaying alternative annotations should be a priority both for communities with robust entry representation and for nascent communities with burgeoning data sources.
Summary: Polymerase chain reaction-based assays are the current gold standard for detecting and diagnosing SARS-CoV-2. However, as SARS-CoV-2 mutates, we need to constantly assess whether existing PCR-based assays will continue to detect all known viral strains. To enable the continuous monitoring of SARS-CoV-2 assays, we have developed a web-based assay validation algorithm that checks existing PCR-based assays against the ever-expanding genome databases for SARS-CoV-2 using both thermodynamic and edit-distance metrics.
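As a rough illustration of the edit-distance side of such a check, the sketch below scores a primer against each genome by the smallest edit distance to any substring of that genome. It is not the published algorithm: the function names, the `max_edits` threshold, and the toy sequences are assumptions made for this example, and the thermodynamic metric mentioned in the abstract is not modeled at all.

```python
def min_edit_distance(primer: str, target: str) -> int:
    """Smallest edit distance between `primer` and any substring of `target`.

    Hypothetical helper: a standard approximate-matching dynamic program
    in which the match may start and end anywhere in the target.
    """
    # Row for the empty primer prefix: it matches anywhere at zero cost.
    prev = [0] * (len(target) + 1)
    for i, p in enumerate(primer, start=1):
        # Column 0: i primer bases aligned against nothing costs i edits.
        curr = [i] + [0] * len(target)
        for j, t in enumerate(target, start=1):
            curr[j] = min(
                prev[j - 1] + (p != t),  # match or substitution
                prev[j] + 1,             # primer base with no target partner
                curr[j - 1] + 1,         # extra target base inside the match
            )
        prev = curr
    # The match may end at any target position.
    return min(prev)


def assay_still_matches(primer, genomes, max_edits=2):
    """Flag, per genome, whether the primer binds within `max_edits` edits.

    `max_edits` is an illustrative cutoff, not a value from the source.
    """
    return {
        name: min_edit_distance(primer, seq) <= max_edits
        for name, seq in genomes.items()
    }


if __name__ == "__main__":
    # Toy sequences, not real SARS-CoV-2 data.
    toy_genomes = {"exact": "TTACGTTT", "one_mismatch": "TTACCTTT"}
    print(assay_still_matches("ACGT", toy_genomes, max_edits=0))
```

In practice a real check would also need to handle reverse-complement primers, ambiguous bases, and genome-scale inputs, for which this quadratic-time loop would be too slow.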
Persister cells are genetically identical variants in a bacterial population that have phenotypically modified their physiology to survive environmental stress. In bacterial pathogens, persisters are able to survive antibiotic treatment and reinfect patients in a frustrating cycle of chronic infection. To better define core persistence mechanisms for therapeutics development, we performed transcriptomics analyses of populations enriched for persisters via three methods: flow sorting for low proton motive force, meropenem treatment, and culture aging.
There is growing interest in reconstructing phylogenies from the large number of genome sequencing projects that target related viral, bacterial or eukaryotic organisms. To facilitate the construction of standardized and robust phylogenies for disparate types of projects, we have developed a complete bioinformatic workflow, with a web-based component to perform phylogenetic and molecular evolutionary (PhaME) analysis from sequencing reads, draft assemblies or completed genomes of closely related organisms. Furthermore, the ability to incorporate raw data, including some metagenomic samples containing a target organism (e.
Sequencing-based analyses of microbiomes have traditionally focused on addressing the question of community membership and profiling taxonomic abundance through amplicon sequencing of 16S rRNA genes. More recently, shotgun metagenomics, which involves the random sequencing of all genomic content of a microbiome, has dominated this arena due to advancements in sequencing technology throughput and capability to profile genes as well as microbiome membership. While these methods have revealed a great number of insights into a wide variety of microbiomes, both of these approaches only describe the presence of organisms or genes, and not whether they are active members of the microbiome.
Many of the sequenced bacterial and archaeal genomes encode regions of viral provenance. Yet, not all of these regions encode bona fide viruses. Gene transfer agents (GTAs) are thought to be former viruses that are now maintained in genomes of some bacteria and archaea and are hypothesized to enable exchange of DNA within bacterial populations.
A wealth of viral data sits untapped in publicly available metagenomic data sets when it might be extracted to create a usable index for the virological research community. We hypothesized that work of this complexity and scale could be done in a hackathon setting. Ten teams, comprising over 40 participants from six countries, assembled to create a crowd-sourced set of analysis and processing pipelines for a complex biological data set in a three-day event on the San Diego State University campus starting 9 January 2019.
Appl Environ Microbiol
February 2019
Several bacterial and archaeal lineages produce nanostructures that morphologically resemble small tailed viruses, but, unlike most viruses, contain apparently random pieces of the host genome. Since these elements can deliver the packaged DNA to other cells, they were dubbed gene transfer agents (GTAs). Because many genes involved in GTA production have viral homologs, it has been hypothesized that the GTA ancestor was a virus.
Next-generation sequencing (NGS) offers unparalleled resolution for untargeted organism detection and characterization. However, the majority of NGS analysis programs require users to be proficient in programming and command-line interfaces. EDGE bioinformatics was developed to offer scientists with little to no bioinformatics expertise a point-and-click platform for analyzing sequencing data in a rapid and reproducible manner.
Gene transfer agents (GTAs) are phage-like particles that can package and transfer a random piece of the producing cell's genome, but are unable to transfer all the genes required for their own production. As such, GTAs represent an evolutionary conundrum: are they selfish genetic elements propagating through an unknown mechanism, defective viruses, or viral structures "repurposed" by cells for gene exchange, as their name implies? In Rhodobacter capsulatus, production of the R. capsulatus GTA (RcGTA) particles is associated with a cluster of genes resembling a small prophage.
Background: In the last 5 years, the rapid pace of innovations and improvements in sequencing technologies has completely changed the landscape of metagenomic and metagenetic experiments. Therefore, it is critical to benchmark the various methodologies for interrogating the composition of microbial communities, so that we can assess their strengths and limitations. The most common phylogenetic marker for microbial community diversity studies is the 16S ribosomal RNA gene and in the last 10 years the field has moved from sequencing a small number of amplicons and samples to more complex studies where thousands of samples and multiple different gene regions are interrogated.
Despite over three decades of progress, extraction of high molecular weight (HMW) DNA from high clay soils or iron oxide cemented clay has remained challenging. HMW DNA is desirable for next generation sequencing as it yields the most comprehensive coverage. Several DNA extraction procedures were compared from samples that exhibit strong nucleic acid adsorption.
Bacterial and fungal communities associated with plant roots are central to host health, survival and growth. However, a robust understanding of the root microbiome and the factors that drive host-associated microbial community structure has remained elusive, especially in mature perennial plants from natural settings. Here, we investigated relationships of bacterial and fungal communities in the rhizosphere and root endosphere of the riparian tree species Populus deltoides, and the influence of soil parameters, environmental properties (host phenotype and aboveground environmental settings), host plant genotype (Simple Sequence Repeat (SSR) markers), season (Spring vs.
Next-generation sequencing has dramatically changed the landscape of microbial ecology, with large-scale and in-depth diversity studies now widely accessible. However, determining the accuracy of taxonomic and quantitative inferences and comparing results obtained with different approaches are complicated by incongruence of experimental and computational data types and also by lack of knowledge of the true ecological diversity. Here we used highly diverse bacterial and archaeal synthetic communities assembled from pure genomic DNAs to compare inferences from metagenomic and SSU rRNA amplicon sequencing.