Environmental DNA (eDNA) is becoming an increasingly important tool in diverse scientific fields from ecological biomonitoring to wastewater surveillance of viruses. The fundamental challenge in eDNA analyses has been the bioinformatical assignment of reads to taxonomic groups. It has long been known that full probabilistic methods for phylogenetic assignment are preferable, but unfortunately, such methods are computationally intensive and are typically inapplicable to modern next-generation sequencing data.
Wastewater-based surveillance (WBS) is an important epidemiological and public health tool for tracking pathogens across the scale of a building, neighbourhood, city, or region. WBS gained widespread adoption globally during the SARS-CoV-2 pandemic for estimating community infection levels by qPCR. Sequencing pathogen genes or genomes from wastewater adds information about pathogen genetic diversity, which can be used to identify viral lineages (including variants of concern) that are circulating in a local population.
Wastewater surveillance has become essential for monitoring the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The quantification of SARS-CoV-2 RNA in wastewater correlates with the coronavirus disease 2019 (COVID-19) caseload in a community. However, estimating the proportions of different SARS-CoV-2 haplotypes has remained technically difficult.
Motivation: Clustering is a fundamental task in the analysis of nucleotide sequences. Despite the exponential increase in the size of sequence databases of homologous genes, few methods exist to cluster divergent sequences. Traditional clustering methods have mostly focused on optimizing high-speed clustering of highly similar sequences.
Human severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is most closely related, by average genetic distance, to two coronaviruses isolated from bats, RaTG13 and RmYN02. However, there is a segment of high amino acid similarity between human SARS-CoV-2 and a pangolin-isolated strain, GD410721, in the receptor-binding domain (RBD) of the spike protein, a pattern that can be caused by either recombination or by convergent amino acid evolution driven by natural selection. We perform a detailed analysis of the synonymous divergence, which is less likely to be affected by selection than amino acid divergence, between human SARS-CoV-2 and related strains.
The rooting of the SARS-CoV-2 phylogeny is important for understanding the origin and early spread of the virus. Previously published phylogenies have used different rootings that do not always provide consistent results. We investigate several different strategies for rooting the SARS-CoV-2 tree and provide measures of statistical uncertainty for all methods.
Proc Natl Acad Sci U S A
October 2018
Animal domestication efforts have led to a shared spectrum of striking behavioral and morphological changes. To recapitulate this process, silver foxes have been selectively bred for tame and aggressive behaviors for more than 50 generations at the Institute for Cytology and Genetics in Novosibirsk, Russia. To understand the genetic basis and molecular mechanisms underlying the phenotypic changes, we profiled gene expression levels and coding SNP allele frequencies in two brain tissue specimens from 12 aggressive foxes and 12 tame foxes.
In contrast to infections with human immunodeficiency virus (HIV) in humans and simian immunodeficiency virus (SIV) in macaques, SIV infection of a natural host, sooty mangabeys (Cercocebus atys), is non-pathogenic despite high viraemia. Here we sequenced and assembled the genome of a captive sooty mangabey. We conducted genome-wide comparative analyses of transcript assemblies from C.
Concurr Comput
September 2014
A variety of extremely challenging biological sequence analyses were conducted on the XSEDE large shared memory resource Blacklight, using current bioinformatics tools and encompassing a wide range of scientific applications. These include genomic sequence assembly, very large metagenomic sequence assembly, transcriptome assembly, and sequencing error correction. The data sets used in these analyses included uncategorized fungal species, reference microbial data, very large soil and human gut microbiome sequence data, and primate transcriptomes, composed of both short-read and long-read sequence data.
RNA-based next-generation sequencing (RNA-Seq) provides a tremendous amount of new information regarding gene and transcript structure, expression and regulation. This is particularly true for non-coding RNAs, where whole transcriptome analyses have revealed that much of the genome is transcribed and that many non-coding transcripts have widespread functionality. However, uniform resources for raw, cleaned and processed RNA-Seq data are sparse for most organisms, and this is especially true for non-human primates (NHPs).
This work presents a new approach to the analysis of aperiodic pulsatile heteroscedastic time-series data, specifically hormone pulsatility. We use growth hormone (GH) concentration time-series data as an example application of this new algorithm. While many previously published approaches to the analysis of GH pulsatility are both subjective and cumbersome to use, AutoDecon is a nonsubjective, standardized, and completely automated algorithm.
Hormone signaling is often pulsatile, and multiparameter deconvolution procedures have long been used to identify and characterize secretory events. However, the existing programs have serious limitations, including the subjective nature of initial peak selection, lack of statistical verification of presumed bursts, and user-unfriendliness of the application. Here we describe a novel deconvolution program, AutoDecon, which addresses these concerns.