Large-scale studies are essential to answer questions about complex microbial communities that can be extremely dynamic across hosts, environments, and time points. However, managing acquisition, processing, and analysis of large numbers of samples poses many challenges, with cross-contamination being the biggest obstacle. Contamination complicates analysis and results in sample loss, leading to higher costs and constraints on mixed sample type study designs.
Background: Infection prevention (IP) measures are designed to mitigate the transmission of pathogens in healthcare. Using large-scale viral genomic and social network analyses, we determined if IP measures used during the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic were adequate in protecting healthcare workers (HCWs) and patients from acquiring SARS-CoV-2.
Methods: We performed retrospective cross-sectional analyses of viral genomics from all available SARS-CoV-2 viral samples collected at UC San Diego Health and social network analysis using the electronic medical record to derive temporospatial overlap of infections among related viromes and supplemented with contact tracing data.
The composition of the human vaginal microbiome has been linked to a variety of medical conditions including yeast infection, bacterial vaginosis, and sexually transmitted infection. The vaginal microbiome is becoming increasingly acknowledged as a key factor in personal health, and it is essential to establish methods to collect and process accurate samples with self-collection techniques to allow large, population-based studies. In this study, we investigate if using AssayAssure Genelock, a nucleic acid preservative, introduces microbial biases in self-collected vaginal samples.
Studies using 16S rRNA and shotgun metagenomics typically yield different results, usually attributed to PCR amplification biases. We introduce Greengenes2, a reference tree that unifies genomic and 16S rRNA databases in a consistent, integrated resource. By inserting sequences into a whole-genome phylogeny, we show that 16S rRNA and shotgun metagenomic data generated from the same samples agree in principal coordinates space, taxonomy and phenotype effect size when analyzed with the same tree.
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by heterogeneous cognitive, behavioral and communication impairments. Disruption of the gut-brain axis (GBA) has been implicated in ASD although with limited reproducibility across studies. In this study, we developed a Bayesian differential ranking algorithm to identify ASD-associated molecular and taxa profiles across 10 cross-sectional microbiome datasets and 15 other datasets, including dietary patterns, metabolomics, cytokine profiles and human brain gene expression profiles.
Replicability is a well-established challenge in microbiome research with a variety of contributing factors at all stages, from sample collection to code execution. Here, we focus on voided urine sample storage conditions for urogenital microbiome analysis. Using urine samples collected from 10 adult females, we investigated the microbiome preservation efficacy of AssayAssure Genelock (Genelock), compared with no preservative, under different temperature conditions.
Microbiome data have several specific characteristics (sparsity and compositionality) that introduce challenges in data analysis. The integration of prior information regarding the data structure, such as phylogenetic structure and repeated-measure study designs, into analysis, is an effective approach for revealing robust patterns in microbiome data. Past methods have addressed some but not all of these challenges and features: for example, robust principal-component analysis (RPCA) addresses sparsity and compositionality; compositional tensor factorization (CTF) addresses sparsity, compositionality, and repeated measure study designs; and UniFrac incorporates phylogenetic information.
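To make the sparsity and compositionality problem concrete, here is a minimal sketch of a robust centered log-ratio style transform, the kind of preprocessing commonly paired with RPCA. It is an illustration under stated assumptions, not the authors' implementation: the function name `rclr` and the choice to mark zeros as NaN (missing) are assumptions made for this example.

```python
import math


def rclr(counts):
    """Robust centered log-ratio sketch for one sample (hypothetical helper).

    Logs are taken only over nonzero counts and centered on their mean,
    so zeros are treated as missing (NaN) instead of receiving a pseudocount.
    """
    logs = [math.log(c) for c in counts if c > 0]
    if not logs:
        # A sample with no observed features has nothing to transform.
        return [math.nan] * len(counts)
    center = sum(logs) / len(logs)
    return [math.log(c) - center if c > 0 else math.nan for c in counts]


# Toy sample: rescaling every count by the same factor (for example a
# different sequencing depth) leaves the transformed values unchanged.
sample = [1, 0, 4, 16]
deeper = [10 * c for c in sample]
transformed = rclr(sample)
transformed_deeper = rclr(deeper)
```

Because only ratios between observed features survive the transform, the result does not depend on total read count, which is the compositional property the abstract refers to.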
The number of publicly available microbiome samples is continually growing. As data set size increases, bottlenecks arise in standard analytical pipelines. Faith's phylogenetic diversity (Faith's PD) is a highly utilized phylogenetic alpha diversity metric that has thus far failed to effectively scale to trees with millions of vertices.
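For readers unfamiliar with the metric, Faith's PD for a sample is the total branch length of the part of the tree that connects the sample's observed taxa to the root. The sketch below is a naive reference version for illustration only, not the scalable algorithm the abstract alludes to; the parent-map tree encoding and the names `faith_pd` and `TOY_TREE` are assumptions made for this example.

```python
def faith_pd(parents, observed_tips):
    """Naive Faith's PD sketch (hypothetical helper).

    parents maps each node to (parent, branch_length); the root maps to
    (None, 0.0). Each branch on a tip-to-root path is counted once.
    """
    visited = set()
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Walk toward the root, stopping at the first branch already counted.
        while node is not None and node not in visited:
            visited.add(node)
            parent, length = parents[node]
            total += length
            node = parent
    return total


# Toy tree: root -> A (1.0), root -> t3 (2.0), A -> t1 (0.5), A -> t2 (0.25)
TOY_TREE = {
    "root": (None, 0.0),
    "A": ("root", 1.0),
    "t1": ("A", 0.5),
    "t2": ("A", 0.25),
    "t3": ("root", 2.0),
}

pd_pair = faith_pd(TOY_TREE, ["t1", "t2"])  # shared branch to A counted once
```

This version walks every tip to the root for each sample, which is the kind of per-sample traversal that becomes a bottleneck on very large trees and sample counts.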
Standard workflows for analyzing microbiomes often include the creation and curation of phylogenetic trees. Here we present EMPress, an interactive web tool for visualizing trees in the context of microbiome, metabolome, and other community data scalable to trees with well over 500,000 nodes. EMPress provides novel functionality, including ordination integration and animations, alongside many standard tree visualization features and thus simplifies exploratory analyses of many forms of 'omic data.
Synergistic effects of bacteria on viral stability and transmission are widely documented but remain unclear in the context of SARS-CoV-2. We collected 972 samples from hospitalized ICU patients with coronavirus disease 2019 (COVID-19), their health care providers, and hospital surfaces before, during, and after admission. We screened for SARS-CoV-2 using RT-qPCR, characterized microbial communities using 16S rRNA gene amplicon sequencing, and contextualized the massive microbial diversity in this dataset in a meta-analysis of over 20,000 samples.
Rapid growth of genome data provides opportunities for updating microbial evolutionary relationships, but this is challenged by the discordant evolution of individual genes. Here we build a reference phylogeny of 10,575 evenly-sampled bacterial and archaeal genomes, based on a comprehensive set of 381 markers, using multiple strategies. Our trees indicate remarkably closer evolutionary proximity between Archaea and Bacteria than previous estimates that were limited to fewer "core" genes, such as the ribosomal proteins.