Plasmids alter microbial evolution and lifestyles by mobilizing genes that often confer fitness in changing environments across clades. Yet our ecological and evolutionary understanding of naturally occurring plasmids is far from complete. Here we developed a machine-learning model, PlasX, which identified 68,350 non-redundant plasmids across human gut metagenomes and organized them into 1,169 evolutionarily cohesive 'plasmid systems' using our sequence containment-aware network-partitioning algorithm, MobMess.
Plasmids are extrachromosomal genetic elements that often encode fitness-enhancing features. However, many bacteria carry "cryptic" plasmids that do not confer clear beneficial functions. We identified one such cryptic plasmid, pBI143, which is ubiquitous across industrialized gut microbiomes and is 14 times as numerous as crAssphage, currently established as the most abundant extrachromosomal genetic element in the human gut.
A wide variety of human diseases are associated with loss of microbial diversity in the human gut, inspiring a great interest in the diagnostic or therapeutic potential of the microbiota. However, the ecological forces that drive diversity reduction in disease states remain unclear, rendering it difficult to ascertain the role of the microbiota in disease emergence or severity. One hypothesis to explain this phenomenon is that microbial diversity is diminished as disease states select for microbial populations that are more fit to survive environmental stress caused by inflammation or other host factors.
Background: Changes in microbial community composition as a function of human health and disease states have sparked remarkable interest in the human gut microbiome. However, establishing reproducible insights into the determinants of microbial succession in disease has been a formidable challenge.
Results: Here we use fecal microbiota transplantation (FMT) as an in natura experimental model to investigate the association between metabolic independence and resilience in stressed gut environments.
Plasmids are extrachromosomal genetic elements that often encode fitness-enhancing features. However, many bacteria carry 'cryptic' plasmids that do not confer clear beneficial functions. We identified one such cryptic plasmid, pBI143, which is ubiquitous across industrialized gut microbiomes and is 14 times as numerous as crAssphage, currently established as the most abundant genetic element in the human gut.
A major goal of cancer research is to understand how mutations distributed across diverse genes affect common cellular systems, including multiprotein complexes and assemblies. Two challenges (how to comprehensively map such systems and how to identify which are under mutational selection) have hindered this understanding. Accordingly, we created a comprehensive map of cancer protein systems integrating both new and published multi-omic interaction data at multiple scales of analysis.
Recent studies of the tumor genome seek to identify cancer pathways as groups of genes in which mutations are epistatic with one another or, specifically, "mutually exclusive." Here, we show that most mutations are mutually exclusive not due to pathway structure but to interactions with disease subtype and tumor mutation load. In particular, many cancer driver genes are mutated preferentially in tumors with few mutations overall, causing mutations in these cancer genes to appear mutually exclusive with numerous others.
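As an illustrative sketch only (not the authors' analysis), the confounding described in the abstract above can be reproduced with a small stratified example. All counts below are hypothetical and chosen for illustration: gene A stands in for a driver mutated mostly in low-load tumors, and gene B for a gene mutated mostly in high-load tumors. Within each stratum the two genes are independent, yet pooling the strata yields an odds ratio below 1, which would read as apparent mutual exclusivity.

```python
# Hypothetical 2x2 counts, chosen for illustration only (not data from the study).
# Each table is (both mutated, A only, B only, neither) for two genes A and B.

def odds_ratio(both, a_only, b_only, neither):
    """Odds ratio of co-mutation; values below 1 suggest mutual exclusivity."""
    return (both * neither) / (a_only * b_only)

# Low-mutation-load tumors (n = 100): A mutated in 40%, B in 10%, independent.
low_load = (4, 36, 6, 54)
# High-mutation-load tumors (n = 100): A mutated in 10%, B in 60%, independent.
high_load = (6, 4, 54, 36)

# Pooling the strata ignores mutation load, as a naive exclusivity test would.
pooled = tuple(x + y for x, y in zip(low_load, high_load))

print("low-load odds ratio: ", odds_ratio(*low_load))
print("high-load odds ratio:", odds_ratio(*high_load))
print("pooled odds ratio:   ", odds_ratio(*pooled))
```

In this toy setup the within-stratum odds ratios equal 1 (no association), while the pooled odds ratio is 0.375, so the apparent exclusivity comes entirely from the difference in mutation load between strata.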
Systems biology requires not only genome-scale data but also methods to integrate these data into interpretable models. Previously, we developed approaches that organize omics data into a structured hierarchy of cellular components and pathways, called a "data-driven ontology." Such hierarchies recapitulate known cellular subsystems and discover new ones.
A major ambition of artificial intelligence lies in translating patient data to successful therapies. Machine learning models face particular challenges in biomedicine, however, including handling of extreme data heterogeneity and lack of mechanistic insight into predictions. Here, we argue for "visible" approaches that guide model structure with experimental biology.
Although cancer genomes are replete with noncoding mutations, the effects of these mutations remain poorly characterized. Here we perform an integrative analysis of 930 tumor whole genomes and matched transcriptomes, identifying a network of 193 noncoding loci in which mutations disrupt target gene expression. These 'somatic eQTLs' (expression quantitative trait loci) are frequently mutated in specific cancer tissues, and the majority can be validated in an independent cohort of 3,382 tumors.
Although artificial neural networks are powerful classifiers, their internal structures are hard to interpret. In the life sciences, extensive knowledge of cell biology provides an opportunity to design visible neural networks (VNNs) that couple the model's inner workings to those of real systems. Here we develop DCell, a VNN embedded in the hierarchical structure of 2,526 subsystems comprising a eukaryotic cell (http://d-cell.
Pac Symp Biocomput
August 2018
Analysis of patient genomes and transcriptomes routinely recognizes new gene sets associated with human disease. Here we present an integrative natural language processing system which infers common functions for a gene set through automatic mining of the scientific literature with biological networks. This system links genes with associated literature phrases and combines these links with protein interactions in a single heterogeneous network.
Background: Approximately 12% of all ureteral stents placed are retained or "forgotten." Forgotten stents are associated with significant safety concerns as well as increased costs and legal issues. Retained ureteral stents (RUS) often occur due to lack of clinical follow-up, communication or language barriers, and economic concerns.
Background: Global but predictable changes impact the DNA methylome as we age, acting as a type of molecular clock. This clock can be hastened by conditions that decrease lifespan, raising the question of whether it can also be slowed, for example, by conditions that increase lifespan. Mice are particularly appealing organisms for studies of mammalian aging; however, epigenetic clocks have thus far been formulated only in humans.
Accurately translating genotype to phenotype requires accounting for the functional impact of genetic variation at many biological scales. Here we present a strategy for genotype-phenotype reasoning based on existing knowledge of cellular subsystems. These subsystems and their hierarchical organization are defined by the Gene Ontology or a complementary ontology inferred directly from previously published datasets.