Cells are essential to understanding health and disease, yet traditional models fall short of modeling and simulating their function and behavior. Advances in AI and omics offer groundbreaking opportunities to create an AI virtual cell (AIVC), a multi-scale, multi-modal large-neural-network-based model that can represent and simulate the behavior of molecules, cells, and tissues across diverse states. This Perspective provides a vision on their design and how collaborative efforts to build AIVCs will transform biological research by allowing high-fidelity simulations, accelerating discoveries, and guiding experimental studies, offering new opportunities for understanding cellular functions and fostering interdisciplinary collaborations in open science.
The cell is arguably the most fundamental unit of life and is central to understanding biology. Accurate modeling of cells is important for this understanding as well as for determining the root causes of disease. Recent advances in artificial intelligence (AI), combined with the ability to generate large-scale experimental data, present novel opportunities to model cells.
Checkpoint inhibitors (CPIs) targeting programmed death 1 (PD-1)/programmed death ligand 1 (PD-L1) and cytotoxic T lymphocyte antigen 4 (CTLA-4) have revolutionized cancer treatment but can trigger autoimmune complications, including CPI-induced diabetes mellitus (CPI-DM), which occurs preferentially with PD-1 blockade. We found evidence of pancreatic inflammation in patients with CPI-DM with shrinkage of pancreases, increased pancreatic enzymes, and in a case from a patient who died with CPI-DM, peri-islet lymphocytic infiltration. In the NOD mouse model, anti-PD-L1 but not anti-CTLA-4 induced diabetes rapidly.
Phenotypic plasticity describes the ability of cancer cells to undergo dynamic, nongenetic cell state changes that amplify cancer heterogeneity to promote metastasis and therapy evasion. Thus, cancer cells occupy a continuous spectrum of phenotypic states connected by trajectories defining dynamic transitions upon a cancer cell state landscape. With technologies proliferating to systematically record molecular mechanisms at single-cell resolution, we illuminate manifold learning techniques as emerging computational tools to effectively model cell state dynamics in a way that mimics our understanding of the cell state landscape.
As the biomedical community produces datasets that are increasingly complex and high dimensional, there is a need for more sophisticated computational tools to extract biological insights. We present Multiscale PHATE, a method that sweeps through all levels of data granularity to learn abstracted biological features directly predictive of disease outcome. Built on a coarse-graining process called diffusion condensation, Multiscale PHATE learns a data topology that can be analyzed at coarse resolutions for high-level summarizations of data and at fine resolutions for detailed representations of subsets.
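The coarse-graining idea behind diffusion condensation can be illustrated with a toy example. The following is a simplified sketch on one-dimensional data, not the authors' implementation: the function names, the Gaussian kernel, the bandwidth growth factor of 1.05, and the merge tolerance are all assumptions chosen for illustration. Each step moves every point toward the kernel-weighted mean of the data, and the number of distinct groups is recorded so that early steps give fine resolution and later steps give coarse resolution.

```python
import math

def condensation_step(points, bandwidth):
    """Replace every point by the Gaussian-kernel-weighted mean of all points."""
    moved = []
    for x in points:
        weights = [math.exp(-((x - y) / bandwidth) ** 2) for y in points]
        moved.append(sum(w * y for w, y in zip(weights, points)) / sum(weights))
    return moved

def count_clusters(points, merge_tol):
    """Count groups of points separated by gaps larger than merge_tol (1-D only)."""
    ordered = sorted(points)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > merge_tol)

def diffusion_condensation(points, bandwidth=0.5, merge_tol=1e-3, max_iter=200):
    """Return the cluster count after each step, from fine to coarse resolution."""
    history = [count_clusters(points, merge_tol)]
    for _ in range(max_iter):
        points = condensation_step(points, bandwidth)
        history.append(count_clusters(points, merge_tol))
        if history[-1] == 1:
            break
        # Assumed schedule: widen the kernel slowly so distant groups eventually merge.
        bandwidth *= 1.05
    return history

# Two well-separated groups of three points each (illustrative data).
history = diffusion_condensation([0.0, 0.1, 0.2, 3.0, 3.1, 3.2])
```

Reading `history` from left to right traces the hierarchy: six separate points, then two groups, then a single group once the kernel is wide enough to bridge the gap.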
The evolution of uniquely human traits likely entailed changes in developmental gene regulation. Human Accelerated Regions (HARs), which include transcriptional enhancers harboring a significant excess of human-specific sequence changes, are leading candidates for driving gene regulatory modifications in human development. However, insight into whether HARs alter the level, distribution, and timing of endogenous gene expression remains limited.
Often when biological entities are measured in multiple ways, there are distinct categories of information: some is easy-to-obtain information (EI) and can be gathered on virtually every subject of interest, while other information is hard-to-obtain information (HI) and can only be gathered on some. We propose building a model to make probabilistic predictions of HI using EI. Our feature mapping GAN (FMGAN), based on the conditional GAN framework, uses an embedding network to process conditions as part of the conditional GAN training to create manifold structure when it is not readily present in the conditions.
Current methods for comparing single-cell RNA sequencing datasets collected in multiple conditions focus on discrete regions of the transcriptional state space, such as clusters of cells. Here we quantify the effects of perturbations at the single-cell level using a continuous measure of the effect of a perturbation across the transcriptomic space. We describe this space as a manifold and develop a relative likelihood estimate of observing each cell in each of the experimental conditions using graph signal processing.
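A per-cell relative likelihood of this kind can be sketched as smoothing condition labels over a cell-similarity graph. The example below is a minimal illustration under stated assumptions, not the method from the paper: cells are one-dimensional values, the graph is a symmetric k-nearest-neighbour graph, and the low-pass filter is replaced by a few steps of lazy random-walk averaging. All function names and parameters are hypothetical.

```python
def knn_graph(points, k):
    """Link each point to its k nearest neighbours and symmetrise the result."""
    n = len(points)
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: abs(points[i] - points[j]))
        for j in order[:k]:
            neighbours[i].add(j)
            neighbours[j].add(i)
    return neighbours

def diffuse(signal, neighbours, steps):
    """Low-pass filter a graph signal by repeated lazy random-walk averaging."""
    for _ in range(steps):
        signal = [
            0.5 * signal[i]
            + 0.5 * sum(signal[j] for j in neighbours[i]) / len(neighbours[i])
            for i in range(len(signal))
        ]
    return signal

def relative_likelihood(points, labels, k=3, steps=5):
    """Return, for every cell, the smoothed share of each condition around it."""
    conditions = sorted(set(labels))
    neighbours = knn_graph(points, k)
    smoothed = {}
    for c in conditions:
        count = labels.count(c)
        # Normalise each indicator so conditions with more cells do not dominate.
        indicator = [1.0 / count if lab == c else 0.0 for lab in labels]
        smoothed[c] = diffuse(indicator, neighbours, steps)
    result = []
    for i in range(len(points)):
        total = sum(smoothed[c][i] for c in conditions)
        result.append({c: smoothed[c][i] / total for c in conditions})
    return result

# Illustrative data: two conditions occupying separate regions of the space.
cells = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
labels = ["ctrl"] * 4 + ["treated"] * 4
scores = relative_likelihood(cells, labels, k=2)
```

Each entry of `scores` is a continuous value per cell and per condition, so the effect of a perturbation can be read off without first assigning cells to clusters.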
Obesity is a major modifiable risk factor for pancreatic ductal adenocarcinoma (PDAC), yet how and when obesity contributes to PDAC progression is not well understood. Leveraging an autochthonous mouse model, we demonstrate a causal and reversible role for obesity in early PDAC progression, showing that obesity markedly enhances tumorigenesis, while genetic or dietary induction of weight loss intercepts cancer development. Molecular analyses of human and murine samples define microenvironmental consequences of obesity that foster tumorigenesis rather than new driver gene mutations, including significant pancreatic islet cell adaptation in obesity-associated tumors.
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Cancer is a hyper-proliferative disease. Whether the proliferative state originates from the cell-of-origin or emerges later remains difficult to resolve. By tracking de novo transformation from normal hematopoietic progenitors expressing an acute myeloid leukemia (AML) oncogene MLL-AF9, we reveal that the cell cycle rate heterogeneity among granulocyte-macrophage progenitors (GMPs) determines their probability of transformation.
The high-dimensional data created by high-throughput technologies require visualization tools that reveal data structure and patterns in an intuitive form. We present PHATE, a visualization method that captures both local and global nonlinear structure using an information-geometric distance between data points. We compare PHATE to other tools on a variety of artificial and biological datasets, and find that it consistently preserves a range of patterns in data, including continual progressions, branches and clusters, better than other tools.
Appl Environ Microbiol
November 2016
As Earth's climate warms, soil carbon pools and the microbial communities that process them may change, altering the way in which carbon is recycled in soil. In this study, we used a combination of metagenomics and bacterial cultivation to evaluate the hypothesis that experimentally raising soil temperatures by 5°C for 5, 8, or 20 years increased the potential for temperate forest soil microbial communities to degrade carbohydrates. Warming decreased the proportion of carbohydrate-degrading genes in the organic horizon derived from eukaryotes and increased the fraction of genes in the mineral soil associated with Actinobacteria in all studies.