85 results match your criteria: "Arc Institute.[Affiliation]"

How to build the virtual cell with artificial intelligence: Priorities and opportunities.

Cell

December 2024

Chan Zuckerberg Initiative, Redwood City, CA, USA; Department of Bioengineering, Stanford University, Stanford, CA, USA; Department of Applied Physics, Stanford University, Stanford, CA, USA.

Cells are essential to understanding health and disease, yet traditional models fall short of modeling and simulating their function and behavior. Advances in AI and omics offer groundbreaking opportunities to create an AI virtual cell (AIVC), a multi-scale, multi-modal large-neural-network-based model that can represent and simulate the behavior of molecules, cells, and tissues across diverse states. This Perspective provides a vision on their design and how collaborative efforts to build AIVCs will transform biological research by allowing high-fidelity simulations, accelerating discoveries, and guiding experimental studies, offering new opportunities for understanding cellular functions and fostering interdisciplinary collaborations in open science.

The bacterial retron reverse transcriptase system has served as an intracellular factory for single-stranded DNA in many biotechnological applications. In these technologies, a natural retron non-coding RNA (ncRNA) is modified to encode a template for the production of custom DNA sequences by reverse transcription. The efficiency of reverse transcription is a major limiting step for retron technologies, but we lack systematic knowledge of how to improve or maintain reverse transcription efficiency while changing the retron sequence for custom DNA production.

Organ function requires coordinated activities of thousands of genes in distinct, spatially organized cell types. Understanding the basis of emergent tissue function requires approaches to dissect the genetic control of diverse cellular and tissue phenotypes. Here, we develop paired imaging and sequencing methods to construct large-scale, multi-modal genotype-phenotype maps in tissue with pooled genetic perturbations.

Article Synopsis
  • The study utilizes a generative AI model called Orion to analyze blood samples from 1,050 individuals with non-small cell lung cancer (NSCLC) and matched controls, focusing on orphan non-coding RNAs.
  • Orion significantly outperforms traditional methods, achieving 94% sensitivity and 87% specificity in cancer detection, and shows over 30% higher sensitivity on validation datasets compared to other approaches.

Histone proteins have traditionally been thought to be restricted to eukaryotes and most archaea, with eukaryotic nucleosomal histones deriving from their archaeal ancestors. In contrast, bacteria lack histones as a rule. However, histone proteins have recently been identified in a few bacterial clades, most notably the phylum Bdellovibrionota, and these histones have been proposed to exhibit a range of divergent features compared to histones in archaea and eukaryotes.

Understanding how human gene expression is coordinately regulated by functional units of proteins across the genome remains a major biological goal. Here, we present COMET, a high-throughput screening platform for combinatorial effector targeting for the identification of transcriptional modulators. We generate libraries of combinatorial dCas9-based fusion proteins, containing two to six effector domains, allowing us to systematically investigate more than 110,000 combinations of effector proteins at endogenous human loci for their influence on transcription.

The genome is a sequence that encodes the DNA, RNA, and proteins that orchestrate an organism's function. We present Evo, a long-context genomic foundation model with a frontier architecture trained on millions of prokaryotic and phage genomes, and report scaling laws on DNA to complement observations in language and vision. Evo generalizes across DNA, RNA, and proteins, enabling zero-shot function prediction competitive with domain-specific language models and the generation of functional CRISPR-Cas and transposon systems, representing the first examples of protein-RNA and protein-DNA codesign with a language model.

The search for new approaches in cancer therapy requires a mechanistic understanding of cancer vulnerabilities and anti-cancer drug mechanisms of action. Problematically, some effective therapeutics target cancer vulnerabilities that have poorly defined mechanisms of anti-cancer activity. One such drug is decitabine, a frontline therapeutic approved for the treatment of high-risk acute myeloid leukemia (AML).

Effective genome editing requires a sufficient dose of CRISPR-Cas9 ribonucleoproteins (RNPs) to enter the target cell while minimizing immune responses, off-target editing and cytotoxicity. Clinical use of Cas9 RNPs currently entails electroporation into cells, but no systematic comparison of this method to packaged RNP delivery has been made. Here we compared two delivery strategies, electroporation and enveloped delivery vehicles (EDVs), to investigate the Cas9 dosage requirements for genome editing.

Mutations in lysosomal genes cause neurodegeneration and neurological lysosomal storage disorders (LSDs). Despite their essential role in brain homeostasis, the cell-type-specific composition and function of lysosomes remain poorly understood. Here, we report a quantitative protein atlas of the lysosome from mouse neurons, astrocytes, oligodendrocytes, and microglia.

In the canonical genetic code, many amino acids are assigned more than one codon. Work by us and others has shown that the choice among these synonymous codons is not random and carries regulatory and functional consequences. Existing protein foundation models ignore this context-dependent role of coding sequence in shaping the protein landscape of the cell.

The cell is arguably the most fundamental unit of life and is central to understanding biology. Accurate modeling of cells is important for this understanding as well as for determining the root causes of disease. Recent advances in artificial intelligence (AI), combined with the ability to generate large-scale experimental data, present novel opportunities to model cells.

Article Synopsis
  • The study examines the roles of the hippocampus and prefrontal cortex in learning and cognition, focusing on their molecular development through innovative genomic techniques.
  • Researchers used over 53,000 single-nucleus profiles to analyze DNA methylation and chromatin conformation changes, finding that these processes occur at different times during development.
  • The findings reveal distinct chromatin interactions in neurons versus glial cells and identify specific genetic variants associated with schizophrenia, highlighting the potential of single-cell multi-omics in understanding brain development and neuropsychiatric disorders.

Background: Genetic perturbation screens with single-cell readouts have enabled rich phenotyping of gene function and regulatory networks. These approaches have been challenging in vivo, especially in adult disease models such as cancer, which include mixtures of malignant and microenvironment cells. Glioblastoma (GBM) is a fatal cancer, and methods of systematically interrogating gene function and therapeutic targets in vivo, especially in combination with standard of care treatment such as radiotherapy, are lacking.

Integrative identification of non-coding regulatory regions driving metastatic prostate cancer.

Cell Rep

September 2024

Department of Biochemistry & Biophysics, University of California, San Francisco, San Francisco, CA, USA; Department of Urology, University of California, San Francisco, San Francisco, CA, USA; Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, USA; Arc Institute, Palo Alto, CA 94305, USA; Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA.

Large-scale sequencing efforts have been undertaken to understand the mutational landscape of the coding genome. However, the vast majority of variants occur within non-coding genomic regions. We designed an integrative computational and experimental framework to identify recurrently mutated non-coding regulatory regions that drive tumor progression.

Article Synopsis
  • Kidney failure significantly impacts health, prompting a large-scale study of 406,504 participants to uncover genetic factors affecting kidney function, identifying 430 key genetic loci.
  • The research revealed that 56% of inherited differences in kidney function are linked to regulatory elements in kidney tubule epithelial cells, while 7% relate to podocyte cells, suggesting these are crucial for gene expression.
  • Further analysis using advanced techniques like enhancer assays and CRISPRi identified specific genes (NDRG1, CCNB1, and STC1) regulated by these genetic loci, shedding light on their roles in kidney function.

Inferring the driving regulatory programs from comparative analysis of gene expression data is a cornerstone of systems biology. Many computational frameworks have been developed to address this problem, including our iPAGE (information-theoretic Pathway Analysis of Gene Expression) toolset, which uses information theory to detect non-random patterns of expression associated with given pathways or regulons. Our recent observations, however, indicate that existing approaches are susceptible to the technical biases that are inherent to most real-world annotations.

Article Synopsis
  • The study investigates the molecular pathways that contribute to metastasis in pancreatic ductal adenocarcinoma (PDAC) by labeling and analyzing tumor subclones to understand their pre-metastatic states.
  • Researchers identified distinct gene signatures for subclones with high and low metastatic potential, revealing specific pathway activations, such as the IL-1 pathway in metastasis-high subclones.
  • They found that human PDAC tumors exhibit similar metastatic potential signatures, with higher scores linked to worse patient prognosis, underscoring the potential conservation of these pathways across various epithelial cancers.

Enhancer of zeste homolog 2 (EZH2) is a histone methyltransferase and emerging therapeutic target that is overexpressed in most castration-resistant prostate cancers and implicated as a driver of disease progression and resistance to hormonal therapies. Here we define the lineage-specific action and differential activity of EZH2 in both prostate adenocarcinoma and neuroendocrine prostate cancer (NEPC) subtypes of advanced prostate cancer to better understand the role of EZH2 in modulating differentiation and lineage plasticity, and to identify mediators of response and resistance to EZH2 inhibitor therapy. Mechanistically, EZH2 modulates bivalent genes, resulting in upregulation of NEPC-associated transcriptional drivers (e.

Taurine is a conditionally essential micronutrient and one of the most abundant amino acids in humans. In endogenous taurine metabolism, dedicated enzymes are involved in the biosynthesis of taurine from cysteine and in the downstream metabolism of secondary taurine metabolites. One taurine metabolite is N-acetyltaurine.

Accurate quantification of transcript isoforms is crucial for understanding gene regulation, functional diversity, and cellular behavior. Existing RNA sequencing methods have significant limitations: short-read (SR) sequencing provides high depth but struggles with isoform deconvolution, whereas long-read (LR) sequencing offers isoform resolution at the cost of lower depth, higher noise, and technical biases. Addressing this gap, we introduce Multi-Platform Aggregation and Quantification of Transcripts (MPAQT), a generative model that combines the complementary strengths of different sequencing platforms to achieve state-of-the-art isoform-resolved transcript quantification, as demonstrated by extensive simulations and experimental benchmarks.

A systematic search for RNA structural switches across the human transcriptome.

Nat Methods

September 2024

Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA.

RNA structural switches are key regulators of gene expression in bacteria, but their characterization in Metazoa remains limited. Here, we present SwitchSeeker, a comprehensive computational and experimental approach for systematic identification of functional RNA structural switches. We applied SwitchSeeker to the human transcriptome and identified 245 putative RNA switches.

Article Synopsis
  • The study examines the processes regulating mRNA in cells by tracking its movement through different compartments using a technique called subcellular TimeLapse-seq.
  • The findings reveal that transcripts from functionally similar genes exhibit comparable movement rates, and a relationship between the DDX3X protein and nuclear export of RNA is established.
  • The research also shows that mRNA with longer chromatin residency tends to have longer poly(A) tails, while machine learning techniques were used to predict the various lifecycles of these mRNAs based on their molecular characteristics.