Publications by authors named "Barbara Wold"

The Human Cell Atlas (HCA) strives to build an exhaustive guidebook of the types and properties of human cells, and its success relies on sampling diverse populations, developmental stages, and tissue types. Its open science philosophy advocates the rapid, seamless sharing of data, as openly as possible. In light of the scope and ambition of such an international initiative, the HCA Ethics Working Group (EWG) has been working to build a solid foundation to address the complexities of data collection and sharing as part of Atlas development.


The Human Cell Atlas (HCA) is a global partnership "to create comprehensive reference maps of all human cells - the fundamental units of life - as a basis for both understanding human health and diagnosing, monitoring, and treating disease." ( https://www.humancellatlas.


Kidney injury disrupts the intricate renal architecture and triggers limited regeneration, together with injury-invoked inflammation and fibrosis. Deciphering the molecular pathways and cellular interactions driving these processes is challenging due to the complex tissue structure. Here, we apply single-cell spatial transcriptomics to examine ischemia-reperfusion injury in the mouse kidney.

Article Synopsis
  • Advances in RNA abundance quantification have been largely driven by high-throughput "short-read" technologies, but accurately quantifying full-length transcript isoforms has remained difficult.
  • Long-read sequencing platforms offer the potential for better isoform quantification, but bioinformatic challenges arise from the complexity of isoforms and genetic variation.
  • To address this, the study introduces 'lr-kallisto', a new tool that adapts existing RNA-seq quantification methods for long-read data, improving accuracy through exome capture.

Postnatal genomic regulation significantly influences tissue and organ maturation but is under-studied relative to existing genomic catalogs of adult tissues or prenatal development in mouse. The ENCODE4 consortium generated the first comprehensive single-nucleus resource of postnatal regulatory events across a diverse set of mouse tissues. The collection spans seven postnatal time points, mirroring human development from childhood to adulthood, and encompasses five core tissues.

Article Synopsis
  • They generated over 427 million long-read sequences and found that longer, more accurate sequences yield better transcript detection, while increased read depth enhances quantification.
  • The study suggests that using reference-based tools works best for well-annotated genomes and recommends incorporating extra data to better identify rare transcripts, providing a benchmark for improving transcriptome analysis techniques in the future.

The gene expression profiles of distinct cell types reflect complex genomic interactions among multiple simultaneous biological processes within each cell that can be altered by disease progression as well as genetic background. The identification of these active cellular programs is an open challenge in the analysis of single-cell RNA-seq data. Latent Dirichlet Allocation (LDA) is a generative method used to identify recurring patterns in count data, commonly referred to as topics, that can be used to interpret the state of each cell.
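
As a rough illustration of what "generative" means here, the sketch below samples one cell's gene counts under the standard LDA model: a per-cell topic mixture is drawn from a Dirichlet prior, and each count is assigned to a topic and then to a gene. This shows only the forward model, not the inference step and not the method of the article; the topic-gene probabilities, the prior `alpha`, and the function names are all hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_cell(topic_gene_probs, alpha, n_counts, rng):
    """Sample one cell's gene counts under the LDA generative process."""
    n_genes = len(topic_gene_probs[0])
    # Per-cell topic mixture (the quantity later used to interpret cell state).
    theta = sample_dirichlet(alpha, rng)
    counts = [0] * n_genes
    for _ in range(n_counts):
        # Pick a topic for this count, then a gene from that topic.
        topic = rng.choices(range(len(theta)), weights=theta)[0]
        gene = rng.choices(range(n_genes), weights=topic_gene_probs[topic])[0]
        counts[gene] += 1
    return theta, counts


# Hypothetical example: two topics over four genes.
rng = random.Random(0)
topics = [
    [0.45, 0.45, 0.05, 0.05],  # topic 0 favors genes 0 and 1
    [0.05, 0.05, 0.45, 0.45],  # topic 1 favors genes 2 and 3
]
theta, counts = generate_cell(topics, alpha=[0.5, 0.5], n_counts=200, rng=rng)
```

Fitting LDA to real data runs this process in reverse, estimating the topics and the per-cell mixtures from an observed count matrix.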


Human limbs emerge during the fourth post-conception week as mesenchymal buds, which develop into fully formed limbs over the subsequent months. This process is orchestrated by numerous temporally and spatially restricted gene expression programmes, making congenital alterations in phenotype common. Decades of work with model organisms have defined the fundamental mechanisms underlying vertebrate limb development, but an in-depth characterization of this process in humans has yet to be performed.

Article Synopsis
  • Transcription factors (TFs) are proteins that bind to DNA regulatory elements to regulate gene expression; the study focused on 680 DNA-associated proteins in the HepG2 cancer cell line.
  • Researchers modeled the impact of these TFs on gene expression, identifying 26 potential transcriptional repressors and analyzing high occupancy target sites in the context of genome organization.
  • The findings revealed closed chromatin regions with multiple TFs bound and highlighted a pair of TFs (MAFF/MAFK) linked to transcriptional repression, offering new insights into gene regulation in HepG2 cells.
Article Synopsis
  • The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium aimed to evaluate long-read sequencing for analyzing transcripts by generating over 427 million sequences from various species.
  • The findings highlighted that longer, accurate sequences yield better transcript identification, while increased read depth enhances quantification accuracy, particularly in well-annotated genomes.
  • The study serves as a benchmark for transcriptome analysis strategies and suggests using additional data for detecting rare transcripts or employing reference-free methods.

The Encyclopedia of DNA elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19000 functional genomics experiments across more than 1000 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community.


The majority of mammalian genes encode multiple transcript isoforms that result from differential promoter use, changes in exonic splicing, and alternative 3' end choice. Detecting and quantifying transcript isoforms across tissues, cell types, and species has been extremely challenging because transcripts are much longer than the short reads normally used for RNA-seq. By contrast, long-read RNA-seq (LR-RNA-seq) gives the complete structure of most transcripts.


Article Synopsis
  • A deep-learning model can predict allele-specific activity using only local nucleotide sequences, emphasizing key transcription-factor-binding motifs affected by genetic variants.
  • Combining EN-TEx with previous genome annotations shows significant connections between allele-specific loci and GWAS loci, and aids in transferring known eQTLs to challenging tissue types, improving personal functional genomics research.

Summary: Large-scale sharing of genomic quantification data requires standardized access interfaces. In this Global Alliance for Genomics and Health project, we developed RNAget, an API for secure access to genomic quantification data in matrix form. RNAget provides for slicing matrices to extract desired subsets of data and is applicable to all expression matrix-format data, including RNA sequencing and microarrays.
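
The idea of slicing an expression matrix can be sketched locally as follows. This is not the RNAget API or its request format; it is a minimal in-memory illustration, and the matrix layout, the function name `slice_matrix`, and the gene and sample identifiers are assumptions made for the example.

```python
def slice_matrix(matrix, feature_ids=None, sample_ids=None):
    """Return the sub-matrix for the requested features (rows) and samples (columns).

    `matrix` is a dict with "features", "samples", and row-major "values".
    Passing None for either argument keeps that whole axis.
    """
    features = feature_ids if feature_ids is not None else matrix["features"]
    samples = sample_ids if sample_ids is not None else matrix["samples"]
    # Map identifiers to their row and column positions in the full matrix.
    row_index = {f: i for i, f in enumerate(matrix["features"])}
    col_index = {s: j for j, s in enumerate(matrix["samples"])}
    rows = [row_index[f] for f in features]
    cols = [col_index[s] for s in samples]
    return {
        "features": list(features),
        "samples": list(samples),
        "values": [[matrix["values"][i][j] for j in cols] for i in rows],
    }


# Hypothetical 3 x 3 expression matrix (genes x samples).
expression = {
    "features": ["geneA", "geneB", "geneC"],
    "samples": ["s1", "s2", "s3"],
    "values": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
}
subset = slice_matrix(expression, feature_ids=["geneA", "geneC"], sample_ids=["s2"])
# subset["values"] is [[2.0], [8.0]]
```

A server-side interface would apply the same kind of selection before sending data, so that a client receives only the requested subset rather than the full matrix.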


The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution.


The rise in throughput and quality of long-read sequencing should allow unambiguous identification of full-length transcript isoforms. However, its application to single-cell RNA-seq has been limited by throughput and expense. Here we develop and characterize long-read Split-seq (LR-Split-seq), which uses combinatorial barcoding to sequence single cells with long reads.
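
The principle of combinatorial barcoding can be sketched as follows: each cell passes through several barcoding rounds, and the ordered combination of barcodes it collects serves as its identifier. The snippet is a generic illustration, not the LR-Split-seq protocol or pipeline; the number of rounds, the wells per round, the barcode labels, and the function names are hypothetical.

```python
from collections import defaultdict


def barcode_space(wells_per_round):
    """Number of distinct barcode combinations across all rounds."""
    total = 1
    for wells in wells_per_round:
        total *= wells
    return total


def group_reads_by_cell(reads):
    """Group read sequences by their ordered barcode combination."""
    cells = defaultdict(list)
    for barcodes, sequence in reads:
        # The tuple of per-round barcodes acts as the cell identifier.
        cells[tuple(barcodes)].append(sequence)
    return dict(cells)


# Assumed layout: three rounds with 96 wells each.
n_combinations = barcode_space([96, 96, 96])  # 96 * 96 * 96 = 884736

# Hypothetical reads: (barcodes from each round, read sequence).
reads = [
    (("A1", "B7", "C3"), "ACGTACGT"),
    (("A1", "B7", "C3"), "TTGCAAGC"),
    (("A2", "B7", "C3"), "GGCATTAC"),
]
cells = group_reads_by_cell(reads)
```

Because the number of combinations grows multiplicatively with each round, a small set of barcodes per round can label many cells without isolating them individually.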


The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development.


During mammalian embryogenesis, differential gene expression gradually builds the identity and complexity of each tissue and organ system. Here we systematically quantified mouse polyA-RNA from day 10.5 of embryonic development to birth, sampling 17 tissues and organs.


Transcription factors are DNA-binding proteins that have key roles in gene regulation. Genome-wide occupancy maps of transcriptional regulators are important for understanding gene regulation and its effects on diverse biological processes. However, only a minority of the more than 1,600 transcription factors encoded in the human genome has been assayed.


DNA-associated proteins (DAPs) classically regulate gene expression by binding to regulatory loci such as enhancers or promoters. As expanding catalogs of genome-wide DAP binding maps reveal thousands of loci that, unlike the majority of conventional enhancers and promoters, associate with dozens of different DAPs with apparently little regard for motif preference, an understanding of DAP association and coordination at such regulatory loci is essential to deciphering how these regions contribute to normal development and disease. In this study, we aggregated publicly available ChIP-seq data from 469 human DAPs assayed in three cell lines and integrated these data with an orthogonal data set of 352 nonredundant, in vitro-derived motifs mapped to the genome within DNase I hypersensitivity footprints to characterize regions with high numbers of DAP associations.


MicroRNAs (miRNAs) play a critical role as posttranscriptional regulators of gene expression. The ENCODE Project profiled the expression of miRNAs in an extensive set of organs during a time-course of mouse embryonic development and captured the expression dynamics of 785 miRNAs. We found distinct organ-specific and developmental stage-specific miRNA expression clusters, with an overall pattern of increasing organ-specific expression as embryonic development proceeds.


Intrathymic T cell development converts multipotent precursors to committed pro-T cells, silencing progenitor genes while inducing T cell genes, but the underlying steps have remained obscure. Single-cell profiling was used to define the order of regulatory changes, employing single-cell RNA sequencing (scRNA-seq) for full-transcriptome analysis, plus sequential multiplexed single-molecule fluorescent in situ hybridization (seqFISH) to quantitate functionally important transcripts in intrathymic precursors. Single-cell cloning verified high T cell precursor frequency among the immunophenotypically defined "early T cell precursor" (ETP) population; a discrete committed granulocyte precursor subset was also distinguished.
