Publications by authors named "Daniel Zerbino"

Objective: Epigenetic mechanisms, including DNA methylation (DNAm), have been proposed to play a key role in Crohn's disease (CD) pathogenesis. However, the specific cell types and pathways affected as well as their potential impact on disease phenotype and outcome remain unknown. We set out to investigate the role of intestinal epithelial DNAm in CD pathogenesis.

Background & Aims: Human intestinal epithelial organoids (IEOs) are a powerful tool to model major aspects of intestinal development, health, and diseases because patient-derived cultures retain many features found in vivo. A necessary aspect of the organoid model is the requirement to expand cultures in vitro through several rounds of passaging. This is of concern because the passaging of cells has been shown to affect cell morphology, ploidy, and function.

Ensembl (https://www.ensembl.org) is unique in its flexible infrastructure for access to genomic data and annotation.

As computational modeling becomes more essential to analyze and understand biological regulatory mechanisms, governance of the many databases and knowledge bases that support this domain is crucial to guarantee reliability and interoperability of resources. To address this, the COST Action Gene Regulation Ensemble Effort for the Knowledge Commons (GREEKC, CA15205, www.greekc.

The COVID-19 pandemic has seen unprecedented use of SARS-CoV-2 genome sequencing for epidemiological tracking and identification of emerging variants. Understanding the potential impact of these variants on the infectivity of the virus and the efficacy of emerging therapeutics and vaccines has become a cornerstone of the fight against the disease. To support the maximal use of genomic information for SARS-CoV-2 research, we launched the Ensembl COVID-19 browser, making SARS-CoV-2 the first virus to be encompassed within the Ensembl platform.

Many gene expression quantitative trait locus (eQTL) studies have published their summary statistics, which can be used to gain insight into complex human traits by downstream analyses, such as fine mapping and co-localization. However, technical differences between these datasets are a barrier to their widespread use. Consequently, target genes for most genome-wide association study (GWAS) signals have still not been identified.

The Sequence Ontology (SO) is a structured, controlled vocabulary that provides terms and definitions for genomic annotation. The Gene Regulation Ensemble Effort for the Knowledge Commons (GREEKC) initiative has gathered input from many groups of researchers, including the SO, the Gene Ontology (GO), and gene regulation experts, with the goal of curating information about how gene expression is regulated at the molecular level. Here we discuss recent updates to the SO reflecting current knowledge.

Many types of data from genomic analyses can be represented as genomic tracks, features linked to the genomic coordinates of a reference genome. Examples of such data are epigenetic DNA methylation data, ChIP-seq peaks, germline or somatic DNA variants, as well as RNA-seq expression levels. Researchers often face difficulties in locating, accessing and combining relevant tracks from external sources, as well as locating the raw data, reducing the value of the generated information.
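
As a rough illustration of the track concept described above, the following minimal sketch models a track as a list of features tied to reference coordinates and combines two tracks by coordinate overlap. The `Feature` class, the half-open 0-based coordinate convention, and all sample values are assumptions made for this example; they are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One track entry: a labelled interval on a reference sequence."""
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive
    label: str


def overlapping(track, chrom, start, end):
    """Return features on `chrom` that overlap the half-open interval [start, end)."""
    return [f for f in track if f.chrom == chrom and f.start < end and start < f.end]


def combine(variants, peaks):
    """Map each variant label to the labels of the peaks it falls within."""
    return {
        v.label: [p.label for p in overlapping(peaks, v.chrom, v.start, v.end)]
        for v in variants
    }


# Hypothetical ChIP-seq peak track and variant track on the same reference.
chip_peaks = [
    Feature("chr1", 100, 200, "peak_a"),
    Feature("chr1", 500, 650, "peak_b"),
    Feature("chr2", 100, 200, "peak_c"),
]
variants = [
    Feature("chr1", 150, 151, "var_1"),
    Feature("chr1", 700, 701, "var_2"),
]

print(combine(variants, chip_peaks))
```

The linear scan keeps the sketch short; real track collections would normally rely on an indexed format or an interval tree.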

The Functional Annotation of ANimal Genomes (FAANG) project is a worldwide coordinated action creating high-quality functional annotation of farmed and companion animal genomes. The generation of a rich genome-to-phenome resource and supporting informatic infrastructure advances the scope of comparative genomics and furthers the understanding of functional elements. The project also provides the terrestrial and aquatic animal agriculture community with powerful resources for supporting improvements to farmed animal production, disease resistance, and genetic diversity.

The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs.

The Ensembl project (https://www.ensembl.org) annotates genomes and disseminates genomic data for vertebrate species.

Article Synopsis
  • Researchers aimed to validate CD8 T-cell gene expression patterns related to clinical outcomes in children with inflammatory bowel disease (IBD) based on previous adult studies.
  • They analyzed blood samples from 112 children newly diagnosed with IBD and 19 controls, examining gene expression and DNA methylation profiles.
  • The study found no correlation between CD8 T-cell signatures and disease outcomes in pediatric or additional adult cohorts, highlighting the difficulty in establishing reliable prognostic biomarkers for IBD.
Perspectives on ENCODE.

Nature

July 2020

The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression.

Our understanding of the human genome has continuously expanded since its draft publication in 2001. Over the years, novel assays have allowed us to progressively overlay layers of knowledge above the raw sequence of A's, T's, G's, and C's. The reference human genome sequence is now a complex knowledge base maintained under the shared stewardship of multiple specialist communities.

Motivation: Genome-wide association studies (GWAS) are a powerful method to detect even weak associations between variants and phenotypes; however, many of the identified associated variants are in non-coding regions, and presumably influence gene expression regulation. Identifying potential drug targets, i.e.

Ensembl (https://www.ensembl.org) is a system for generating and distributing genome annotation such as genes, variation, regulation and comparative genomics across the vertebrate subphylum and key model organisms.

Motivation: Compared to traditional haploid reference genomes, graph genomes are an efficient and compact data structure for storing multiple genomic sequences, for storing polymorphisms or for mapping sequencing reads with greater sensitivity. Further, graphs are well-studied computer science objects that can be efficiently analyzed. However, their adoption in genomic research is slow, in part because of the cognitive difficulty in interpreting graphs.
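
To make the idea of a graph genome concrete, here is a minimal sketch, not the article's method, of a sequence graph in which two sequences that differ at one position share their common flanks. The node identifiers, sequences, and the `spell_paths` helper are hypothetical and chosen only for illustration.

```python
# Each node holds a stretch of sequence; edges list the allowed successors.
# Nodes 2 and 3 are alternative alleles at a single polymorphic site.
nodes = {1: "ACG", 2: "T", 3: "C", 4: "GGA"}
edges = {1: [2, 3], 2: [4], 3: [4], 4: []}


def spell_paths(node, nodes, edges):
    """Return every sequence spelled by a path from `node` to a sink node."""
    seq = nodes[node]
    if not edges[node]:
        return [seq]
    return [
        seq + tail
        for successor in edges[node]
        for tail in spell_paths(successor, nodes, edges)
    ]


haplotypes = spell_paths(1, nodes, edges)
stored_bases = sum(len(s) for s in nodes.values())
linear_bases = sum(len(h) for h in haplotypes)

print(haplotypes)                  # sequences encoded by the graph
print(stored_bases, linear_bases)  # bases stored in the graph vs. as separate sequences
```

The graph stores the shared flanks once, which is where the compactness comes from; the recursive enumeration is only practical for toy graphs, since the number of paths grows quickly with the number of variant sites.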

The Ensembl project (https://www.ensembl.org) makes key genomic data sets available to the entire scientific community without restrictions.

The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation.

The Ensembl project has been aggregating, processing, integrating and redistributing genomic datasets since the initial releases of the draft human genome, with the aim of accelerating genomics research through rapid open distribution of public data. Large amounts of raw data are thus transformed into knowledge, which is made available via a multitude of channels, in particular our browser (http://www.ensembl.

The Ensembl software resources are a stable infrastructure to store, access and manipulate genome assemblies and their functional annotations. The Ensembl 'Core' database and Application Programming Interface (API) was our first major piece of software infrastructure and remains at the centre of all of our genome resources. Since its initial design more than fifteen years ago, the number of publicly available genomic, transcriptomic and proteomic datasets has grown enormously, accelerated by continuous advances in DNA-sequencing technology.

Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes.

Long-range interactions between regulatory elements and gene promoters play key roles in transcriptional regulation. The vast majority of interactions are uncharted, constituting a major missing link in understanding genome control. Here, we use promoter capture Hi-C to identify interacting regions of 31,253 promoters in 17 human primary hematopoietic cell types.

Background: The study of genomic variation has provided key insights into the functional role of mutations. Predominantly, studies have focused on single nucleotide variants (SNV), which are relatively easy to detect and can be described with rich mathematical models. However, it has been observed that genomes are highly plastic, and that whole regions can be moved, removed or duplicated in bulk.
