The ENCODE Consortium's efforts to annotate noncoding cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes. Pooled, noncoding CRISPR screens offer a systematic approach to investigate cis-regulatory mechanisms. The ENCODE4 Functional Characterization Centers conducted 108 screens in human cell lines, comprising >540,000 perturbations across 24.
The Encyclopedia of DNA elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19000 functional genomics experiments across more than 1000 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community.
The majority of mammalian genes encode multiple transcript isoforms that result from differential promoter use, changes in exonic splicing, and alternative 3' end choice. Detecting and quantifying transcript isoforms across tissues, cell types, and species has been extremely challenging because transcripts are much longer than the short reads normally used for RNA-seq. By contrast, long-read RNA-seq (LR-RNA-seq) gives the complete structure of most transcripts.
The Encyclopedia of DNA Elements (ENCODE) web portal hosts genomic data generated by the ENCODE Consortium, Genomics of Gene Regulation, The NIH Roadmap Epigenomics Consortium, and the modENCODE and modERN projects. The goal of the ENCODE project is to build a comprehensive map of the functional elements of the human and mouse genomes. Currently, the portal database stores over 500 TB of raw and processed data from over 15,000 experiments spanning assays that measure gene expression, DNA accessibility, DNA and RNA binding, DNA methylation, and 3D chromatin structure across numerous cell lines, tissue types, and differentiation states with selected genetic and molecular perturbations.
Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. elegans available today.
The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center has developed the ENCODE Portal database and website as the source for the data and metadata generated by the ENCODE Consortium. Two principles have motivated the design. First, experimental protocols, analytical procedures and the data themselves should be made publicly accessible through a coherent, web-based search and download interface.
Investigations aimed at defining the 3D configuration of eukaryotic chromosomes have consistently encountered an endogenous population of chromosome-derived circular genomic DNA, referred to as extrachromosomal circular DNA (eccDNA). While the production, distribution, and activities of eccDNAs remain understudied, eccDNA formation from specific regions of the linear genome has profound consequences on the regulatory and coding capabilities for these regions. Here, we define eccDNA distributions in Caenorhabditis elegans and in three human cell types, utilizing a set of DNA topology-dependent approaches for enrichment and characterization.
The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M.
Background: Identification of locus-locus contacts at the chromatin level provides a valuable foundation for understanding of nuclear architecture and function and a valuable tool for inferring long-range linkage relationships. As one approach to this, chromatin conformation capture-based techniques allow creation of genome spatial organization maps. While such approaches have been available for some time, methodological advances will be of considerable use in minimizing both time and input material required for successful application.
The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects.
The Encyclopedia of DNA Elements (ENCODE) Project is in its third phase of creating a comprehensive catalog of functional elements in the human genome. This phase of the project includes an expansion of assays that measure diverse RNA populations, identify proteins that interact with RNA and DNA, probe regions of DNA hypersensitivity, and measure levels of DNA methylation in a wide range of cell and tissue types to identify putative regulatory elements. To date, results for almost 5000 experiments have been released for use by the scientific community.
In certain organisms, numbers of crossover events for any single chromosome are limited ("crossover interference") so that double crossover events are obtained at much lower frequencies than would be expected from the simple product of independent single-crossover events. We present a number of observations during which we examined interference over a large region of Caenorhabditis elegans chromosome V. Examining this region for multiple crossover events in heteroallelic configurations with limited dimorphism, we observed high levels of crossover interference in oocytes with only partial interference in spermatocytes.
The secondary structure of RNAs can be represented by graphs at various resolutions. While it has been shown that RNA secondary structures can be represented by coarse-grained tree graphs and that meaningful topological indices can be used to distinguish between various structures, small RNAs need to be represented by full graphs. No meaningful topological index has yet been suggested for the analysis of this type of RNA graph.
Nucleic Acids Res
July 2011
RNA mutational analysis at the secondary-structure level can be useful for a wide range of biological applications. It can be used to predict an optimal site for performing a nucleotide mutation at the single-molecule level, as well as to analyze basic phenomena at the systems level. For the former, as more sequence modification experiments are performed that include site-directed mutagenesis to find and explore functional motifs in RNAs, a pre-processing step that helps guide the planning of the experiment becomes vital.
Summary: The DNA in eukaryotic cells is packed into the chromatin that is composed of nucleosomes. Positioning of the nucleosome core particles on the sequence is a problem of great interest because of the role nucleosomes play in different cellular processes including gene regulation. Using the sequence structure of 10.
Energy minimization methods for RNA secondary structure prediction have been used extensively for studying a variety of biological systems. Here, we demonstrate their applicability in riboswitch studies, exemplified in both the expression platform and aptamer domains. In the expression platform domain, energy minimization methods can be used to predict in silico a unique point mutation positioned in the non-conserved region of the TPP riboswitch that will transform it from a termination to an anti-termination state, thus backing the prediction experimentally.
Heat shock proteins (HSPs) provide a useful system for studying developmental patterns in the digenetic Leishmania parasites, since their expression is induced in the mammalian life form. Translation regulation plays a key role in control of protein coding genes in trypanosomatids, and is directed exclusively by elements in the 3' untranslated region (UTR). Using sequential deletions of the Leishmania Hsp83 3' UTR (888 nucleotides [nt]), we mapped a region of 150 nt that was required, but not sufficient for preferential translation of a reporter gene at mammalian-like temperatures, suggesting that changes in RNA structure could be involved.
Three-way junctions in folded RNAs have been investigated both experimentally and computationally. The interest in their analysis stems from the fact that they have significantly been found to possess a functional role. In recent work, three-way junctions have been categorized into families depending on the relative lengths of the segments linking the three helices.
The discovery of natural RNA sensors that respond to a change in the environment by a conformational switch can be utilized for various biotechnological and nanobiotechnological advances. One class of RNA sensors is the riboswitch: an RNA genetic control element that is capable of sensing small molecules, responding to a deviation in ligand concentration with a structural change. Riboswitches are modularly built from smaller components.
Evolution of the triplet code is reconstructed on the basis of consensus temporal order of appearance of amino acids. Several important predictions are confirmed by computational sequence analyses. The earliest amino acids, alanine and glycine, have been encoded by GCC and GGC codons, as today.
From recent developments of the early evolution theory it follows that the earliest mRNAs were short (approximately 20 nt) (G+C)-rich polynucleotides. These short sequences could form hairpins, which would be of high evolutionary advantage because of stability and uniqueness of their conformations. Due to mutations accumulated during billions of years of evolution, the speculated earliest hairpins would largely lose the initial complementarities.