Characterizing transcription factor (TF) genomic colocalization is essential for identifying cooperative binding of TFs in controlling gene expression. Here, we introduce a protocol for using PAD2, an interactive web application that enables the investigation of colocalization of various TFs and chromatin-regulating proteins from mouse embryonic stem cells at various functional genomic regions. We describe steps for accessing and searching the PAD2 database and selecting and submitting genomic regions.
Lysine-specific demethylase 1 (LSD1) is well-known for its role in decommissioning enhancers during mouse embryonic stem cell (ESC) differentiation. Its role in gene promoters remains poorly understood despite its widespread presence at these sites. Here, we report that LSD1 promotes RNA polymerase II (RNAPII) pausing, a rate-limiting step in transcription regulation, in ESCs.
Histone modifications impact final splicing decisions. However, there is little evidence of the driving role of these marks in inducing cell-specific splicing changes. Using CRISPR epigenome editing tools, we show in an epithelial-to-mesenchymal cell reprogramming system (epithelial-to-mesenchymal transition [EMT]) that a single change in H3K27ac or H3K27me3 levels right at the alternatively spliced exon is necessary and sufficient to induce a splicing change capable of recapitulating important aspects of EMT, such as cell motility and invasiveness.
Gene silencing by heterochromatin plays a crucial role in cell identity. Here, we characterize the localization, the biogenesis, and the function of an atypical heterochromatin, which is simultaneously enriched in the typical H3K9me3 mark and in H3K36me3, a histone mark usually associated with gene expression. We identified thousands of dual regions in mouse embryonic stem (ES) cells that rely on the histone methyltransferases SET domain bifurcated 1 (SETDB1) and nuclear set domain (NSD)-containing proteins to generate H3K9me3 and H3K36me3, respectively.
Accurate quantification and detection of intron retention levels require specialized software. Building on our previous software, we create a suite of tools called IRFinder-S, to analyze and explore intron retention events in multiple samples. Specifically, IRFinder-S allows a better identification of true intron retention events using a convolutional neural network, allows the sharing of intron retention results between labs, integrates a dynamic database to explore and contrast available samples, and provides a tested method to detect differential levels of intron retention.
Bivalent chromatin is characterized by the simultaneous presence of H3K4me3 and H3K27me3, histone modifications generally associated with transcriptionally active and repressed chromatin, respectively. Prevalent in embryonic stem cells (ESCs), bivalency is postulated to poise/prime lineage-controlling developmental genes for rapid activation during embryogenesis while maintaining a transcriptionally repressed state in the absence of activation cues; however, this hypothesis remains to be directly tested. Most gene promoters DNA hypermethylated in adult human cancers are bivalently marked in ESCs, and it was speculated that bivalency predisposes them for aberrant de novo DNA methylation and irreversible silencing in cancer, but evidence supporting this model is largely lacking.
Motivation: Long-read sequencing technologies are invaluable for determining complex RNA transcript architectures but are error-prone. Numerous 'hybrid correction' algorithms have been developed for genomic data that correct long reads by exploiting the accuracy and depth of short reads sequenced from the same sample. These algorithms are not suited for correcting more complex transcriptome sequencing data.
The developmental potential of cells, termed pluripotency, is highly dynamic and progresses through a continuum of naive, formative and primed states. Pluripotency progression of mouse embryonic stem cells (ESCs) from naive to formative and primed state is governed by transcription factors (TFs) and their target genes. Genomic techniques have uncovered a multitude of TF binding sites in ESCs, yet a major challenge lies in identifying target genes from functional binding sites and reconstructing dynamic transcriptional networks underlying pluripotency progression.
Faithful transcription initiation is critical for accurate gene expression, yet the mechanisms underlying specific transcription start site (TSS) selection in mammals remain unclear. Here, we show that the histone-fold domain protein NF-Y, a ubiquitously expressed transcription factor, controls the fidelity of transcription initiation at gene promoters in mouse embryonic stem cells. We report that NF-Y maintains the region upstream of TSSs in a nucleosome-depleted state while simultaneously protecting this accessible region against aberrant and/or ectopic transcription initiation.
Pluripotency is highly dynamic and progresses through a continuum of pluripotent stem cell states. The two states that bookend the pluripotency continuum, naive and primed, are well characterized, but our understanding of the intermediate states and transitions between them remains incomplete. Here, we dissect the dynamics of pluripotent state transitions underlying pre- to post-implantation epiblast differentiation.
Eukaryotic gene transcription is regulated at many steps, including RNA polymerase II (Pol II) recruitment, transcription initiation, promoter-proximal Pol II pause release, and transcription termination; however, mechanisms regulating transcription during productive elongation remain poorly understood. Enhancers, which activate gene transcription, themselves undergo Pol II-mediated transcription, but our understanding of enhancer transcription and enhancer RNAs (eRNAs) remains incomplete. Here we show that transcription at intragenic enhancers interferes with and attenuates host gene transcription during productive elongation.
Cell type-specific master transcription factors (TFs) play vital roles in defining cell identity and function. However, the roles ubiquitous factors play in the specification of cell identity remain underappreciated. Here we show that the ubiquitous CCAAT-binding NF-Y complex is required for the maintenance of embryonic stem cell (ESC) identity and is an essential component of the core pluripotency network.
Identification of genes associated with specific biological phenotypes is a fundamental step toward understanding the molecular basis underlying development and pathogenesis. Although RNAi-based high-throughput screens are routinely used for this task, false discovery and sensitivity remain a challenge. Here we describe a computational framework for systematic integration of published gene expression data to identify genes defining a phenotype of interest.