Single-cell ATAC-seq (scATAC-seq) is a recently developed approach that provides a means to investigate open chromatin at the single-cell level and to assess epigenetic regulation and transcription factor binding landscapes. The sparsity of scATAC-seq data calls for imputation. Similarly, preprocessing (filtering) may be required to reduce the computational load caused by the large number of open regions.
We present a major update of the HOCOMOCO collection that provides DNA binding specificity patterns of 949 human transcription factors and 720 mouse orthologs. To make this release, we performed motif discovery in peak sets that originated from 14 183 ChIP-Seq experiments and reads from 2554 HT-SELEX experiments yielding more than 400 thousand candidate motifs. The candidate motifs were annotated according to their similarity to known motifs and the hierarchy of DNA-binding domains of the respective transcription factors.
Differential methylation (DM) is actively used in different types of fundamental and translational studies. Currently, microarray- and NGS-based approaches for methylation analysis are the most widely used, with multiple statistical models designed to extract differential methylation signatures. The benchmarking of DM models is challenging due to the absence of gold standard data.
Single-cell RNA-seq data contain many dropouts, which hamper downstream analyses, because of the low number and inefficient capture of mRNAs in individual cells. Here, we present Epi-Impute, a computational method for dropout imputation by reconciling expression and epigenomic data. Epi-Impute leverages single-cell ATAC-seq data as an additional source of information about gene activity to reduce the number of dropouts.
We present an update of EpiFactors, a manually curated database providing information about epigenetic regulators, their complexes, targets, and products which is openly accessible at http://epifactors.autosome.org.
Acute myeloid leukemia (AML) is a rapidly progressing heterogeneous disease with a high mortality rate, which is characterized by hyperproliferation of atypical immature myeloid cells. The number of AML patients is expected to increase in the near future, due to the old-age-associated nature of AML and increased longevity in the human population. RUNX1 and CEBPA, key transcription factors (TFs) of hematopoiesis, are frequently and independently mutated in AML.
Long non-coding RNAs (lncRNAs) play an important role in genome regulation. Specifically, many lncRNAs interact with chromatin, recruit epigenetic complexes and in this way affect large-scale gene expression programs. However, the experimental data about lncRNA-chromatin interactions is still limited.
Acute myeloid leukemia (AML) is a hematopoietic malignancy characterized by genetic and epigenetic aberrations that alter the differentiation capacity of myeloid progenitor cells. The transcription factor CEBPA is frequently mutated in AML patients, leading to an increase in DNA methylation in many genomic locations. Previously, it has been shown that ecCEBPA (extra coding CEBPA), a lncRNA transcribed in the same direction as the CEBPA gene, regulates DNA methylation of the CEBPA promoter in cis. Here, we hypothesize that ecCEBPA could participate in the regulation of DNA methylation in trans.
Emerging studies demonstrate the ability of microRNAs (miRNAs) to activate genes via different mechanisms. Specifically, miRNAs may trigger an enhancer promoting chromatin remodelling in the enhancer region, thus activating the enhancer and its target genes. Here we present MIREyA, a pipeline developed to predict such miRNA-gene-enhancer trios based on an expression dataset which obviates the need to write custom scripts.
Many human genes are transcribed from both strands and produce sense-antisense gene pairs. Sense-antisense (SAS) chimeric transcripts are produced upon the coalescing of exons/introns from both sense and antisense transcripts of the same gene. SAS chimeras were first reported in prostate cancer cells.
Immunomodulation strategies are crucial for several biomedical applications. However, the immune system is highly heterogeneous and its functional responses to infections remain elusive. Indeed, the characterization of immune response particularities to different pathogens is needed to identify immunomodulatory candidates.
Mammalian genomes encode tens of thousands of noncoding RNAs. Most noncoding transcripts exhibit nuclear localization and several have been shown to play a role in the regulation of gene expression and chromatin remodeling. To investigate the function of such RNAs, methods to massively map the genomic interacting sites of multiple transcripts have been developed; however, these methods have some limitations.
Long noncoding RNAs (lncRNAs) play a key role in many cellular processes including chromatin regulation. To modify chromatin, lncRNAs often interact with DNA in a sequence-specific manner forming RNA:DNA triple helices. Computational tools for triple helix search do not always provide genome-wide predictions of sufficient quality.
Mycobacterium tuberculosis (Mtb) can subvert the host defense by skewing macrophage activation toward a less microbicidal alternative activated state to avoid classical effector killing functions. Investigating the molecular basis of this evasion mechanism could uncover potential candidates for host directed therapy against tuberculosis (TB). A limited number of miRNAs have recently been shown to regulate host-mycobacterial interactions.
Background: DNA methylation is involved in the regulation of gene expression. Although bisulfite-sequencing based methods profile DNA methylation at a single CpG resolution, methylation levels are usually averaged over genomic regions in the downstream bioinformatic analysis.
Results: We demonstrate that on the genome level a single CpG methylation can serve as a more accurate predictor of gene expression than an average promoter / gene body methylation.
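One way such a comparison could be set up for a single promoter is sketched below: correlate expression across samples with each individual CpG and with the promoter-wide average, then compare the absolute correlations. This is only an illustration, not the authors' procedure. The methylation and expression values are synthetic toy numbers, and `pearson` and `compare_predictors` are hypothetical helper names.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant vector carries no linear signal
    return cov / (sd_x * sd_y)

# Synthetic toy data (assumption, not real measurements):
# rows are samples, columns are three CpGs in one promoter.
METHYLATION = [
    [0.10, 0.80, 0.50],
    [0.30, 0.75, 0.40],
    [0.55, 0.85, 0.60],
    [0.70, 0.78, 0.45],
    [0.90, 0.82, 0.55],
]
# Synthetic expression of the associated gene in the same samples.
EXPRESSION = [9.0, 7.5, 5.0, 3.5, 1.0]

def compare_predictors(methylation, expression):
    """Return per-CpG correlations, the averaged-promoter correlation,
    and the index of the CpG with the largest absolute correlation."""
    n_cpgs = len(methylation[0])
    per_cpg = [
        pearson([row[j] for row in methylation], expression)
        for j in range(n_cpgs)
    ]
    averaged = pearson([sum(row) / n_cpgs for row in methylation], expression)
    best = max(range(n_cpgs), key=lambda j: abs(per_cpg[j]))
    return per_cpg, averaged, best

per_cpg, averaged, best = compare_predictors(METHYLATION, EXPRESSION)
print("per-CpG r:", [round(r, 3) for r in per_cpg])
print("averaged r:", round(averaged, 3), "best CpG index:", best)
```

In this toy setup the first CpG tracks expression more closely than the promoter average, because the other two CpGs dilute the signal when averaged. Whether that holds for real data is an empirical question.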
The presence of H3K27me3 has been demonstrated to correlate with the CpG content. In this work, we tested whether H3K27ac has similar sequence preferences. We performed a translocation of DNA sequences with various properties into a beta-globin locus to control for the local chromatin environment.
The genomes of mammalian species are pervasively transcribed producing as many noncoding as protein-coding RNAs. There is a growing body of evidence supporting their functional role. Long noncoding RNA (lncRNA) can bind both nucleic acids and proteins through several mechanisms.
Identifying transcription factor (TF) binding sites (TFBSs) is important in the computational inference of gene regulation. Widely used computational methods of TFBS prediction based on position weight matrices (PWMs) usually have high false positive rates. Moreover, computational studies of transcription regulation in eukaryotes frequently require numerous PWM models of TFBSs due to a large number of TFs involved.
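For readers unfamiliar with PWM-based prediction, a minimal sketch of the general technique follows: a count matrix is converted into log-odds weights against a uniform background, and every window of a sequence is scored against a threshold. This is a generic illustration, not the method of the work summarized above. The count matrix, the pseudocount, the thresholds, and the function names `counts_to_pwm` and `scan` are all assumptions made for the example.

```python
import math

# Hypothetical count matrix for a 4-bp motif resembling "ACGT"
# (one dict of base counts per motif position).
COUNTS = [
    {"A": 9, "C": 1, "G": 0, "T": 0},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 0, "C": 0, "G": 9, "T": 1},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]

def counts_to_pwm(counts, background=0.25, pseudocount=1.0):
    """Convert base counts into log2-odds weights against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Score every window on the forward strand; return (start, window, score) hits."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

PWM = counts_to_pwm(COUNTS)
strict = scan("TTACGTTT", PWM, threshold=0.0)
relaxed = scan("TTACGTTT", PWM, threshold=-1.0)
print("strict hits:", [(s, w) for s, w, _ in strict])
print("relaxed hits:", [(s, w) for s, w, _ in relaxed])
```

Lowering the threshold admits an additional, weaker window in this toy sequence, which illustrates how the choice of threshold trades sensitivity against false positives.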
Background: The association of exposure to endocrine disrupting chemicals in the peripubertal period with subsequent sperm DNA methylation is unknown.
Objective: We examined the association of peripubertal serum 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) concentrations with whole-genome bisulfite sequencing (WGBS) of sperm collected in young adulthood.
Methods: The Russian Children's Study is a prospective cohort of 516 boys who were enrolled at 8-9 years of age and provided semen samples at 18-19 years of age.
Many long noncoding RNAs are bound to the chromatin and some of these interactions are mediated by triple helices. It is usually assumed that a transcript can form triplexes with a distinct set of genomic loci also known as triplex target sites (TTSs). Here we performed computational analyses of the TTSs that have been experimentally identified for particular RNAs.
We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database.
The three-spined stickleback (Gasterosteus aculeatus) represents a convenient model to study microevolution-adaptation to a freshwater environment. Although genetic adaptations to freshwater environments are well-studied, epigenetic adaptations have attracted little attention. In this work, we investigated the role of DNA methylation in the adaptation of the marine stickleback population to freshwater conditions.
MicroRNAs (miRNAs) are short non-coding RNAs with key roles in cellular regulation. As part of the fifth edition of the Functional Annotation of Mammalian Genome (FANTOM5) project, we created an integrated expression atlas of miRNAs and their promoters by deep-sequencing 492 short RNA (sRNA) libraries, with matching Cap Analysis Gene Expression (CAGE) data, from 396 human and 47 mouse RNA samples. Promoters were identified for 1,357 human and 804 mouse miRNAs and showed strong sequence conservation between species.