Enhancers are short non-coding DNA sequences outside of the target promoter regions that can be bound by specific proteins to increase a gene's transcriptional activity, and they play a crucial role in the spatiotemporal and quantitative regulation of gene expression. However, enhancers do not have specific sequence motifs or structures, and their scattered distribution in the genome makes their identification in human cell lines particularly challenging. Here we present a novel stacked multivariate fusion framework called SMFM, which enables comprehensive identification, analysis, and interpretation of enhancers from regulatory DNA sequences. Specifically, to characterize the hierarchical relationships of enhancer sequences, multi-source biological information and dynamic semantic information are fused to represent regulatory DNA enhancer sequences. We then implement a deep learning-based sequence network to learn the feature representation of the enhancer sequences comprehensively and to extract the implicit relationships in the dynamic semantic information. Finally, an ensemble machine learning classifier is trained on the refined multi-source features and dynamic implicit relations obtained from the deep learning-based sequence network. Benchmarking experiments demonstrated that SMFM significantly outperforms other existing methods across several evaluation metrics. In addition, an independent test set was used to validate the generalization performance of SMFM by comparing it to other state-of-the-art enhancer identification methods. Moreover, we performed motif analysis based on the contribution scores of different bases of enhancer sequences to the final identification results. We also conducted interpretability analysis of the identified enhancer sequences based on the attention weights of EnhancerBERT, a fine-tuned BERT model that provides new insights into the gene semantic information likely to underlie the discovered enhancers in an interpretable manner. Finally, in a human placenta study with 4,562 active distal gene regulatory enhancers, SMFM successfully revealed tissue-related placental development and the differential mechanism, demonstrating the generalizability and stability of our proposed framework.
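The abstract does not list which descriptors SMFM fuses, so the following is only a minimal sketch of the general idea of feature fusion for DNA sequences, not the authors' method. It assumes k-mer frequency spectra as the descriptors and plain concatenation as the fusion step; the function names `kmer_frequencies` and `fused_features` are hypothetical.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_frequencies(seq, k):
    """Return normalized counts of every possible k-mer, in lexicographic order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other symbols
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def fused_features(seq, ks=(1, 2, 3)):
    """Concatenate k-mer spectra for several k into one feature vector.

    Assumption: concatenation stands in for the fusion step; the real
    framework may combine its descriptors differently.
    """
    vector = []
    for k in ks:
        vector.extend(kmer_frequencies(seq, k))
    return vector


features = fused_features("ACGTACGTAC")
print(len(features))  # 4 + 16 + 64 = 84
```

A vector like this could be passed to any downstream classifier; in a stacked setup, the outputs of several base models trained on such features would in turn become the inputs of a final model.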


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9836277
DOI: http://dx.doi.org/10.1371/journal.pcbi.1010779

Publication Analysis

Top Keywords

enhancer sequences: 20
stacked multivariate: 8
multivariate fusion: 8
fusion framework: 8
dna sequences: 8
regulatory dna: 8
dynamic semantic: 8
deep learning-based: 8
learning-based sequence: 8
sequence network: 8

Similar Publications

Gene enhancers often form long-range contacts with promoters, but it remains unclear if the activity of enhancers and their chromosomal contacts are mediated by the same DNA sequences and recruited factors. Here, we study the effects of expression quantitative trait loci (eQTLs) on enhancer activity and promoter contacts in primary monocytes isolated from 34 male individuals. Using eQTL-Capture Hi-C and a Bayesian approach considering both intra- and inter-individual variation, we initially detect 19 eQTLs associated with enhancer-eGene promoter contacts, most of which also associate with enhancer accessibility and activity.


Many transcription factors (TFs) have been shown to bind to super-enhancers, forming transcriptional condensates to activate transcription in various cellular systems. However, the genomic and epigenomic determinants of phase-separated transcriptional condensate formation remain poorly understood. Questions regarding which TFs tend to associate with transcriptional condensates and what factors influence their association are largely unanswered.


Cis-acting regulatory enhancer elements are valuable tools for gaining cell type-specific genetic access. Leveraging large chromatin accessibility atlases, putative enhancer sequences can be identified and deployed in adeno-associated virus (AAV) delivery platforms. However, a significant bottleneck in enhancer AAV discovery is charting their detailed expression patterns, a process that currently requires gold-standard one-by-one testing.


Enhancer-driven Shh signaling promotes glia-to-mesenchyme transition during bone repair.

Bone Res

January 2025

Jiangsu Province Key Laboratory of Oral Diseases, Nanjing, Jiangsu Province, China.

Plp1-lineage Schwann cells (SCs) of peripheral nerves play a critical role in vascular remodeling and osteogenic differentiation during the early stage of bone healing, and abnormal plasticity of SCs would jeopardize bone regeneration. However, how Plp1-lineage cells respond to injury and initiate vascularized osteogenesis remains incompletely understood. Here, by employing single-cell transcriptional profiling combined with lineage-specific tracing models, we uncover that Plp1-lineage cells undergoing an injury-induced glia-to-MSC transition contribute to osteogenesis and revascularization in the initial stage of bone injury.


The α-globin super-enhancer acts in an orientation-dependent manner.

Nat Commun

January 2025

Gene Regulation Laboratory, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK.

Individual enhancers are defined as short genomic regulatory elements, bound by transcription factors, and able to activate cell-specific gene expression at a distance, in an orientation-independent manner. Within mammalian genomes, enhancer-like elements may be found individually or within clusters referred to as locus control regions or super-enhancers (SEs). While these behave similarly to individual enhancers with respect to cell specificity, distribution and distance, their orientation-dependence has not been formally tested.

