Enhancers are short non-coding DNA sequences outside target promoter regions that can be bound by specific proteins to increase a gene's transcriptional activity, and they play a crucial role in the spatiotemporal and quantitative regulation of gene expression. However, enhancers do not have specific sequence motifs or structures, and their scattered distribution in the genome makes the identification of enhancers from human cell lines particularly challenging. Here we present a novel, stacked multivariate fusion framework called SMFM, which enables comprehensive identification, analysis, and interpretation of enhancers from regulatory DNA sequences. Specifically, to characterize the hierarchical relationships of enhancer sequences, multi-source biological information and dynamic semantic information are fused to represent regulatory DNA enhancer sequences. Then, we implement a deep learning-based sequence network to learn the feature representation of the enhancer sequences comprehensively and to extract the implicit relationships in the dynamic semantic information. Ultimately, an ensemble machine learning classifier is trained on the refined multi-source features and dynamic implicit relations obtained from the deep learning-based sequence network. Benchmarking experiments demonstrated that SMFM significantly outperforms other existing methods on several evaluation metrics. In addition, an independent test set was used to validate the generalization performance of SMFM by comparing it to other state-of-the-art enhancer identification methods. Moreover, we performed motif analysis based on the contribution scores of different bases of enhancer sequences to the final identification results.
Furthermore, we conducted an interpretability analysis of the identified enhancer sequences based on the attention weights of EnhancerBERT, a fine-tuned BERT model that provides new insights into the gene semantic information likely to underlie the discovered enhancers. Finally, in a human placenta study with 4,562 active distal gene regulatory enhancers, SMFM successfully exposed tissue-related placental development and the differential mechanism, demonstrating the generalizability and stability of our proposed framework.
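The idea of fusing several sequence-derived feature sources into one representation can be illustrated with a minimal sketch. The abstract does not say which descriptors SMFM uses, so the choices below (k-mer frequencies for k = 1, 2, 3 plus GC content, joined by simple concatenation) are assumptions made only for illustration, and the function names `kmer_frequencies`, `gc_content`, and `fuse_features` are hypothetical. The sketch covers only the feature-fusion step, not the deep learning network or the ensemble classifier.

```python
from itertools import product


def kmer_frequencies(seq, k=3):
    """Return normalized k-mer counts in a fixed lexicographic order over ACGT."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other symbols
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def fuse_features(seq, ks=(1, 2, 3)):
    """Concatenate several illustrative descriptors into one feature vector.

    Assumption: plain concatenation stands in for the fusion step; the
    descriptors themselves are not taken from the SMFM paper.
    """
    features = []
    for k in ks:
        features.extend(kmer_frequencies(seq, k))
    features.append(gc_content(seq))
    return features


vector = fuse_features("ACGTGCGCATTA")
print(len(vector))  # 4 + 16 + 64 + 1 = 85
```

A vector of this kind could then be passed to any downstream classifier; the fixed k-mer ordering keeps every sequence's features aligned column by column.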
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9836277 | PMC |
| http://dx.doi.org/10.1371/journal.pcbi.1010779 | DOI Listing |
Nat Commun
January 2025
MRC Laboratory of Medical Sciences, London, UK.
Gene enhancers often form long-range contacts with promoters, but it remains unclear if the activity of enhancers and their chromosomal contacts are mediated by the same DNA sequences and recruited factors. Here, we study the effects of expression quantitative trait loci (eQTLs) on enhancer activity and promoter contacts in primary monocytes isolated from 34 male individuals. Using eQTL-Capture Hi-C and a Bayesian approach considering both intra- and inter-individual variation, we initially detect 19 eQTLs associated with enhancer-eGene promoter contacts, most of which also associate with enhancer accessibility and activity.
Nucleic Acids Res
January 2025
Department of Genome Sciences, University of Virginia, PO Box 800717, Charlottesville, VA 22908, USA.
Many transcription factors (TFs) have been shown to bind to super-enhancers, forming transcriptional condensates to activate transcription in various cellular systems. However, the genomic and epigenomic determinants of phase-separated transcriptional condensate formation remain poorly understood. Questions regarding which TFs tend to associate with transcriptional condensates and what factors influence their association are largely unanswered.
Cis-acting regulatory enhancer elements are valuable tools for gaining cell type-specific genetic access. Leveraging large chromatin accessibility atlases, putative enhancer sequences can be identified and deployed in adeno-associated virus (AAV) delivery platforms. However, a significant bottleneck in enhancer AAV discovery is charting their detailed expression patterns, a process that currently requires gold-standard one-by-one testing.
Bone Res
January 2025
Jiangsu Province Key Laboratory of Oral Diseases, Nanjing, Jiangsu Province, China.
Plp1-lineage Schwann cells (SCs) of peripheral nerve play a critical role in vascular remodeling and osteogenic differentiation during the early stage of bone healing, and abnormal plasticity of SCs can jeopardize bone regeneration. However, how Plp1-lineage cells respond to injury and initiate vascularized osteogenesis remains incompletely understood. Here, by employing single-cell transcriptional profiling combined with lineage-specific tracing models, we uncover that Plp1-lineage cells undergoing an injury-induced glia-to-MSCs transition contribute to osteogenesis and revascularization in the initial stage of bone injury.
Nat Commun
January 2025
Gene Regulation Laboratory, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK.
Individual enhancers are defined as short genomic regulatory elements, bound by transcription factors, and able to activate cell-specific gene expression at a distance, in an orientation-independent manner. Within mammalian genomes, enhancer-like elements may be found individually or within clusters referred to as locus control regions or super-enhancers (SEs). While these behave similarly to individual enhancers with respect to cell specificity, distribution and distance, their orientation-dependence has not been formally tested.