The vast majority of disease-associated variants identified in genome-wide association studies map to enhancers, powerful regulatory elements that orchestrate the recruitment of transcriptional complexes to the promoters of their target genes, upregulating transcription in a cell type- and timing-dependent manner. These variants have implicated thousands of enhancers in many common genetic diseases, including nearly all cancers. However, the etiology of most of these diseases remains unknown because the target genes of most enhancers have not been identified. Thus, identifying the target genes of as many enhancers as possible is crucial for learning how enhancer regulatory activity functions and contributes to disease. Using experimental results curated from scientific publications together with machine learning methods, we developed a cell type-specific score that predicts whether an enhancer targets a given gene. We computed the score genome-wide for every possible cis enhancer-gene pair and validated its predictive ability in four widely used cell lines. Using a pooled final model trained across multiple cell types, we scored all possible cis enhancer-gene regulatory links (~17 million) and added them to the publicly available PEREGRINE database (www.peregrineproj.org). These scores provide a quantitative framework for enhancer-gene regulatory prediction that can be incorporated into downstream statistical analyses.
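As a rough illustration of what scoring "every possible cis enhancer-gene pair" involves, the sketch below enumerates candidate pairs and attaches a score to each. It rests on several assumptions that do not come from the abstract: "cis" is taken to mean same chromosome within a hypothetical distance window (`max_distance`), the record types `Enhancer` and `Gene` are invented for illustration, and `placeholder_score` is a toy distance-decay function standing in for the trained model, not the published score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Enhancer:
    # Hypothetical record: identifier, chromosome, midpoint coordinate.
    enhancer_id: str
    chrom: str
    midpoint: int


@dataclass(frozen=True)
class Gene:
    # Hypothetical record: identifier, chromosome, transcription start site.
    gene_id: str
    chrom: str
    tss: int


def cis_pairs(
    enhancers: Iterable[Enhancer],
    genes: Iterable[Gene],
    max_distance: int,
) -> Iterator[Tuple[Enhancer, Gene]]:
    """Yield enhancer-gene pairs on the same chromosome within max_distance.

    The distance window is an assumption made for this sketch.
    """
    genes_by_chrom: Dict[str, List[Gene]] = {}
    for gene in genes:
        genes_by_chrom.setdefault(gene.chrom, []).append(gene)
    for enhancer in enhancers:
        for gene in genes_by_chrom.get(enhancer.chrom, []):
            if abs(enhancer.midpoint - gene.tss) <= max_distance:
                yield enhancer, gene


def placeholder_score(enhancer: Enhancer, gene: Gene) -> float:
    # Toy distance-decay score; a stand-in for the trained model only.
    return 1.0 / (1.0 + abs(enhancer.midpoint - gene.tss) / 100_000)


def score_pairs(
    enhancers: Iterable[Enhancer],
    genes: Iterable[Gene],
    max_distance: int,
    score_fn: Callable[[Enhancer, Gene], float],
) -> Dict[Tuple[str, str], float]:
    """Map (enhancer_id, gene_id) to a score for every candidate cis pair."""
    return {
        (e.enhancer_id, g.gene_id): score_fn(e, g)
        for e, g in cis_pairs(enhancers, genes, max_distance)
    }


# Made-up coordinates, for illustration only.
enhancers = [Enhancer("e1", "chr1", 1_000), Enhancer("e2", "chr2", 5_000)]
genes = [
    Gene("gA", "chr1", 101_000),
    Gene("gB", "chr1", 5_000_000),
    Gene("gC", "chr2", 5_000),
]
scores = score_pairs(enhancers, genes, 1_000_000, placeholder_score)
```

A table of this shape, one row per enhancer-gene pair with a numeric score, is the kind of object that could be joined to variant annotations in a downstream statistical analysis.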
Source | Link |
---|---|
PMC | http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10070356 |
DOI | http://dx.doi.org/10.1038/s41540-023-00270-z |