Discovering DNA regulatory sequence motifs and their relative positions is vital to understanding the mechanisms of gene expression regulation. Although deep convolutional neural networks (CNNs) have achieved great success in predicting cis-regulatory elements, the discovery of motifs and their combinatorial patterns from these CNN models has remained difficult. We show that the main difficulty is due to the problem of multifaceted neurons which respond to multiple types of sequence patterns. Since existing interpretation methods were mainly designed to visualize the class of sequences that can activate the neuron, the resulting visualization will correspond to a mixture of patterns. Such a mixture is usually difficult to interpret without resolving the mixed patterns. We propose the NeuronMotif algorithm to interpret such neurons. Given any convolutional neuron (CN) in the network, NeuronMotif first generates a large sample of sequences capable of activating the CN, which typically consists of a mixture of patterns. Then, the sequences are "demixed" in a layer-wise manner by backward clustering of the feature maps of the involved convolutional layers. NeuronMotif can output the sequence motifs, and the syntax rules governing their combinations are depicted by position weight matrices organized in tree structures. Compared to existing methods, the motifs found by NeuronMotif have more matches to known motifs in the JASPAR database. The higher-order patterns uncovered for deep CNs are supported by the literature and ATAC-seq footprinting. Overall, NeuronMotif enables the deciphering of cis-regulatory codes from deep CNs and enhances the utility of CNN in genome interpretation.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10104575 | PMC |
| http://dx.doi.org/10.1073/pnas.2216698120 | DOI Listing |
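To make the demixing idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the NeuronMotif implementation: the published method clusters the feature maps of convolutional layers backward through the network, whereas this toy clusters the raw sequences directly with k-modes on Hamming distance. The function names (`demix`, `pwm`, `consensus`) and the six example sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter

ALPHABET = "ACGT"

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def pwm(seqs, pseudocount=1.0):
    """Per-position base probabilities (each row follows ALPHABET order)."""
    matrix = []
    for column in zip(*seqs):
        counts = Counter(column)
        total = len(column) + pseudocount * len(ALPHABET)
        matrix.append([(counts[b] + pseudocount) / total for b in ALPHABET])
    return matrix

def consensus(matrix):
    """Most probable base at each position."""
    return "".join(ALPHABET[row.index(max(row))] for row in matrix)

def demix(seqs, k=2, n_iter=10):
    """Toy demixing by k-modes clustering on Hamming distance.

    Assumption: a stand-in for NeuronMotif's backward clustering of
    feature maps, which is not reproduced here.
    """
    # Deterministic farthest-first choice of k initial cluster modes.
    modes = [seqs[0]]
    while len(modes) < k:
        modes.append(max(seqs, key=lambda s: min(hamming(s, m) for m in modes)))
    labels = []
    for _ in range(n_iter):
        # Assign each sequence to its nearest mode, then update the modes.
        labels = [min(range(k), key=lambda j: hamming(s, modes[j])) for s in seqs]
        for j in range(k):
            members = [s for s, lab in zip(seqs, labels) if lab == j]
            if members:
                modes[j] = consensus(pwm(members))
    return labels

# Hypothetical sequences that all "activate" one neuron: a mix of two patterns.
seqs = ["TGACTCA", "TGAGTCA", "TGACTCA", "GGGCGGG", "GGGCGGG", "GGGTGGG"]
labels = demix(seqs, k=2)
motifs = [
    consensus(pwm([s for s, lab in zip(seqs, labels) if lab == j]))
    for j in range(2)
]
print(labels)  # [0, 0, 0, 1, 1, 1]
print(motifs)  # ['TGACTCA', 'GGGCGGG']
```

Building one matrix from all six sequences would blend the two patterns at every position. Splitting the sample first gives one matrix per pattern, which is the intuition behind resolving a multifaceted neuron.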
Recent studies have demonstrated that the production of bidirectional enhancer-derived transcripts (eRNAs) is a characteristic of an active cis-regulatory element (CRE). Higher levels of eRNA synthesis correlate with the activation of histone modifications, a potentially valuable tool for deciphering the complexity of the gene regulatory network. To understand the changes of CREs during gonadal development in mice, we collected gonadal WT1-positive cells from the piggyBac-Wt1-mCherry-2A-EGFP (PBWt1-RG) reporter strain at E13.
Nat Genet
January 2025
Calico Life Sciences LLC, South San Francisco, CA, USA.
Sequence-based machine-learning models trained on genomics data improve genetic variant interpretation by providing functional predictions describing their impact on the cis-regulatory code. However, current tools do not predict RNA-seq expression profiles because of modeling challenges. Here, we introduce Borzoi, a model that learns to predict cell-type-specific and tissue-specific RNA-seq coverage from DNA sequence.
Recent massively-parallel approaches to decipher gene regulatory circuits have focused on the discovery of either cis-regulatory elements (CREs) or trans-acting factors. Here, we develop a scalable approach that pairs cis- and trans-regulatory CRISPR screens to systematically dissect how the key immune checkpoint is regulated. In human pancreatic ductal adenocarcinoma (PDAC) cells, we tile the locus using ∼25,000 CRISPR perturbations in constitutive and IFNγ-stimulated conditions.
Cancer Discov
November 2024
Institute of Systems Biomedicine, Beijing Key Laboratory of Tumor Systems Biology, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China.
Significant efforts have been made to identify and validate oncoproteins and ncRNAs as therapeutic targets for cancer therapy; however, emerging observations suggest that noncoding cis-regulatory elements, which orchestrate the 3D organization of the genome and thus the transcriptional landscape, are potential therapeutic targets as well. In this commentary, we envisage that further efforts to decipher the noncoding cis-regulatory code and to perform systematic surveys of functional noncoding cis-regulatory elements and recurrent 3D genome alterations in both cancerous and nonmalignant cells within tumor tissues will pave the way to the development of novel therapeutic strategies.
Nucleic Acids Res
November 2024
Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Buxin Road NO. 97, Dapeng District, Shenzhen 518124, China.
Enhancers play a critical role in dynamically regulating spatial-temporal gene expression and establishing cell identity, underscoring the significance of designing them with specific properties for applications in biosynthetic engineering and gene therapy. Despite numerous high-throughput methods facilitating genome-wide enhancer identification, deciphering the sequence determinants of their activity remains challenging. Here, we present the DREAM (DNA cis-Regulatory Elements with controllable Activity design platforM) framework, a novel deep learning-based approach for synthetic enhancer design.