Background: Transcription factors are key proteins in the regulation of gene transcription. An important step in this process is the opening of chromatin to make genomic regions available for transcription. Data on DNase I hypersensitivity have previously been used to label a subset of transcription factors as Pioneers, Settlers and Migrants to describe their potential role in this process. Because these labels represent an interesting hypothesis about gene regulation and a potentially useful approach to data analysis, we set out to expand the set of labeled transcription factors to include as many known factors as possible. We used a well-annotated dataset of 1175 transcription factors as input to supervised machine learning methods, with the subset carrying previously assigned labels as the training set. We then used the final classifier to label the remaining transcription factors according to their potential role as Pioneers, Settlers and Migrants. The full set of labeled transcription factors was used to investigate the properties and functions associated with each class, including an analysis of interaction data for transcription factors based on DNA co-binding and protein-protein interactions. We also used the assigned labels to analyze a previously published set of gene lists from a time course experiment on cell differentiation.
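The labeling workflow described above (train on the factors that already carry a label, then predict a label for every other factor) can be sketched as follows. This is only a minimal illustration: the abstract does not name the classifier or the features used, so the nearest-centroid rule, the two-dimensional feature vectors, and the `TF_A` to `TF_F` names are all invented placeholders.

```python
from collections import defaultdict
from math import dist


def fit_centroids(features, labels):
    """Average the feature vectors of each labeled class."""
    grouped = defaultdict(list)
    for name, label in labels.items():
        grouped[label].append(features[name])
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, vector):
    """Return the class whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


def label_all(features, labels):
    """Keep the existing labels and predict one for every unlabeled factor."""
    centroids = fit_centroids(features, labels)
    expanded = dict(labels)
    for name, vector in features.items():
        if name not in expanded:
            expanded[name] = predict(centroids, vector)
    return expanded


# Invented toy data: hypothetical factors with made-up annotation scores.
features = {
    "TF_A": (0.9, 0.1),
    "TF_B": (0.8, 0.2),
    "TF_C": (0.5, 0.5),
    "TF_D": (0.1, 0.9),
    "TF_E": (0.85, 0.15),  # no previous label
    "TF_F": (0.2, 0.8),    # no previous label
}
labels = {
    "TF_A": "Pioneer",
    "TF_B": "Pioneer",
    "TF_C": "Settler",
    "TF_D": "Migrant",
}

expanded = label_all(features, labels)
```

In practice the features would come from the annotated dataset, and the classifier would be selected and validated on the labeled subset before being applied to the rest.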
Results: The analysis showed that the classification of transcription factors with respect to their potential role in chromatin opening was largely determined by how they bind to DNA. Each subclass of transcription factors was enriched for properties that seemed to characterize the subclass relative to its role in gene regulation: Pioneers were associated with very general functions, whereas Migrants were to a larger extent associated with specific processes. Further analysis showed that the expanded classification is a useful resource for analyzing other datasets on transcription factors with respect to their potential role in gene regulation. The analysis of transcription factor interaction data showed complementary differences between the subclasses: transcription factors labeled as Pioneers often interact with other transcription factors through DNA co-binding, whereas Migrants rely to a larger extent on protein-protein interactions. The analysis of time course data on cell differentiation indicated a shift in the regulatory program associated with Pioneer-like transcription factors during differentiation.
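One common way to check whether a class is enriched for a property is a one-sided hypergeometric test. The abstract does not state which test was used, so the sketch below is an assumption for illustration only, and the counts in the usage lines are invented.

```python
from math import comb


def hypergeom_enrichment_p(total, annotated, class_size, class_annotated):
    """Probability of seeing at least `class_annotated` annotated factors
    when `class_size` factors are drawn at random from `total` factors,
    of which `annotated` carry the property."""
    upper = min(annotated, class_size)
    favorable = sum(
        comb(annotated, k) * comb(total - annotated, class_size - k)
        for k in range(class_annotated, upper + 1)
    )
    return favorable / comb(total, class_size)


# Invented toy counts: 10 factors in total, 5 carry the property,
# and all 5 members of the class of interest carry it.
p_enriched = hypergeom_enrichment_p(10, 5, 5, 5)
# With a threshold of zero annotated members the tail covers every outcome.
p_trivial = hypergeom_enrichment_p(10, 5, 5, 0)
```

A small value suggests that the property is over-represented in the class. When many properties are tested, a multiple-testing correction would normally be applied.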
Conclusions: The expanded classification is an interesting resource for analyzing data on gene regulation, as illustrated here with transcription factor interaction data and data from a time course experiment. The potential regulatory function of transcription factors seems to be largely determined by how they bind DNA, but is also influenced by how they interact with each other through cooperativity and protein-protein interactions.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5109715 | PMC |
http://dx.doi.org/10.1186/s12859-016-1349-2 | DOI Listing |
Proc Natl Acad Sci U S A
January 2025
State Key Laboratory of Wheat Improvement, College of Life Science, Shandong Agricultural University, Tai'an 271018, China.
In many plants, the asymmetric division of the zygote sets up the apical-basal body axis. In the cress , the zygote coexpresses regulators of the apical and basal embryo lineages, the transcription factors WOX2 and WRKY2/WOX8, respectively. WRKY2/WOX8 activity promotes nuclear migration, cellular polarity, and mitotic asymmetry of the zygote, which are hallmarks of axis formation in many plant species.
Proc Natl Acad Sci U S A
January 2025
Institute of Science and Technology Austria, AT-3400 Klosterneuburg, Austria.
Biophysical constraints limit the specificity with which transcription factors (TFs) can target regulatory DNA. While individual nontarget binding events may be low affinity, the sheer number of such interactions could present a challenge for gene regulation by degrading its precision or possibly leading to an erroneous induction state. Chromatin can prevent nontarget binding by rendering DNA physically inaccessible to TFs, at the cost of energy-consuming remodeling orchestrated by pioneer factors (PFs).
Proc Natl Acad Sci U S A
January 2025
Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210.
The homo-dodecameric ring-shaped RNA binding attenuation protein (TRAP) from binds up to twelve tryptophan ligands (Trp) and becomes activated to bind a specific sequence in the 5' leader region of the operon mRNA, thereby downregulating biosynthesis of Trp. Thermodynamic measurements of Trp binding have revealed a range of cooperative behavior for different TRAP variants, even if the averaged apparent affinities for Trp have been found to be similar. Proximity between the ligand binding sites, and the ligand-coupled disorder-to-order transition has implicated nearest-neighbor interactions in cooperativity.
PLoS Comput Biol
January 2025
Department of Computer Science, Colorado State University, Fort Collins, Colorado, United States of America.
Complex deep learning models trained on very large datasets have become key enabling tools for current research in natural language processing and computer vision. By providing pre-trained models that can be fine-tuned for specific applications, they enable researchers to create accurate models with minimal effort and computational resources. Large scale genomics deep learning models come in two flavors: the first are large language models of DNA sequences trained in a self-supervised fashion, similar to the corresponding natural language models; the second are supervised learning models that leverage large scale genomics datasets from ENCODE and other sources.
Differentiation of antigen-activated B cells into pro-proliferative germinal center (GC) B cells depends on the activity of the transcription factors MYC and BCL6, and the epigenetic writers DOT1L and EZH2. GCB-like Diffuse Large B Cell Lymphomas (GCB-DLBCLs) arise from GCB cells and closely resemble their cell of origin. Given the dependency of GCB cells on DOT1L and EZH2, we investigated the role of these epigenetic regulators in GCB-DLBCLs and observed that GCB-DLBCLs synergistically depend on the combined activity of DOT1L and EZH2.