DNAproDB (https://dnaprodb.usc.edu/) is a database, visualization tool, and processing pipeline for analyzing structural features of protein-DNA interactions.
Predicting protein-DNA binding specificity is a challenging yet essential task for understanding gene regulation. Protein-DNA complexes usually exhibit binding to a selected DNA target site, whereas a protein binds, with varying degrees of binding specificity, to a wide range of DNA sequences. This information is not directly accessible in a single structure.
Recently, the remarkable growth of available crystal structure data and libraries of commercially available or readily synthesizable molecules has unlocked previously inaccessible regions of chemical space for drug development. Paired with improvements in virtual ligand screening methods, these expanded libraries are having a notable impact on early drug design efforts. Yet screening-based methods still face scalability limits, due to computational constraints and the sheer scale of drug-like space.
Development of the malaria parasite, Plasmodium falciparum, is regulated by a limited number of sequence-specific transcription factors (TFs). However, the mechanisms by which these TFs recognize genome-wide binding sites are largely unknown. To address TF specificity, we investigated the binding of two TF subsets that bind either CACACA or GTGCAC DNA sequence motifs and further characterized two additional ApiAP2 TFs, PfAP2-G and PfAP2-EXP, which bind unique DNA motifs (GTAC and TGCATGCA).
Sequence-dependent DNA shape plays an important role in understanding protein-DNA binding mechanisms. High-throughput prediction of DNA shape features has become a valuable tool in the field of protein-DNA recognition, transcription factor-DNA binding specificity, and gene regulation. However, our widely used webserver, DNAshape, relies on statistically summarized pentamer query tables to query DNA shape features.
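A pentamer query table of the kind mentioned above can be illustrated with a minimal sketch. It assumes a per-nucleotide feature assigned to the central position of each pentamer; the function name `lookup_shape` and the table `TOY_TABLE` are hypothetical, and the numbers are placeholders, not actual DNAshape values.

```python
from typing import Dict, List, Optional

# Hypothetical placeholder values for illustration only; NOT real DNAshape data.
TOY_TABLE: Dict[str, float] = {
    "AAAAA": 1.0,
    "AAAAC": 2.0,
    "AAACG": 3.0,
}


def lookup_shape(seq: str, table: Dict[str, float], k: int = 5) -> List[Optional[float]]:
    """Slide a k-mer window over seq and assign each table value to the
    window's central position. Edge positions without a full window, and
    k-mers missing from the table, are reported as None."""
    seq = seq.upper()
    half = k // 2
    values: List[Optional[float]] = [None] * len(seq)
    for i in range(half, len(seq) - half):
        window = seq[i - half:i + half + 1]
        values[i] = table.get(window)
    return values


if __name__ == "__main__":
    print(lookup_shape("AAAAACG", TOY_TABLE))
```

Because each value depends only on a five-base window, a lookup of this kind cannot reflect bases farther out in the flanks.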
Circadian clock genes are emerging targets in many types of cancer, but their mechanistic contributions to tumor progression are still largely unknown. This makes it challenging to stratify patient populations and develop corresponding treatments. In this work, we show that in breast cancer, the disrupted expression of circadian genes has the potential to serve as biomarkers.
Nucleic Acids Res
July 2024
Analyzing and visualizing the tertiary structure and complex interactions of RNA is essential for being able to mechanistically decipher their molecular functions in vivo. Secondary structure visualization software can portray many aspects of RNA; however, these layouts are often unable to preserve topological correspondence since they do not consider tertiary interactions between different regions of an RNA molecule. Likewise, quaternary interactions between two or more interacting RNA molecules are not considered in secondary structure visualization tools.
The recognition and binding of nucleic acids (NAs) by proteins depends upon complementary chemical, electrostatic and geometric properties of the protein-NA binding interface. Structural models of protein-NA complexes provide insights into these properties but are scarce relative to models of unbound proteins. We present a deep learning approach for predicting protein-NA binding given the apo structure of a protein (PNAbind).
Understanding the mechanisms of protein-DNA binding is critical in comprehending gene regulation. Three-dimensional DNA structure, also described as DNA shape, plays a key role in these mechanisms. In this study, we present a deep learning-based method, Deep DNAshape, that fundamentally changes the current k-mer based high-throughput prediction of DNA shape features by accurately accounting for the influence of extended flanking regions, without the need for extensive molecular simulations or structural biology experiments.
Predicting specificity in protein-DNA interactions is a challenging yet essential task for understanding gene regulation. Here, we present Deep Predictor of Binding Specificity (DeepPBS), a geometric deep-learning model designed to predict binding specificity across protein families based on protein-DNA structures. The DeepPBS architecture allows investigation of different family-specific recognition patterns.
Rapid advancement in the computational methods of structure-based drug design has led to their widespread adoption as key tools in the early drug development process. Recently, the remarkable growth of available crystal structure data and libraries of commercially available or readily synthesizable molecules has unlocked previously inaccessible regions of chemical space for drug development. Paired with improvements in virtual ligand screening methods, these expanded libraries are having a significant impact on the success of early drug design efforts.
DNA recognition and targeting by transcription factors (TFs) through specific binding are fundamental in biological processes. Furthermore, the histidine protonation state at the TF-DNA binding interface can significantly influence the binding mechanism of TF-DNA complexes. Nevertheless, the role of histidine in TF-DNA complexes remains underexplored.
Understanding the mechanisms of protein-DNA binding is critical in comprehending gene regulation. Three-dimensional DNA shape plays a key role in these mechanisms. In this study, we present a deep learning-based method, Deep DNAshape, that fundamentally changes the current k-mer based high-throughput prediction of DNA shape features by accurately accounting for the influence of extended flanking regions, without the need for extensive molecular simulations or structural biology experiments.
Developmental studies have revealed the importance of the transcription factor Hand2 in cardiac development. Hand2 promotes cardiac progenitor differentiation and epithelial maturation, while repressing other tissue types. The mechanisms underlying the promotion of cardiac fates are far better understood than those underlying the repression of alternative fates.
CRISPR-Cas9 has been adapted as a readily programmable genome manipulation agent, and continuing technological advances rely on an in-depth mechanistic understanding of Cas9 target discrimination. Cas9 interrogates a target by unwinding the DNA duplex to form an R-loop, where the RNA guide hybridizes with one of the DNA strands. It has been shown that RNA guides shorter than the normal length of 20 nucleotides (nt) support Cas9 cleavage activity by enabling partial unwinding beyond the RNA/DNA hybrid.
Quantifying the nucleotide preferences of DNA binding proteins is essential to understanding how transcription factors (TFs) interact with their targets in the genome. High-throughput in vitro binding assays have been used to identify the inherent DNA binding preferences of TFs in a controlled environment isolated from confounding factors such as genome accessibility, DNA methylation, and TF binding cooperativity. Unfortunately, many of the most common approaches for measuring binding preferences are not sensitive enough for the study of moderate-to-low affinity binding sites, and are unable to detect small-scale differences between closely related homologs.
Proc Natl Acad Sci U S A
January 2023
DNA-binding proteins play important roles in various cellular processes, but the mechanisms by which proteins recognize genomic target sites remain incompletely understood. Functional groups at the edges of the base pairs (bp) exposed in the DNA grooves represent physicochemical signatures. As these signatures enable proteins to form specific contacts between protein residues and bp, their study can provide mechanistic insights into protein-DNA binding.
Summary: Several high-throughput protein-DNA binding methods currently available produce highly reproducible measurements of binding affinity at the level of the k-mer. However, understanding where a k-mer is positioned along a binding site sequence depends on alignment. Here, we present Top-Down Crawl (TDC), an ultra-rapid tool designed for the alignment of k-mer level data in a rank-dependent and position weight matrix (PWM)-independent manner.
The fundamental molecular determinants by which ATP-dependent chromatin remodelers organize nucleosomes across eukaryotic genomes remain largely elusive. Here, chromatin reconstitutions on physiological, whole-genome templates reveal how remodelers read and translate genomic information into nucleosome positions. Using the yeast genome and the multi-subunit INO80 remodeler as a paradigm, we identify DNA shape/mechanics encoded signature motifs as sufficient for nucleosome positioning and distinct from known DNA sequence preferences of histones.
The Origin Recognition Complex (ORC) is an evolutionarily conserved six-subunit protein complex that binds specific sites at many locations to coordinately replicate the entire eukaryote genome. Though highly conserved in structure, ORC's selectivity for replication origins has diverged tremendously between yeasts and humans to adapt to vastly different life cycles. In this work, we demonstrate that the selectivity determinant of ORC for DNA binding lies in a 19-amino acid insertion helix in the Orc4 subunit, which is present in yeast but absent in human.
Developmental enhancers control the expression of genes prefiguring morphological patterns. The activity of an enhancer varies among cells of a tissue, but collectively, expression levels in individual cells constitute a spatial pattern of gene expression. How the spatial and quantitative regulatory information is encoded in an enhancer sequence is elusive.
We demonstrate here that the α subunit C-terminal domain of RNA polymerase (αCTD) recognizes the upstream promoter (UP) DNA element via its characteristic minor groove shape and electrostatic potential. In two compositionally distinct crystallized assemblies, a pair of αCTD subunits bind in tandem to the UP element consensus A-tract that is 6 bp in length (A-tract), each with their arginine 265 guanidinium group inserted into the minor groove. The A-tract minor groove is significantly narrowed in these crystal structures, as well as in computationally predicted structures of free and bound DNA duplexes derived by Monte Carlo and molecular dynamics simulations, respectively.
Myocyte enhancer factor-2B (MEF2B) has the unique capability of binding to its DNA target sites with a degenerate motif, while still functioning as a gene-specific transcriptional regulator. Identifying its DNA targets is crucial given regulatory roles exerted by members of the MEF2 family and MEF2B's involvement in B-cell lymphoma. Analyzing structural data and SELEX-seq experimental results, we deduced the DNA sequence and shape determinants of MEF2B target sites on a high-throughput basis in vitro for wild-type and mutant proteins.