Publications by authors named "Tomas Rube"

Short linear peptide motifs play important roles in cell signaling. They can act as modification sites for enzymes and as recognition sites for peptide binding domains. SH2 domains bind specifically to tyrosine-phosphorylated proteins, with the affinity of the interaction depending strongly on the flanking sequence.

Background: Cas12a (formerly known as Cpf1), the class II type V CRISPR nuclease, has been widely used for genome editing in mammalian cells and plants due to its distinct characteristics from Cas9. Despite being one of the most robust Cas12a nucleases, LbCas12a in general is less efficient than SpCas9 for genome editing in human cells, animals, and plants.

Results: To improve the editing efficiency of LbCas12a, we conduct saturation mutagenesis in E.

In eukaryotes, members of transcription factor families often exhibit similar DNA binding properties in vitro, yet orchestrate paralog-specific gene regulatory networks in vivo. The serially homologous first (T1) and third (T3) thoracic legs of Drosophila, which are specified by the Hox proteins Scr and Ubx, respectively, offer a unique opportunity to address this paradox in vivo. Genome-wide analyses using epitope-tagged alleles of both Hox loci in the T1 and T3 leg imaginal discs, the precursors to the adult legs and ventral body regions, show that ~8% of Hox binding is paralog-specific.

Protein-ligand interactions are increasingly profiled at high throughput using affinity selection and massively parallel sequencing. However, these assays do not provide the biophysical parameters that most rigorously quantify molecular interactions. Here we describe a flexible machine learning method, called ProBound, that accurately defines sequence recognition in terms of equilibrium binding constants or kinetic rates.

Though AsCas12a fills a crucial gap in the current genome editing toolbox, it exhibits relatively poor editing efficiency, restricting its overall utility. Here we isolate an engineered variant, "AsCas12a Ultra", that increased editing efficiency to nearly 100% at all sites examined in HSPCs, iPSCs, T cells, and NK cells. We show that AsCas12a Ultra maintains high on-target specificity, thereby mitigating the risk for off-target editing and making it ideal for complex therapeutic genome editing applications.

Mechanical deformations of DNA such as bending are ubiquitous and have been implicated in diverse cellular functions. However, the lack of high-throughput tools to measure the mechanical properties of DNA has limited our understanding of how DNA mechanics influence chromatin transactions across the genome. Here we develop 'loop-seq', a high-throughput assay to measure the propensity for DNA looping, and determine the intrinsic cyclizabilities of 270,806 50-base-pair DNA fragments that span Saccharomyces cerevisiae chromosome V, other genomic regions, and random sequences.

CRISPR RNA-guided endonucleases (RGEs) cut or direct activities to specific genomic loci, yet each has off-target activities that are often unpredictable. We developed a pair of simple in vitro assays to systematically measure the DNA-binding specificity (Spec-seq), catalytic activity specificity (SEAM-seq) and cleavage efficiency of RGEs. By separately quantifying binding and cleavage specificity, Spec/SEAM-seq provides detailed mechanistic insight into off-target activity.

Eukaryotic transcription factors (TFs) form complexes with various partner proteins to recognize their genomic target sites. Yet, how the DNA sequence determines which TF complex forms at any given site is poorly understood. Here, we demonstrate that high-throughput in vitro DNA binding assays coupled with unbiased computational analysis provide unprecedented insight into how different DNA sequences select distinct compositions and configurations of homeodomain TF complexes.

Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites.

Transcription factors (TFs) interpret DNA sequence by probing the chemical and structural properties of the nucleotide polymer. DNA shape is thought to enable a parsimonious representation of dependencies between nucleotide positions. Here, we propose a unified mathematical representation of the DNA sequence dependence of shape and TF binding, respectively, which simplifies and enhances analysis of shape readout.

The DNA-binding interfaces of the androgen (AR) and glucocorticoid (GR) receptors are virtually identical, yet these transcription factors share only about a third of their genomic binding sites and regulate similarly distinct sets of target genes. To address this paradox, we determined the intrinsic specificities of the AR and GR DNA-binding domains using a refined version of SELEX-seq. We developed an algorithm that quantifies binding specificity over a large (31-bp) binding site by iteratively fitting a feature-based generalized linear model to SELEX probe counts.

G-quadruplex (GQ) is a four-stranded DNA structure that can be formed in guanine-rich sequences. GQ structures have been proposed to regulate diverse biological processes including transcription, replication, translation and telomere maintenance. Recent studies have demonstrated the existence of GQ DNA in live mammalian cells and a significant number of potential GQ forming sequences in the human genome.
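
As an illustration of what a "potential GQ forming sequence" can mean in practice, the sketch below scans a sequence with a widely used heuristic: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. This pattern is an assumption made for illustration, not necessarily the definition used in the article, and the names `PQS_PATTERN` and `find_pqs` are hypothetical.

```python
import re

# Heuristic motif (an assumption, not taken from the article):
# four G-runs of length >= 3, separated by loops of 1-7 nucleotides.
PQS_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_pqs(seq):
    """Return (start, end, matched_text) for non-overlapping hits.

    Only the strand given is scanned; a G-rich motif on the opposite
    strand would require scanning the reverse complement separately.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS_PATTERN.finditer(seq)]


# Four copies of the human telomeric repeat TTAGGG contain one such motif.
telomeric = "TTAGGG" * 4
hits = find_pqs(telomeric)
```

A regular-expression hit only flags a candidate; whether the structure actually forms depends on conditions that a sequence scan cannot capture.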

Methyl-CpG binding protein 2 (MeCP2) is critical for proper brain development and expressed at near-histone levels in neurons, but the mechanism of its genomic localization remains poorly understood. Using high-resolution MeCP2-binding data, we show that DNA sequence features alone can predict binding with 88% accuracy. Integrating MeCP2 binding and DNA methylation in a probabilistic graphical model, we demonstrate that previously reported genome-wide association with methylation is in part due to MeCP2's affinity to GC-rich chromatin, a result replicated using published data.

Telomerase (TERT) activation is a fundamental step in tumorigenesis. By maintaining telomere length, telomerase relieves a main barrier on cellular lifespan, enabling limitless proliferation driven by oncogenes. The recently discovered, highly recurrent mutations in the promoter of TERT are found in over 50 cancer types, and are the most common mutation in many cancers.

DNA helical twist imposes geometric constraints on the location of histone-DNA interaction sites along nucleosomal DNA. Certain 10.5-bp periodic nucleotides in phase with these geometric constraints have been suggested to facilitate nucleosome positioning.
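
One simple way to quantify a 10.5-bp periodicity is the magnitude of the Fourier component of a dinucleotide's positions at that period. The sketch below is a minimal illustration under that assumption, not the analysis used in the article; the function name `periodicity_amplitude`, the choice of "AA", and the synthetic test sequence are all hypothetical.

```python
import cmath


def periodicity_amplitude(seq, dinucleotide="AA", period=10.5):
    """Normalized Fourier amplitude (0 to 1) of dinucleotide positions.

    Values near 1 mean the occurrences are almost perfectly in phase
    with the given period; values near 0 mean no phase preference.
    """
    seq = seq.upper()
    k = len(dinucleotide)
    positions = [i for i in range(len(seq) - k + 1) if seq[i:i + k] == dinucleotide]
    if not positions:
        return 0.0
    total = sum(cmath.exp(2j * cmath.pi * p / period) for p in positions)
    return abs(total) / len(positions)


# Synthetic example: place "AA" roughly every 10.5 bp on a poly-C background
# of nucleosomal length (147 bp).
bases = ["C"] * 147
for n in range(14):
    p = round(10.5 * n)
    bases[p:p + 2] = ["A", "A"]
synthetic = "".join(bases)

amp_in_phase = periodicity_amplitude(synthetic, "AA", 10.5)
amp_off_period = periodicity_amplitude(synthetic, "AA", 7.0)
```

On the synthetic sequence the amplitude at 10.5 bp is close to 1, while at an unrelated period such as 7 bp it is close to 0.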

Reactivation of telomerase reverse transcriptase (TERT) expression enables cells to overcome replicative senescence and escape apoptosis, which are fundamental steps in the initiation of human cancer. Multiple cancer types, including up to 83% of glioblastomas (GBMs), harbor highly recurrent TERT promoter mutations of unknown function but specific to two nucleotide positions. We identified the functional consequence of these mutations in GBMs to be recruitment of the multimeric GA-binding protein (GABP) transcription factor specifically to the mutant promoter.

Microbiota regulate intestinal physiology by modifying host gene expression along the length of the intestine, but the underlying regulatory mechanisms remain unresolved. Transcriptional specificity occurs through interactions between transcription factors (TFs) and cis-regulatory regions (CRRs) characterized by nucleosome-depleted accessible chromatin. We profiled transcriptome and accessible chromatin landscapes in intestinal epithelial cells (IECs) from mice reared in the presence or absence of microbiota.

Statistical positioning, the localization of nucleosomes packed against a fixed barrier, is conjectured to explain the array of well-positioned nucleosomes at the 5' end of genes, but the extent and precise implications of statistical positioning in vivo are unclear. We examine this hypothesis quantitatively and generalize the idea to include moving barriers as well as nucleosomes actively packed against a barrier. Early experiments noted a similarity between the nucleosome profile aligned and averaged across genes and that predicted by statistical positioning; however, we demonstrate that aligning random nucleosomes also generates the same profile, calling the previous interpretation into question.
