Publications by authors named "Georg Seelig"

Article Synopsis
  • A goal of gene therapy is to target specific cell types using endogenous microRNA, but the link between microRNA expression and its effects is not well understood.
  • The researchers conducted an experiment measuring the stability of synthetic microRNA-responsive elements in various cell lines, overcoming biases in existing microRNA data.
  • They developed a predictive model based on this data, allowing them to create gene constructs with new response patterns, beneficial for designing microRNA-responsive devices in different contexts.

Massively parallel reporter assays (MPRAs) are powerful tools for quantifying the impacts of sequence variation on gene expression. Reading out molecular phenotypes with sequencing enables interrogating the impact of sequence variation beyond genome scale. Machine learning models integrate and codify information learned from MPRAs and enable generalization by predicting sequences outside the training data set.

The interplay between transcription factors and chromatin accessibility regulates cell type diversification during vertebrate embryogenesis. To systematically decipher the gene regulatory logic guiding this process, we generated a single-cell multi-omics atlas of RNA expression and chromatin accessibility during early zebrafish embryogenesis. We developed a deep learning model to predict chromatin accessibility based on DNA sequence and found that a small number of transcription factors underlie cell-type-specific chromatin landscapes.

The RNA exosome plays critical roles in eukaryotic RNA degradation, but it remains unclear how the exosome specifically recognizes its targets. The PAXT connection is an adaptor that recruits the exosome to polyadenylated RNAs in the nucleus, especially transcripts polyadenylated at intronic poly(A) sites. Here we show that PAXT-mediated RNA degradation is induced by the combination of a 5' splice site and a poly(A) junction, but not by either sequence alone.

Article Synopsis
  • Protein-protein interactions (PPIs) are crucial for various cellular functions and can be engineered for use in cell and gene therapies.
  • The study introduces a method called massively parallel PPI measurement by sequencing (MP3-seq), which allows for easy and scalable measurement of PPIs using DNA barcodes to quantify interaction strength.
  • Findings demonstrate that MP3-seq can analyze over 100,000 interactions effectively, and although AI models (like AlphaFold-Multimer) help predict interaction structures, direct experimental measurements are still needed for precise ranking of interaction strengths.
Article Synopsis
  • Transcriptional heterogeneity in bacteria affects evolution but is hard to detect; this study uses advanced techniques to analyze it at the single-cell level.
  • It finds that growth state and plasmid presence significantly impact how bacterial subpopulations form and behave.
  • The research reveals that some cells don't express genes for transferring plasmids, potentially easing the burden of carrying plasmids on those cells.

An important and largely unsolved problem in synthetic biology is how to target gene expression to specific cell types. Here, we apply iterative deep learning to design synthetic enhancers with strong differential activity between two human cell lines. We initially train models on published datasets of enhancer activity and chromatin accessibility and use them to guide the design of synthetic enhancers that maximize predicted specificity.

Article Synopsis
  • mRNA therapeutics are changing the pharmaceutical industry, but optimizing their sequence for better expression remains a challenge.
  • Researchers developed 5' untranslated regions (UTRs) using deep learning to enhance mRNA translation, finding that UTR performance is consistent across different cell types.
  • They created and tested new 5'UTRs with mRNA for gene editing, showing improved efficiency and highlighting the effectiveness of model-based design in mRNA therapies.

Microbial split-pool ligation transcriptomics (microSPLiT) is a high-throughput single-cell RNA sequencing method for bacteria. With four combinatorial barcoding rounds, microSPLiT can profile transcriptional states in hundreds of thousands of Gram-negative and Gram-positive bacteria in a single experiment without specialized equipment. As bacterial samples are fixed and permeabilized before barcoding, they can be collected and stored ahead of time.
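
The scale that combinatorial barcoding can reach follows from simple counting, sketched below. The well counts per round (96) and the cell number are illustrative assumptions, not values from the abstract, and the collision estimate assumes each cell receives a uniformly random barcode combination.

```python
def barcode_space(wells_per_round):
    """Number of distinct barcode combinations across all barcoding rounds."""
    total = 1
    for wells in wells_per_round:
        total *= wells
    return total


def expected_collision_fraction(n_cells, n_barcodes):
    """Probability that a given cell shares its combination with at least
    one other cell, assuming uniform random barcode assignment."""
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


# Hypothetical layout: four rounds with 96 barcodes each (assumed values).
rounds = [96, 96, 96, 96]
space = barcode_space(rounds)
fraction = expected_collision_fraction(100_000, space)
print(f"{space} combinations, collision fraction {fraction:.4%}")
```

Because the barcode space grows multiplicatively with each round, a few rounds of modest plate sizes are enough to keep the chance that two cells share a combination low.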

JTE-607 is an anticancer and anti-inflammatory compound and its active form, compound 2, directly binds to and inhibits CPSF73, the endonuclease for the cleavage step in pre-messenger RNA (pre-mRNA) 3' processing. Surprisingly, compound 2-mediated inhibition of pre-mRNA cleavage is sequence specific and the drug sensitivity is predominantly determined by sequences flanking the cleavage site (CS). Using massively parallel in vitro assays, we identified key sequence features that determine drug sensitivity.

The 5' UTRs of mRNAs are critical for translation regulation, but their regulatory features are poorly characterized. Here, we report the regulatory landscape of 5' UTRs during early zebrafish embryogenesis using a massively parallel reporter assay of 18,154 sequences coupled to polysome profiling. We found that the 5' UTR is sufficient to confer temporal dynamics to translation initiation, and identified 86 motifs enriched in 5' UTRs with distinct ribosome recruitment capabilities.

DNA has emerged as an attractive medium for archival data storage due to its durability and high information density. Scalable parallel random access to information is a desirable property of any storage system. For DNA-based storage systems, however, this still needs to be robustly established.

JTE-607 is a small molecule compound with anti-inflammation and anti-cancer activities. Upon entering the cell, it is hydrolyzed to Compound 2, which directly binds to and inhibits CPSF73, the endonuclease for the cleavage step in pre-mRNA 3' processing. Although CPSF73 is universally required for mRNA 3' end formation, we have unexpectedly found that Compound 2-mediated inhibition of pre-mRNA 3' processing is sequence-specific and that the sequences flanking the cleavage site (CS) are a major determinant for drug sensitivity.

Protein-protein interactions (PPIs) regulate many cellular processes, and engineered PPIs have cell and gene therapy applications. Here we introduce massively parallel protein-protein interaction measurement by sequencing (MP3-seq), an easy-to-use and highly scalable yeast-two-hybrid approach for measuring PPIs. In MP3-seq, DNA barcodes are associated with specific protein pairs, and barcode enrichment can be read by sequencing to provide a direct measure of interaction strength.
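
As a rough illustration of how barcode enrichment can serve as an interaction score, the sketch below computes a log2 enrichment of barcode frequencies after selection relative to before. This is a generic enrichment calculation, not the MP3-seq analysis pipeline; the function name, the pseudocount, and the example counts are all assumptions made for illustration.

```python
import math


def log2_enrichment(pre_counts, post_counts, pseudocount=1.0):
    """Log2 ratio of post-selection to pre-selection barcode frequency
    for each protein pair. A pseudocount avoids division by zero."""
    pairs = set(pre_counts) | set(post_counts)
    pre_total = sum(pre_counts.get(p, 0) + pseudocount for p in pairs)
    post_total = sum(post_counts.get(p, 0) + pseudocount for p in pairs)
    scores = {}
    for pair in pairs:
        pre_freq = (pre_counts.get(pair, 0) + pseudocount) / pre_total
        post_freq = (post_counts.get(pair, 0) + pseudocount) / post_total
        scores[pair] = math.log2(post_freq / pre_freq)
    return scores


# Hypothetical read counts for two protein pairs (made-up numbers).
pre = {("A", "B"): 100, ("A", "C"): 100}
post = {("A", "B"): 300, ("A", "C"): 100}
for pair, score in sorted(log2_enrichment(pre, post).items()):
    print(pair, round(score, 2))
```

A positive score marks a pair whose barcode became more abundant under selection, which under these assumptions would indicate a stronger interaction than a pair with a negative score.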

Background: 3'-end processing by cleavage and polyadenylation is an important and finely tuned regulatory process during mRNA maturation. Numerous genetic variants are known to cause or contribute to human disorders by disrupting the cis-regulatory code of polyadenylation signals. Yet, due to the complexity of this code, variant interpretation remains challenging.

DNA has emerged as a powerful substrate for programming information processing machines at the nanoscale. Among the DNA computing primitives used today, DNA strand displacement (DSD) is arguably the most popular, with DSD-based circuit applications ranging from disease diagnostics to molecular artificial neural networks. The outputs of DSD circuits are generally read using fluorescence spectroscopy.

Sequence-based neural networks can learn to make accurate predictions from large biological datasets, but model interpretation remains challenging. Many existing feature attribution methods are optimized for continuous rather than discrete input patterns and assess individual feature importance in isolation, making them ill-suited for interpreting non-linear interactions in molecular sequences. Building on work in computer vision and natural language processing, we developed an approach based on deep learning, Scrambler networks, wherein the most salient sequence positions are identified with learned input masks.

Division of labor between cells is ubiquitous in biology but the use of multicellular consortia for engineering applications is only beginning to be explored. A significant advantage of multicellular circuits is their potential to be modular with respect to composition but this claim has not yet been extensively tested using experiments and quantitative modeling. Here, we construct a library of 24 yeast strains capable of sending, receiving or responding to three molecular signals, characterize them experimentally and build quantitative models of their input-output relationships.

Over just the last 2 years, mRNA therapeutics and vaccines have undergone a rapid transition from an intriguing concept to real-world impact. However, whereas some aspects of mRNA therapeutics, such as the use of chemical modifications to increase stability and reduce immunogenicity, have been extensively optimized for over two decades, other aspects, particularly the selection and design of the noncoding leader and trailer sequences which control translation efficiency and stability, have received comparatively less attention. In practice, such 5' and 3' untranslated regions (UTRs) are often borrowed from highly expressed human genes with few or no modifications, as is the case for the Pfizer/BioNTech COVID-19 vaccine.

Motivation: Single-cell RNA sequencing (scRNA-seq) is widely used for analyzing gene expression in multi-cellular systems and provides unprecedented access to cellular heterogeneity. scRNA-seq experiments aim to identify and quantify all cell types present in a sample. Measured single-cell transcriptomes are grouped by similarity and the resulting clusters are mapped to cell types based on cluster-specific gene expression patterns.

Background: Optimization of DNA and protein sequences based on Machine Learning models is becoming a powerful tool for molecular design. Activation maximization offers a simple design strategy for differentiable models: one-hot coded sequences are first approximated by a continuous representation, which is then iteratively optimized with respect to the predictor oracle by gradient ascent. While elegant, the current version of the method suffers from vanishing gradients and may cause predictor pathologies leading to poor convergence.
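
The basic activation maximization loop described above can be sketched in a few lines. This is a minimal toy version, not the method from the paper: the predictor is a hypothetical position weight matrix (`TOY_WEIGHTS`, made-up values), the continuous representation is a per-position softmax over logits, and the learning rate and step count are arbitrary assumptions.

```python
import math

BASES = "ACGT"

# Hypothetical toy "oracle": one weight per position and base (A, C, G, T).
TOY_WEIGHTS = [
    [0.2, 1.0, -0.5, 0.1],
    [1.5, 0.0, 0.3, -0.2],
    [-0.1, 0.4, 0.2, 2.0],
    [0.0, -1.0, 0.9, 0.3],
]


def softmax(logits):
    """Numerically stable softmax over one position's logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(probs, weights):
    """Predictor score of a relaxed (probabilistic) sequence."""
    return sum(w * p for w_row, p_row in zip(weights, probs)
               for w, p in zip(w_row, p_row))


def design(weights, steps=200, lr=1.0):
    """Gradient ascent on per-position logits of the relaxed sequence."""
    logits = [[0.0] * len(BASES) for _ in weights]
    for _ in range(steps):
        for i, w_row in enumerate(weights):
            p = softmax(logits[i])
            expected = sum(w * q for w, q in zip(w_row, p))
            for b in range(len(BASES)):
                # d(score)/d(logit) through the softmax relaxation.
                logits[i][b] += lr * p[b] * (w_row[b] - expected)
    probs = [softmax(row) for row in logits]
    # Discretize by taking the highest-logit base at each position.
    seq = "".join(BASES[max(range(len(BASES)), key=row.__getitem__)]
                  for row in logits)
    return seq, probs


sequence, relaxed = design(TOY_WEIGHTS)
print(sequence)  # CATG
```

In this toy setting the gradient term `p[b] * (w_row[b] - expected)` shrinks as the softmax saturates, which gives a small-scale view of why gradients can vanish when optimizing through a continuous relaxation.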

As global demand for digital storage capacity grows, storage technologies based on synthetic DNA have emerged as a dense and durable alternative to traditional media. Existing approaches leverage robust error correcting codes and precise molecular mechanisms to reliably retrieve specific files from large databases. Typically, files are retrieved using a pre-specified key, analogous to a filename.

We performed a comprehensive analysis of the transcriptional changes occurring during human induced pluripotent stem cell (hiPSC) differentiation to cardiomyocytes. Using single cell RNA-seq, we sequenced >20,000 single cells from 55 independent samples representing two differentiation protocols and multiple hiPSC lines. Samples included experimental replicates ranging from undifferentiated hiPSCs to mixed populations of cells at D90 post-differentiation.

The human neonatal cerebellum is one-fourth of its adult size yet contains the blueprint required to integrate environmental cues with developing motor, cognitive and emotional skills into adulthood. Although mature cerebellar neuroanatomy is well studied, understanding of its developmental origins is limited. In this study, we systematically mapped the molecular, cellular and spatial composition of human fetal cerebellum by combining laser capture microscopy and SPLiT-seq single-nucleus transcriptomics.

Single-cell RNA sequencing (scRNA-seq) has become an essential tool for characterizing gene expression in eukaryotes, but current methods are incompatible with bacteria. Here, we introduce microSPLiT (microbial split-pool ligation transcriptomics), a high-throughput scRNA-seq method for Gram-negative and Gram-positive bacteria that can resolve heterogeneous transcriptional states. We applied microSPLiT to >25,000 cells sampled at different growth stages, creating an atlas of changes in metabolism and lifestyle.
