Publications by authors named "Zachary Z Sun"

Deep-learning language models have shown promise in various biotechnological applications, including protein design and engineering. Here we describe ProGen, a language model that can generate protein sequences with a predictable function across large protein families, akin to generating grammatically and semantically correct natural language sentences on diverse topics. The model was trained on 280 million protein sequences from >19,000 families and is augmented with control tags specifying protein properties.


We introduce a MATLAB-based simulation toolbox, called txtlsim, for an Escherichia coli-based Transcription-Translation (TX-TL) system. This toolbox accounts for several cell-free-related phenomena, such as resource loading, consumption and degradation, and in doing so, models the dynamics of TX-TL reactions for the entire duration of solution phase batch-mode experiments. We use a Bayesian parameter inference approach to characterize the reaction rate parameters associated with the core transcription, translation and mRNA degradation mechanics of the toolbox, allowing it to reproduce constitutive mRNA and protein-expression trajectories.


Cell-free systems that mimic essential cell functions, such as gene expression, have dramatically expanded in recent years, both in terms of applications and widespread adoption. Here we provide a review of cell-extract methods, with a specific focus on prokaryotic systems. Firstly, we describe the diversity of genetic strains available and their corresponding utility.


While complex dynamic biological networks control gene expression in all living organisms, the forward engineering of comparable synthetic networks remains challenging. The current paradigm of characterizing synthetic networks in cells results in lengthy design-build-test cycles, minimal data collection, and poor quantitative characterization. Cell-free systems are appealing alternative environments, but it remains questionable whether biological networks behave similarly in cell-free systems and in cells.


A central goal of synthetic biology is to engineer cellular behavior by engineering synthetic gene networks for a variety of biotechnology and medical applications. The process of engineering gene networks often involves an iterative 'design-build-test' cycle, whereby the parts and connections that make up the network are built, characterized and varied until the desired network function is reached. Many advances have been made in the design and build portions of this cycle.


RNA regulators are emerging as powerful tools to engineer synthetic genetic networks or rewire existing ones. A potential strength of RNA networks is that they may be able to propagate signals on time scales that are set by the fast degradation rates of RNAs. However, a current bottleneck to verifying this potential is the slow design-build-test cycle of evaluating these networks in vivo.


Accelerating the pace of synthetic biology experiments requires new approaches for rapid prototyping of circuits from individual DNA regulatory elements. However, current testing standards require days to weeks due to cloning and in vivo transformation. In this work, we first characterized methods to protect linear DNA strands from exonuclease degradation in an Escherichia coli-based transcription-translation (TX-TL) cell-free system, as well as mechanisms of degradation.


Ideal cell-free expression systems can theoretically emulate an in vivo cellular environment in a controlled in vitro platform. This is useful for expressing proteins and genetic circuits in a controlled manner as well as for providing a prototyping environment for synthetic biology. To achieve the latter goal, cell-free expression systems that preserve endogenous Escherichia coli transcription-translation mechanisms are able to more accurately reflect in vivo cellular dynamics than those based on T7 RNA polymerase transcription.


The breadth of genomic diversity found among organisms in nature allows populations to adapt to diverse environments. However, genomic diversity is difficult to generate in the laboratory and new phenotypes do not easily arise on practical timescales. Although in vitro and directed evolution methods have created genetic variants with usefully altered phenotypes, these methods are limited to laborious and serial manipulation of single genes and are not used for parallel and continuous directed evolution of gene networks or genomes.
