Spinal muscular atrophy (SMA), the leading genetic cause of infant mortality, arises from survival motor neuron (SMN) protein insufficiency resulting from SMN1 loss. Approved therapies circumvent endogenous SMN regulation and require repeated dosing or may wane. We describe genome editing of SMN2, an insufficient copy of SMN1 harboring a C6>T mutation, to permanently restore SMN protein levels and rescue SMA phenotypes.
In vitro selection queries large combinatorial libraries for sequence-defined polymers with target binding and reaction catalysis activity. While the total sequence space of these libraries can extend beyond 10 sequences, practical considerations limit starting sequences to ≤~10 distinct molecules. Selection-induced sequence convergence and limited sequencing depth further constrain experimentally observable sequence space.
Prime editing enables search-and-replace genome editing but is limited by low editing efficiency. We present a high-throughput approach, the Peptide Self-Editing sequencing assay (PepSEq), to measure how fusion of 12,000 85-amino acid peptides influences prime editing efficiency. We show that peptide fusion can enhance prime editing, prime-enhancing peptides combine productively, and a top dual peptide-prime editor increases prime editing significantly in multiple cell lines across dozens of target sites.
Nat Chem Biol
November 2021
Directed evolution can generate proteins with tailor-made activities. However, full-length genotypes, their frequencies and fitnesses are difficult to measure for evolving gene-length biomolecules using most high-throughput DNA sequencing methods, as short read lengths can lose mutation linkages in haplotypes. Here we present Evoracle, a machine learning method that accurately reconstructs full-length genotypes (R = 0.
Mutational outcomes following CRISPR-Cas9-nuclease cutting in mammalian cells have recently been shown to be predictable and, in certain cases, skewed toward single genotypes. However, the ability to control these outcomes remains limited, especially for 1-bp insertions, a common and therapeutically relevant class of repair outcomes. Here, through a small molecule screen, we identify the ATM kinase inhibitor KU-60019 as a compound capable of reproducibly increasing the fraction of 1-bp insertions relative to other Cas9 repair outcomes.
Programmable C•G-to-G•C base editors (CGBEs) have broad scientific and therapeutic potential, but their editing outcomes have proved difficult to predict and their editing efficiency and product purity are often low. We describe a suite of engineered CGBEs paired with machine learning models to enable efficient, high-purity C•G-to-G•C base editing. We performed a CRISPR interference (CRISPRi) screen targeting DNA repair genes to identify factors that affect C•G-to-G•C editing outcomes and used these insights to develop CGBEs with diverse editing profiles.
Prime editing (PE) is a versatile genome editing technology, but design of the required guide RNAs is more complex than for standard CRISPR-based nucleases or base editors. Here we describe PrimeDesign, a user-friendly, end-to-end web application and command-line tool for the design of PE experiments. PrimeDesign can be used for single and combination editing applications, as well as genome-wide and saturation mutagenesis screens.
PLoS Comput Biol
January 2021
Restoring gene function by the induced skipping of deleterious exons has been shown to be effective for treating genetic disorders. However, many of the clinically successful therapies for exon skipping are transient oligonucleotide-based treatments that require frequent dosing. CRISPR-Cas9 based genome editing that causes exon skipping is a promising therapeutic modality that may offer permanent alleviation of genetic disease.
Gene expression is controlled by the collective binding of transcription factors to cis-regulatory regions. Deciphering gene-centered regulatory networks is vital to understanding and controlling gene misexpression in human disease; however, systematic approaches to uncovering regulatory networks have been lacking. Here we present high-throughput interrogation of gene-centered activation networks (HIGAN), a pipeline that employs a suite of multifaceted genomic approaches to connect upstream signaling inputs, trans-acting TFs, and cis-regulatory elements.
Although base editors are widely used to install targeted point mutations, the factors that determine base editing outcomes are not well understood. We characterized sequence-activity relationships of 11 cytosine and adenine base editors (CBEs and ABEs) on 38,538 genomically integrated targets in mammalian cells and used the resulting outcomes to train BE-Hive, a machine learning model that accurately predicts base editing genotypic outcomes (R ≈ 0.9) and efficiency (R ≈ 0.
The targeting scope of Streptococcus pyogenes Cas9 (SpCas9) and its engineered variants is largely restricted to protospacer-adjacent motif (PAM) sequences containing G bases. Here we report the evolution of three new SpCas9 variants that collectively recognize NRNH PAMs (where R is A or G and H is A, C or T) using phage-assisted non-continuous evolution, three new phage-assisted continuous evolution strategies for DNA binding and a secondary selection for DNA cleavage. The targeting capabilities of these evolved variants and SpCas9-NG were characterized in HEK293T cells using a library of 11,776 genomically integrated protospacer-sgRNA pairs containing all possible NNNN PAMs.
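To make the NRNH notation concrete, the following is a minimal sketch that expands an IUPAC-coded PAM pattern into the set of 4-nucleotide sequences it matches. The function names `pam_regex` and `matching_pams` are hypothetical and chosen for illustration; nothing here reproduces the analysis code used in the study.

```python
import re
from itertools import product

# Standard IUPAC nucleotide codes; only the ones needed for NRNH and NNNN.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine: A or G
    "H": "ACT",   # not G: A, C or T
    "N": "ACGT",  # any base
}


def pam_regex(pattern):
    """Compile an IUPAC PAM pattern (e.g. 'NRNH') into a regular expression."""
    return re.compile("".join("[" + IUPAC[code] + "]" for code in pattern))


def matching_pams(pattern):
    """List every concrete DNA sequence of len(pattern) that fits the pattern."""
    rx = pam_regex(pattern)
    candidates = ("".join(p) for p in product("ACGT", repeat=len(pattern)))
    return [seq for seq in candidates if rx.fullmatch(seq)]


nrnh = matching_pams("NRNH")
all_pams = matching_pams("NNNN")
print(len(nrnh), "of", len(all_pams), "four-base PAMs fit NRNH")
```

Under these definitions NRNH covers 4 × 2 × 4 × 3 = 96 of the 256 possible NNNN PAMs, which is why a library spanning all NNNN PAMs can characterize the pattern exhaustively.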
In this Article, a data processing error affected Fig. 3e and Extended Data Table 2; these errors have been corrected online.
Following Cas9 cleavage, DNA repair without a donor template is generally considered stochastic, heterogeneous and impractical beyond gene disruption. Here, we show that template-free Cas9 editing is predictable and capable of precise repair to a predicted genotype, enabling correction of disease-associated mutations in humans. We constructed a library of 2,000 Cas9 guide RNAs paired with DNA target sites and trained inDelphi, a machine learning model that predicts genotypes and frequencies of 1- to 60-base-pair deletions and 1-base-pair insertions with high accuracy (r = 0.
The recent breakthroughs in assembling long error-prone reads were based on the overlap-layout-consensus (OLC) approach and did not utilize the strengths of the alternative de Bruijn graph approach to genome assembly. Moreover, these studies often assume that applications of the de Bruijn graph approach are limited to short and accurate reads and that the OLC approach is the only practical paradigm for assembling long error-prone reads. We show how to generalize de Bruijn graphs for assembling long error-prone reads and describe the ABruijn assembler, which combines the de Bruijn graph and the OLC approaches and results in accurate genome reconstructions.
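For readers unfamiliar with the data structure, here is a minimal sketch of the textbook de Bruijn graph built from exact k-mers of error-free reads. It is not the ABruijn generalization described in the abstract, and the function name `de_bruijn` is a hypothetical choice for illustration.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a classical de Bruijn graph from exact k-mers.

    Nodes are (k-1)-mers; each distinct k-mer contributes one edge from its
    prefix to its suffix. Assumes error-free reads, unlike long-read data.
    """
    # Collect distinct k-mers so overlapping reads do not duplicate edges.
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])

    graph = defaultdict(set)
    for kmer in kmers:
        graph[kmer[:-1]].add(kmer[1:])

    # Return plain dict with sorted successor lists for stable output.
    return {node: sorted(successors) for node, successors in graph.items()}


graph = de_bruijn(["ACGTA", "CGTAC"], k=3)
for node in sorted(graph):
    print(node, "->", graph[node])
```

Even a single sequencing error introduces up to k spurious k-mers, which is the usual reason this exact-match construction is considered poorly suited to long error-prone reads.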