Molecular structure prediction and homology detection offer promising paths to discovering protein function and evolutionary relationships. However, current approaches lack statistical reliability assurances, limiting their practical utility for selecting proteins for further experimental and in-silico characterization. To address this challenge, we introduce a statistically principled approach to protein search leveraging principles from conformal prediction, offering a framework that ensures statistical guarantees with user-specified risk and provides calibrated probabilities (rather than raw ML scores) for any protein search model.
RNA-guided endonucleases are involved in processes ranging from adaptive immunity to site-specific transposition and have revolutionized genome editing. CRISPR-Cas9, -Cas12 and related proteins use guide RNAs to recognize ∼20-nucleotide target sites within genomic DNA by mechanisms that are not yet fully understood. We used structural and biochemical methods to assess early steps in DNA recognition by Cas12a protein-guide RNA complexes.
Structured RNA lies at the heart of many central biological processes, from gene expression to catalysis. RNA structure prediction is not yet possible due to a lack of high-quality reference data associated with organismal phenotypes that could inform RNA function. We present GARNET (Gtdb Acquired RNa with Environmental Temperatures), a new database for RNA structural and functional analysis anchored to the Genome Taxonomy Database (GTDB).
All lineages of SARS-CoV-2, the coronavirus responsible for the COVID-19 pandemic, contain mutations between amino acids 199 and 205 in the nucleocapsid (N) protein that are associated with increased infectivity. The effects of these mutations have been difficult to determine because N protein contributes to both viral replication and viral particle assembly during infection. Here, we used single-cycle infection and virus-like particle assays to show that N protein phosphorylation has opposing effects on viral assembly and genome replication.
Non-coding mutations in the TERT promoter (TERTp), typically at one of two bases -124 and -146 bp upstream of the start codon, are among the most prevalent driver mutations in human cancer. Several additional recurrent TERTp mutations have been reported but their functions and origins remain largely unexplained. Here, we show that atypical TERTp mutations arise secondary to canonical TERTp mutations in a two-step process.
Effective genome editing requires a sufficient dose of CRISPR-Cas9 ribonucleoproteins (RNPs) to enter the target cell while minimizing immune responses, off-target editing and cytotoxicity. Clinical use of Cas9 RNPs currently entails electroporation into cells, but no systematic comparison of this method to packaged RNP delivery has been made. Here we compared two delivery strategies, electroporation and enveloped delivery vehicles (EDVs), to investigate the Cas9 dosage requirements for genome editing.
Hachiman is a broad-spectrum antiphage defense system of unknown function. We show here that Hachiman is a heterodimeric nuclease-helicase complex, HamAB. HamA, previously a protein of unknown function, is the effector nuclease.
Animal and bacterial cells sense and defend against viral infections using evolutionarily conserved antiviral signaling pathways. Here, we show that viruses overcome host signaling using mechanisms of immune evasion that are directly shared across the eukaryotic and prokaryotic kingdoms of life. Structures of animal poxvirus proteins that inhibit host cGAS-STING signaling demonstrate architectural and catalytic active-site homology shared with bacteriophage Acb1 proteins, which inactivate CBASS anti-phage defense.
The rapid evolution of viruses generates proteins that are essential for infectivity and replication but with unknown functions, due to extreme sequence divergence. Here, using a database of 67,715 newly predicted protein structures from 4,463 eukaryotic viral species, we found that 62% of viral proteins are structurally distinct and lack homologues in the AlphaFold database. Among the remaining 38% of viral proteins, many have non-viral structural analogues that revealed surprising similarities between human pathogens and their eukaryotic hosts.
The envelope (E) protein of SARS-CoV-2 is the smallest of the three structural membrane proteins of the virus. E mediates budding of the progeny virus in the endoplasmic reticulum Golgi intermediate compartment of the cell. It also conducts ions, and this channel activity is associated with the pathogenicity of SARS-CoV-2.
The widespread application of genome editing to treat or even cure disease requires the delivery of genome editors into the nucleus of target cells. Enveloped Delivery Vehicles (EDVs) are engineered virally-derived particles capable of packaging and delivering CRISPR-Cas9 ribonucleoproteins (RNPs). However, the presence of lentiviral genome encapsulation and replication components in EDVs has obscured the underlying delivery mechanism and precluded particle optimization.
High-resolution, real-time imaging of RNA is essential for understanding the diverse, dynamic behaviors of individual RNA molecules in single cells. However, single-molecule live-cell imaging of unmodified endogenous RNA has not yet been achieved. Here, we present single-molecule live-cell fluorescence hybridization (smLiveFISH), a robust approach that combines the programmable RNA-guided, RNA-targeting CRISPR-Csm complex with multiplexed guide RNAs for efficient, direct visualization of single RNA molecules in a range of cell types, including primary cells.
The RNA-guided ribonuclease CRISPR-Cas13 enables adaptive immunity in bacteria and programmable RNA manipulation in heterologous systems. Cas13s share limited sequence similarity, hindering discovery of related or ancestral systems. To address this, we developed an automated structural-search pipeline to identify an ancestral clade of Cas13 (Cas13an) and further trace Cas13 origins to defense-associated ribonucleases.
Twenty genetic therapies have been approved by the US Food and Drug Administration to date, a number that now includes the first CRISPR genome-editing therapy for sickle cell disease, CASGEVY (exagamglogene autotemcel, Vertex Pharmaceuticals). This extraordinary milestone is widely celebrated owing to the promise for future genome-editing treatments of previously intractable genetic disorders and cancers. At the same time, such genetic therapies are the most expensive drugs on the market, with list prices exceeding US$4 million per patient.
Thermostable clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas9) enzymes could improve genome-editing efficiency and delivery due to extended protein lifetimes. However, initial experimentation demonstrated Geobacillus stearothermophilus Cas9 (GeoCas9) to be virtually inactive when used in cultured human cells. Laboratory-evolved variants of GeoCas9 overcome this natural limitation by acquiring mutations in the wedge (WED) domain that produce >100-fold-higher genome-editing levels.
Targeting proteins to specific subcellular destinations is essential in prokaryotes, eukaryotes, and the viruses that infect them. Chimalliviridae phages encapsulate their genomes in a nucleus-like replication compartment composed of the protein chimallin (ChmA) that excludes ribosomes and decouples transcription from translation. These phages selectively partition proteins between the phage nucleus and the bacterial cytoplasm.
The viral genome of SARS-CoV-2 is packaged by the nucleocapsid (N-)protein into ribonucleoprotein particles (RNPs), 38 ± 10 of which are contained in each virion. Their architecture has remained unclear due to the pleomorphism of RNPs, the high flexibility of N-protein intrinsically disordered regions, and highly multivalent interactions between viral RNA and N-protein binding sites in both N-terminal (NTD) and C-terminal domain (CTD). Here we explore critical interaction motifs of RNPs by applying a combination of biophysical techniques to ancestral and mutant proteins binding different nucleic acids in an in vitro assay for RNP formation, and by examining nucleocapsid protein variants in a viral assembly assay.
Large-genome bacteriophages (jumbo phages) of the proposed family Chimalliviridae assemble a nucleus-like compartment bounded by a protein shell that protects the replicating phage genome from host-encoded restriction enzymes and DNA-targeting CRISPR-Cas nucleases. While the nuclear shell provides broad protection against host nucleases, it necessitates transport of mRNA out of the nucleus-like compartment for translation by host ribosomes, and transport of specific proteins into the nucleus-like compartment to support DNA replication and mRNA transcription. Here, we identify a conserved phage nuclear shell-associated protein that we term Chimallin C (ChmC), which adopts a nucleic acid-binding fold, binds RNA with high affinity in vitro, and binds phage mRNAs in infected cells.