Machine learning (ML) has transformed protein engineering by constructing models of the underlying sequence-function landscape to accelerate the discovery of new biomolecules. ML-guided protein design requires models, trained on local sequence-function information, to accurately predict distant fitness peaks. In this work, we evaluate neural networks' capacity to extrapolate beyond their training data.
Angiotensin-converting enzyme 2 (ACE2) has been investigated for its ability to beneficially modulate the angiotensin receptor (ATR) therapeutic axis to treat multiple human diseases. Its broad substrate scope and diverse physiological roles, however, limit its potential as a therapeutic agent. In this work, we address this limitation by establishing a yeast display-based liquid chromatography screen that enabled use of directed evolution to discover ACE2 variants that possess both wild-type or greater Ang-II hydrolytic activity and improved specificity toward Ang-II relative to the off-target peptide substrate Apelin-13.
Understanding how severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) interacts with different mammalian angiotensin-converting enzyme II (ACE2) cell entry receptors elucidates determinants of virus transmission and facilitates development of vaccines for humans and animals. Yeast display-based directed evolution identified conserved ACE2 mutations that increase spike binding across multiple species. Gln42Leu increased ACE2-spike binding for human and four of four other mammalian ACE2s; Leu79Ile had an effect for human and three of three mammalian ACE2s.
Proc Natl Acad Sci U S A
November 2021
The mapping from protein sequence to function is highly complex, making it challenging to predict how sequence changes will affect a protein's behavior and properties. We present a supervised deep learning framework to learn the sequence-function mapping from deep mutational scanning data and make predictions for new, uncharacterized sequence variants. We test multiple neural network architectures, including a graph convolutional network that incorporates protein structure, to explore how a network's internal representation affects its ability to learn the sequence-function mapping.
Understanding how human ACE2 genetic variants differ in their recognition by SARS-CoV-2 can facilitate the leveraging of ACE2 as an axis for treating and preventing COVID-19. In this work, we experimentally interrogate thousands of ACE2 mutants to identify over one hundred human single-nucleotide variants (SNVs) that are likely to have altered recognition by the virus, and make the complementary discovery that ACE2 residues distant from the spike interface influence the ACE2-spike interaction. These findings illuminate new links between ACE2 sequence and spike recognition, and could find substantial utility in further fundamental research that augments epidemiological analyses and clinical trial design in the contexts of both existing strains of SARS-CoV-2 and novel variants that may arise in the future.
Understanding how SARS-CoV-2 interacts with different mammalian angiotensin-converting enzyme II (ACE2) cell entry receptors elucidates determinants of virus transmission and facilitates development of vaccines for humans and animals. Yeast display-based directed evolution identified conserved ACE2 mutations that increase spike binding across multiple species. Gln42Leu increased ACE2-spike binding for human and four of four other mammalian ACE2s; Leu79Ile had an effect for human and three of three mammalian ACE2s.
Understanding how human ACE2 genetic variants differ in their recognition by SARS-CoV-2 can have a major impact in leveraging ACE2 as an axis for treating and preventing COVID-19. In this work, we experimentally interrogate thousands of ACE2 mutants to identify over one hundred human single-nucleotide variants (SNVs) that are likely to have altered recognition by the virus, and make the complementary discovery that ACE2 residues distant from the spike interface can have a strong influence upon the ACE2-spike interaction. These findings illuminate new links between ACE2 sequence and spike recognition, and will find wide-ranging utility in SARS-CoV-2 fundamental research, epidemiological analyses, and clinical trial design.
Background: The promise of biopharmaceuticals comprising one or more binding domains motivates the development of novel methods for de novo isolation and affinity maturation of virion-binding domains. Identifying avenues for overcoming the challenges associated with using virions as screening reagents is paramount given the difficulties associated with obtaining high-purity virus-associated proteins that retain the conformation exhibited on the virion surface.
Results: Fluorescence activated cell sorting (FACS) of 1.
Analyses of bloodborne nanoscale extracellular vesicles (nsEVs) have shown tremendous promise in enabling the development of noninvasive blood-based clinical diagnostic tests, predicting and monitoring the efficacy of treatment programs, and identifying new drug targets in the context of health conditions such as cancer and Alzheimer's disease. In this chapter we present a protocol for generating global nsEV proteomic profiles that can further the utility of nsEV analysis for the above biomedical applications by revealing differences in protein abundance between normal and disease-state nsEVs. This protocol features the use of magnetic particle-based immunoprecipitation to enrich highly purified populations of nsEVs directly from plasma or serum samples.
Methods Mol Biol
January 2019
Analysis of nanoscale extracellular vesicles (nsEVs) present in blood, cell culture media, and other biofluids has shown tremendous promise in enabling the development of noninvasive blood-based clinical diagnostic tests, predicting and monitoring the efficacy of treatment programs, and providing molecular-level insights into pathology that can reveal new drug targets in the contexts of health conditions such as cancer and Alzheimer's Disease (AD). In this chapter, we present methods for using magnetic particle-based immunoprecipitation to enrich highly purified populations of nsEVs directly from plasma, serum, and other biofluids. These methods enable downstream analysis of nsEV protein and nucleic acid constituents in the contexts of both global omics profiling and quantification of individual protein or nucleic acid species of interest.
Diagnostic assays that leverage bloodborne neuron-derived (neuronal) nanoscale extracellular vesicles (nsEVs) as "windows into the brain" can predict incidence of Alzheimer's Disease (AD) many years prior to onset. Beyond diagnostics, analysis of bloodborne neuronal nsEVs may have substantial translational impact by revealing mechanisms of AD pathology; such knowledge could reveal new drug targets and lead to new therapeutic approaches. Establishing three-dimensional nsEV analysis methods that characterize highly purified bloodborne nsEV populations along the dimensions of enrichment method, cell type of origin, and protein or RNA abundance could realize this promise by yielding nsEV "omics" datasets that uncover new AD biomarkers and enable AD therapeutic development.
Background: Engineered antibodies with pH-responsive cell surface target antigen-binding affinities that decrease at the acidic pH (5.5-5.8) within the endosomes have been found to have reduced susceptibility to degradation within the lysosomes and increased serum half-life.
Systemic injection of granulocyte colony-stimulating factor (G-CSF) has yielded encouraging results in treating Alzheimer's Disease (AD) and other central nervous system (CNS) disorders. Making G-CSF a viable AD therapeutic will, however, require increasing G-CSF's ability to stimulate neurons within the brain. This objective could be realized by increasing transcytosis of G-CSF across the blood brain barrier (BBB).
Crohn's Disease (CD) afflicts over half a million Americans with an annual economic impact exceeding $10 billion. Granulocyte macrophage colony-stimulating factor (GM-CSF) can increase patient immune responses against intestinal microbes that promote CD and has been effective for some patients in clinical trials. We have made important progress toward developing GM-CSF variants that could be more effective CD therapeutics by virtue of being less prone to neutralization by the endogenous GM-CSF autoantibodies that are highly expressed in CD patients.
Intravenously injected granulocyte macrophage colony-stimulating factor (GM-CSF) has shown efficacy in Alzheimer's Disease (AD) and Parkinson's Disease (PD) animal studies and is undergoing clinical evaluation. The likely need for dosing of GM-CSF to patients over months or years motivates pursuit of avenues for delivering GM-CSF to circulation via oral administration. Flow cytometric screening of 37 yeast-displayed GM-CSF saturation mutant libraries revealed residues P12, H15, R23, R24, and K72 as key determinants of GM-CSF's CD116 and CD131 GM-CSF receptor (GM-CSFR) subunit binding affinity.
SCHEMA structure-guided recombination is an effective method for producing families of protein chimeras having high sequence diversity, functional diversity, and thermostabilities greater than any of the parent proteins from which the chimeras are made. A key feature of SCHEMA chimera families is their amenability to a "sample, model, and predict" operation that allows one to characterize members of a small chimera sample set and use those data to construct models that accurately predict the properties of every member of the family. In this chapter, we describe applications of this "sample, model, and predict" approach and outline methods for designing chimera sample sets that enable efficient construction of models to identify useful sequence elements.
We describe an efficient SCHEMA recombination-based approach for screening homologous enzymes to identify stabilizing amino acid sequence blocks. This approach has been used to generate active, thermostable cellobiohydrolase class I (CBH I) enzymes from the 390,625 possible chimeras that can be made by swapping eight blocks from five fungal homologs. Constructing and characterizing the parent enzymes and just 32 'monomeras' containing a single block from a homologous enzyme allowed stability contributions to be assigned to 36 of the 40 blocks from which the CBH I chimeras can be assembled.
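The counts quoted in this abstract follow from simple combinatorics, which the short Python sketch below reproduces. It is an illustration only, not the authors' code. The 390,625 chimeras and 40 blocks come directly from five parents and eight block positions; the enumeration of 32 monomeras assumes they are all built in a single parent background (hypothetically labeled parent 0 here), with one block at a time replaced by the corresponding block from each of the other four parents.

```python
N_PARENTS = 5  # fungal CBH I homologs (from the abstract)
N_BLOCKS = 8   # sequence blocks per chimera (from the abstract)

# Every block position can come from any parent.
total_chimeras = N_PARENTS ** N_BLOCKS

# Distinct (position, parent) blocks available for assembly.
total_blocks = N_PARENTS * N_BLOCKS


def enumerate_monomeras(background=0):
    """List chimeras that differ from one background parent by a single block.

    Assumption: a single background parent is used; the index 0 is a
    hypothetical label, not a statement about which enzyme was chosen.
    """
    monomeras = []
    for position in range(N_BLOCKS):
        for donor in range(N_PARENTS):
            if donor == background:
                continue
            blocks = [background] * N_BLOCKS
            blocks[position] = donor  # swap in exactly one foreign block
            monomeras.append(tuple(blocks))
    return monomeras


monomeras = enumerate_monomeras()
print(total_chimeras, total_blocks, len(monomeras))
```

Under that assumption, the 32 monomeras probe each of the 32 non-background blocks once, which is why a sample this small can inform most of the 40 blocks.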
A quantitative linear model accurately (R² = 0.88) describes the thermostabilities of 54 characterized members of a family of fungal cellobiohydrolase class II (CBH II) cellulase chimeras made by SCHEMA recombination of three fungal enzymes, demonstrating that the contributions of SCHEMA sequence blocks to stability are predominantly additive. Thirty-one of 31 predicted thermostable CBH II chimeras have thermal inactivation temperatures higher than the most thermostable parent CBH II, from Humicola insolens, and the model predicts that hundreds more CBH II chimeras share this superior thermostability.
Proc Natl Acad Sci U S A
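To make the idea of additive block contributions concrete, here is a minimal Python sketch of how such a linear model scores a chimera. It is not the published model. It assumes eight blocks per chimera (consistent with the 6,561 = 3^8 possible sequences reported for this three-parent family), treats parent 0 as an arbitrary reference, and uses hypothetical contribution values chosen purely for illustration.

```python
N_PARENTS = 3  # parent enzymes (from the abstract)
N_BLOCKS = 8   # assumed block count, consistent with 3**8 = 6,561 chimeras

# An additive model needs one reference value plus one term for each
# non-reference block, far fewer parameters than there are sequences.
n_parameters = 1 + N_BLOCKS * (N_PARENTS - 1)
n_sequences = N_PARENTS ** N_BLOCKS


def predict_stability(chimera, reference, contributions):
    """Additive prediction: reference value plus per-block offsets.

    chimera: tuple of parent indices, one per block position.
    contributions: maps (position, parent) to an offset relative to the
    reference parent's block; missing entries count as zero.
    """
    if len(chimera) != N_BLOCKS:
        raise ValueError("chimera must specify one parent per block")
    return reference + sum(
        contributions.get((position, parent), 0.0)
        for position, parent in enumerate(chimera)
    )


# Hypothetical numbers for illustration only; these are NOT measured values.
reference_stability = 60.0
block_offsets = {(0, 1): 2.0, (3, 2): -1.0, (6, 1): 0.5}

reference_parent = (0,) * N_BLOCKS
example_chimera = (1, 0, 0, 2, 0, 0, 1, 0)
print(predict_stability(reference_parent, reference_stability, block_offsets))
print(predict_stability(example_chimera, reference_stability, block_offsets))
```

Because the model has only 17 free parameters under these assumptions, a few dozen characterized chimeras can in principle constrain predictions for the whole family.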
April 2009
SCHEMA structure-guided recombination of 3 fungal class II cellobiohydrolases (CBH II cellulases) has yielded a collection of highly thermostable CBH II chimeras. Twenty-three of 48 genes sampled from the 6,561 possible chimeric sequences were secreted by the Saccharomyces cerevisiae heterologous host in catalytically active form. Five of these chimeras have half-lives of thermal inactivation at 63 °C that are greater than that of the most stable parent, the CBH II enzyme from the thermophilic fungus Humicola insolens, which suggests that this chimera collection contains hundreds of highly stable cellulases.