AlphaFold2 and related computational systems predict protein structure using deep learning and co-evolutionary relationships encoded in multiple sequence alignments (MSAs). Despite high prediction accuracy achieved by these systems, challenges remain in (1) prediction of orphan and rapidly evolving proteins for which an MSA cannot be generated; (2) rapid exploration of designed structures; and (3) understanding the rules governing spontaneous polypeptide folding in solution. Here we report development of an end-to-end differentiable recurrent geometric network (RGN) that uses a protein language model (AminoBERT) to learn latent structural information from unaligned proteins.
Protein engineering has enormous academic and industrial potential. However, it is limited by the lack of experimental assays that are consistent with the design goal and sufficiently high throughput to find rare, enhanced variants. Here we introduce a machine learning-guided paradigm that can use as few as 24 functionally assayed mutant sequences to build an accurate virtual fitness landscape and screen ten million sequences via in silico directed evolution.
Rational protein engineering requires a holistic understanding of protein function. Here, we apply deep learning to unlabeled amino-acid sequences to distill the fundamental features of a protein into a statistical representation that is semantically rich and structurally, evolutionarily and biophysically grounded. We show that the simplest models built on top of this unified representation (UniRep) are broadly applicable and generalize to unseen regions of sequence space.
Plants have significantly more transcription factor (TF) families than animals and fungi, and plant TF families tend to contain more genes; these expansions are linked to adaptation to environmental stressors. Many TF family members bind to similar or identical sequence motifs, such as G-boxes (CACGTG), so it is difficult to predict regulatory relationships. We determined that the flanking sequences near G-boxes help determine in vitro specificity but that this is insufficient to predict the transcription pattern of genes near G-boxes.
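As an illustration of what "flanking sequences near G-boxes" means in practice, the following minimal sketch locates each CACGTG occurrence in a DNA string and returns the bases on either side. It is not the authors' analysis pipeline: the function name `gbox_flanks` and the `flank` width parameter are hypothetical and chosen only for this example.

```python
import re

G_BOX = "CACGTG"  # core motif named in the abstract


def gbox_flanks(seq, flank=2):
    """Return (position, left_flank, right_flank) for each G-box in seq.

    `flank` is a hypothetical parameter: how many bases to report on
    each side of the core motif. Flanks are truncated at sequence ends.
    """
    seq = seq.upper()
    hits = []
    for match in re.finditer(G_BOX, seq):
        start, end = match.start(), match.end()
        left = seq[max(0, start - flank):start]
        right = seq[end:end + flank]
        hits.append((start, left, right))
    return hits


if __name__ == "__main__":
    for hit in gbox_flanks("ttCACGTGaaGGCACGTGcc"):
        print(hit)
```

Because CACGTG is its own reverse complement, scanning one strand is enough to find every occurrence of the core motif; only the flanks differ between strands.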
Plants maximize their fitness by adjusting their growth and development in response to signals such as light and temperature. The circadian clock provides a mechanism for plants to anticipate events such as sunrise and adjust their transcriptional programmes. However, the underlying mechanisms by which plants coordinate environmental signals with endogenous pathways are not fully understood.
Transcript levels are a critical determinant of the proteome and hence cellular function. Because the transcriptome is an outcome of the interactions between genes and their products, it may be accurately represented by a subset of transcript abundances. We develop a method, Tradict (transcriptome predict), capable of learning and using the expression measurements of a small subset of 100 marker genes to predict transcriptome-wide gene abundances and the expression of a comprehensive, but interpretable list of transcriptional programs that represent the major biological processes and pathways of the cell.
Independently evolved pathogen effectors from three branches of life (ascomycete, eubacteria, and oomycete) converge onto the Arabidopsis TCP14 transcription factor to manipulate host defense. However, the mechanistic basis for defense control via TCP14 regulation is unknown. We demonstrate that TCP14 regulates the plant immune system by transcriptionally repressing a subset of the jasmonic acid (JA) hormone signaling outputs.
Plants are responsive to temperature, and some species can distinguish differences of 1°C. In Arabidopsis, warmer temperature accelerates flowering and increases elongation growth (thermomorphogenesis). However, the mechanisms of temperature perception are largely unknown.
Many microbes associate with higher eukaryotes and impact their vitality. To engineer microbiomes for host benefit, we must understand the rules of community assembly and maintenance that, in large part, demand an understanding of the direct interactions among community members. Toward this end, we have developed a Poisson-multivariate normal hierarchical model to learn direct interactions from the count-based output of standard metagenomics sequencing experiments.
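To make the phrase "Poisson-multivariate normal hierarchical model" concrete, the sketch below simulates counts from a generic model of that family: each sample draws a latent vector from a multivariate normal, and each taxon's count is Poisson with rate equal to the exponential of its latent value. This is an assumed textbook form, not the authors' exact model or inference procedure (which may include covariates or other terms), and all names (`sample_counts`, `cholesky`, `sample_poisson`) are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def sample_counts(mean, cov, n_samples, seed=0):
    """Simulate count vectors: latent w ~ N(mean, cov), count_j ~ Poisson(exp(w_j))."""
    rng = random.Random(seed)
    low = cholesky(cov)
    dim = len(mean)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # w = mean + L z has covariance L L^T = cov
        w = [mean[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(dim)]
        samples.append([sample_poisson(math.exp(w_i), rng) for w_i in w])
    return samples


if __name__ == "__main__":
    # Two hypothetical taxa with positively correlated latent abundances.
    for row in sample_counts([1.0, 1.5], [[0.5, 0.3], [0.3, 0.5]], n_samples=5):
        print(row)
```

In Gaussian models of this kind, direct (conditional) dependencies between members are usually read from the inverse of the latent covariance matrix rather than from raw correlations between counts; the simulation above only shows the generative side.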
Pseudomonas syringae is a phylogenetically diverse species of Gram-negative bacterial plant pathogens responsible for crop diseases around the world. The HrpL sigma factor drives expression of the major P. syringae virulence regulon.
Mol Plant Microbe Interact
July 2012
Biotrophic phytopathogens are typically limited to their adapted host range. In recent decades, investigations have teased apart the general molecular basis of intraspecific variation for innate immunity of plants, typically involving receptor proteins that enable perception of pathogen-associated molecular patterns or avirulence elicitors from the pathogen as triggers for defense induction. However, general consensus concerning evolutionary and molecular factors that alter host range across closely related phytopathogen isolates has been more elusive.
The bacterial superantigen Staphylococcal enterotoxin-A (SEA), produced by some strains of Staphylococcus aureus, causes proliferation of cytotoxic T-lymphocytes and cytokine production in vivo. SEA has been shown to be highly efficient for antibody-targeted superantigen immunotherapy in different tumor models. A candidate B-cell superantigen that has recently received considerable attention is staphylococcal protein-A (PA).
Biochem Biophys Res Commun
February 2002
The bacterial superantigen staphylococcal enterotoxin A (SEA) is a potent inducer of CTL activity and cytokine production in vivo. Protein A (PA) of Staphylococcus aureus has been found to have diverse biological response modifying properties and to possess antitumor, antitoxic and antiparasitic effects. In this study we examined the anti-tumor effect of these two superantigens used separately as well as in combination in mice carrying the Ehrlich ascites tumor.