The genome is a sequence that encodes the DNA, RNA, and proteins that orchestrate an organism's function. We present Evo, a long-context genomic foundation model with a frontier architecture trained on millions of prokaryotic and phage genomes, and report scaling laws on DNA to complement observations in language and vision. Evo generalizes across DNA, RNA, and proteins, enabling zero-shot function prediction competitive with domain-specific language models and the generation of functional CRISPR-Cas and transposon systems, representing the first examples of protein-RNA and protein-DNA codesign with a language model.
Proc Natl Acad Sci U S A
November 2024
Protein language models (pLMs) have emerged as potent tools for predicting and designing protein structure and function, and the degree to which these models fundamentally understand the inherent biophysics of protein structure stands as an open question. Motivated by a finding that pLM-based structure predictors erroneously predict nonphysical structures for protein isoforms, we investigated the nature of sequence context needed for contact predictions in the pLM Evolutionary Scale Modeling (ESM-2). We demonstrate by use of a "categorical Jacobian" calculation that ESM-2 stores statistics of coevolving residues, analogously to simpler modeling approaches like Markov Random Fields and Multivariate Gaussian models.
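The abstract names the "categorical Jacobian" without defining it. The sketch below assumes one plausible reading: substitute every letter at every position, record how the model's output logits change at all positions, and summarize each position pair by the norm of its block of changes. The toy alphabet, the `toy_logits` stand-in model, and the built-in coupling between positions 0 and 3 are all hypothetical; nothing here calls ESM-2 or reproduces the authors' pipeline.

```python
import math

ALPHABET = "ACDE"        # toy alphabet (assumption); real proteins use 20 amino acids
COUPLED = {0: 3, 3: 0}   # hypothetical coupled position pair built into the toy model


def toy_logits(seq):
    """Hypothetical stand-in for a language model: one logit per position and letter."""
    out = []
    for j, res in enumerate(seq):
        # Each position favours its own residue.
        row = [1.0 if a == res else 0.0 for a in ALPHABET]
        if j in COUPLED:
            # Coupled positions also respond to the residue at their partner.
            partner = seq[COUPLED[j]]
            row = [v + (0.5 if a == partner else 0.0) for v, a in zip(row, ALPHABET)]
        out.append(row)
    return out


def categorical_jacobian(seq, model):
    """J[i][a][j][b]: change in logit b at position j when position i is set to letter a."""
    base = model(seq)
    jac = []
    for i in range(len(seq)):
        per_letter = []
        for a in ALPHABET:
            mutated = seq[:i] + a + seq[i + 1:]
            new = model(mutated)
            per_letter.append(
                [[n - o for n, o in zip(new_row, base_row)]
                 for new_row, base_row in zip(new, base)]
            )
        jac.append(per_letter)
    return jac


def coupling_scores(jac):
    """Frobenius norm of each (i, j) block, used here as a coupling strength."""
    n = len(jac)
    return [
        [math.sqrt(sum(v * v for per_a in jac[i] for v in per_a[j])) for j in range(n)]
        for i in range(n)
    ]


scores = coupling_scores(categorical_jacobian("ACDE", toy_logits))
```

In this toy setup only the position pair (0, 3) receives a nonzero off-diagonal score, because that is the only dependency written into `toy_logits`. With a real model the same loop would cost one forward pass per position and letter.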
Cancer cells require high levels of iron for rapid proliferation, leading to significant upregulation of cell-surface transferrin receptor 1 (TfR1), which mediates iron uptake by binding to the iron-carrying protein transferrin. Leveraging this phenomenon and the fast endocytosis rate of TfR1 (refs. ), we developed transferrin receptor targeting chimeras (TransTACs), a heterobispecific antibody modality for membrane protein degradation.
An in vitro model of human ovarian follicles would greatly benefit the study of female reproduction. Ovarian development requires the combination of germ cells and several types of somatic cells. Among these, granulosa cells play a key role in follicle formation and support for oogenesis.
Ready-to-use therapeutic food (RUTF) containing less dairy may be a lower-cost treatment option for severe acute malnutrition (SAM). The objective was to understand the effectiveness of RUTF containing alternative sources of protein (nondairy), or <50% of protein from dairy products, compared with standard RUTF in children with SAM. The Cochrane Library, MEDLINE, Embase, CINAHL, and Web of Science were searched using terms relating to RUTF.
Matern Child Nutr
October 2018
Treatment of acute malnutrition typically requires the provision of ready-to-use food (RUF). Common RUF is effective but expensive, being manufactured from costly ingredients, and shipped worldwide from few global suppliers. I developed a linear programming tool to create RUF optimized for low cost using locally grown crops while maintaining necessary nutritional goals and other constraints.
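The cost-minimization idea can be illustrated with a very small linear program. The sketch below is not the author's tool: the two ingredients, their prices, their nutrient contents, and the energy and protein targets are all hypothetical numbers chosen for illustration. It relies on the fact that a linear program with a bounded feasible region attains its optimum at a vertex, so for two variables it is enough to test every pairwise intersection of constraint boundaries.

```python
from itertools import combinations


def solve_lp_2d(cost, constraints, tol=1e-9):
    """Minimize cost . (x, y) subject to p*x + q*y <= r for each ((p, q), r).

    Two variables only, bounded feasible region assumed. Returns
    (best_cost, (x, y)), or None if no feasible vertex exists.
    """
    best = None
    for ((a1, a2), b1), ((c1, c2), b2) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < tol:
            continue  # parallel boundaries never intersect in a single point
        # Cramer's rule for the intersection of the two boundary lines.
        x = (b1 * c2 - a2 * b2) / det
        y = (a1 * b2 - b1 * c1) / det
        if all(p * x + q * y <= r + tol for (p, q), r in constraints):
            value = cost[0] * x + cost[1] * y
            if best is None or value < best[0]:
                best = (value, (x, y))
    return best


# Hypothetical data: amounts are in units of 100 g of a cereal and a legume paste.
cost = (0.30, 0.80)              # illustrative price per 100 g
constraints = [
    ((-350.0, -600.0), -520.0),  # energy:  350x + 600y >= 520 kcal
    ((-8.0, -25.0), -13.0),      # protein:   8x +  25y >= 13 g
    ((1.0, 1.0), 1.0),           # total mass: x + y <= 100 g
    ((-1.0, 0.0), 0.0),          # x >= 0
    ((0.0, -1.0), 0.0),          # y >= 0
]

best_cost, (cereal, legume) = solve_lp_2d(cost, constraints)
```

With these made-up numbers the cheapest feasible mix sits where the energy and total-mass constraints meet. A realistic formulation would have many ingredients and nutrient constraints and would need a general LP solver rather than vertex enumeration.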