Machine learning (ML) for protein design requires large protein fitness datasets generated by high-throughput experiments for training, fine-tuning, and benchmarking models. However, most models do not account for experimental noise inherent in these datasets, harming model performance and changing model rankings in benchmarking studies. Here we develop FLIGHTED, a Bayesian method of accounting for uncertainty by generating probabilistic fitness landscapes from noisy high-throughput experiments. We demonstrate how FLIGHTED can improve model performance on two categories of experiments: single-step selection assays, such as phage display and SELEX, and a novel high-throughput assay called DHARMA that ties activity to base editing. We then compare the performance of standard machine-learning models on fitness landscapes generated with and without FLIGHTED. Accounting for noise significantly improves model performance, especially of CNN architectures, and changes relative rankings on numerous common benchmarks. Based on our new benchmarking with FLIGHTED, data size, not model scale, currently appears to be limiting the performance of protein fitness models, and the choice of top model architecture matters more than the protein language model embedding. Collectively, our results indicate that FLIGHTED can be applied to any high-throughput assay and any machine learning model, making it straightforward for protein designers to account for experimental noise when modeling protein fitness.
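As a rough illustration of why read depth matters in a single-step selection assay, the sketch below computes a log enrichment score and an approximate variance for one variant. This is not the FLIGHTED model. It is a simple delta-method approximation that assumes Poisson-distributed read counts and ignores normalization by total library size. The function name `log_enrichment` and the `pseudocount` parameter are hypothetical and chosen only for this example.

```python
import math


def log_enrichment(pre_count, post_count, pseudocount=0.5):
    """Return (log enrichment, approximate variance) for one variant.

    Assumption: read counts are Poisson-distributed, so the variance of
    log(count) is roughly 1 / count (delta method). The pseudocount only
    guards against division by zero and log(0).
    """
    pre = pre_count + pseudocount
    post = post_count + pseudocount
    score = math.log(post / pre)
    # Variances of the two independent log-counts add.
    variance = 1.0 / pre + 1.0 / post
    return score, variance


# Two variants with the same underlying two-fold enrichment,
# observed at very different sequencing depths.
low_depth = log_enrichment(5, 10)
high_depth = log_enrichment(500, 1000)

print("low depth :", low_depth)
print("high depth:", high_depth)
```

Both variants have a similar point estimate, but the low-depth variant carries a much larger variance. A model trained only on the point estimates would treat the two measurements as equally reliable.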
Full text | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10996587 | PMC |
http://dx.doi.org/10.1101/2024.03.26.586797 | DOI |
Bull Math Biol
January 2025
Université Côte d'Azur, Inria, INRAE, CNRS, MACBES, Sophia Antipolis, France.
The sterile insect technique (SIT) is a biological control technique based on mass-rearing, radiation-based sterilization that can induce fitness costs, and releases of the pest species targeted for population control. Sterile matings, between females and sterilized males, can reduce the overall population growth rate and cause a fall in population density. However, a proportion of irradiated males may escape sterilization, resulting in what is called residual fertility.
Nat Commun
January 2025
Department of Microbiology, Biocenter, University of Würzburg, Würzburg, D-97074, Germany.
Paramount to human health, symbiotic bacteria in the gastrointestinal tract rely on the breakdown of complex polysaccharides to thrive in this sugar-deprived environment. Gut Bacteroides are metabolic generalists and deploy dozens of polysaccharide utilization loci (PULs) to forage diverse dietary and host-derived glycans. The expression of the multi-protein PUL complexes is tightly regulated at the transcriptional level.
Proc Natl Acad Sci U S A
January 2025
Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109.
Life at all scales is surprisingly effective at exploiting new opportunities, as demonstrated by the rapid emergence of antimicrobial resistance and novel pathogens. How populations acquire this level of evolvability and the various ways it aids survival are major open questions with direct implications for human health. Here, we use digital evolution to show that changing environments facilitate the simultaneous evolution of high mutation rates and a distribution of mutational effects skewed toward beneficial phenotypes.
Ecol Lett
December 2024
Division of Animal Ecology, Department of Ecology and Genetics (IEG), Uppsala University, Uppsala, Sweden.
Evolutionary adaptation occurs when individuals vary in access to fitness-relevant resources and these differences in 'material wealth' are heritable. It is typically assumed that the inheritance of material wealth reflects heritable variation in the phenotypic abilities needed to acquire material wealth. We scrutinise this assumption by investigating additional mechanisms underlying the inheritance of material wealth in collared flycatchers.
Ecol Lett
January 2025
Systematic Botany and Functional Biodiversity, Institute of Biology, Leipzig University, Leipzig, Germany.
Trait-based approaches have been increasingly used to relate plants to soil microbial communities. Using the recently described root economics space as an approach to explain the structure of soil-borne fungal communities, our study in a grassland diversity experiment reveals distinct root trait strategies at the plant community level. In addition to significant effects of plant species richness, we show that the collaboration and conservation gradient are strong drivers of the composition of the different guilds of soil fungi.