Machine learning (ML) for protein design requires large protein fitness datasets generated by high-throughput experiments for training, fine-tuning, and benchmarking models. However, most models do not account for experimental noise inherent in these datasets, harming model performance and changing model rankings in benchmarking studies. Here we develop FLIGHTED, a Bayesian method of accounting for uncertainty by generating probabilistic fitness landscapes from noisy high-throughput experiments. We demonstrate how FLIGHTED can improve model performance on two categories of experiments: single-step selection assays, such as phage display and SELEX, and a novel high-throughput assay called DHARMA that ties activity to base editing. We then compare the performance of standard machine-learning models on fitness landscapes generated with and without FLIGHTED. Accounting for noise significantly improves model performance, especially of CNN architectures, and changes relative rankings on numerous common benchmarks. Based on our new benchmarking with FLIGHTED, data size, not model scale, currently appears to be limiting the performance of protein fitness models, and the choice of top model architecture matters more than the protein language model embedding. Collectively, our results indicate that FLIGHTED can be applied to any high-throughput assay and any machine learning model, making it straightforward for protein designers to account for experimental noise when modeling protein fitness.
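To illustrate why sampling noise matters in a single-step selection assay, the following is a minimal sketch and not the FLIGHTED model itself. It assumes each variant's pre- and post-selection read counts are binomial draws from the sequencing library, places a uniform Beta prior on each frequency, and uses Monte Carlo sampling to obtain a mean and standard deviation for the log enrichment. The function name `log_enrichment_posterior` and all of its parameters are hypothetical and chosen only for this example.

```python
import numpy as np


def log_enrichment_posterior(pre_count, post_count, pre_total, post_total,
                             n_samples=10_000, prior=1.0, seed=0):
    """Monte Carlo estimate of the posterior over log enrichment for one variant.

    Assumption (illustrative only): pre- and post-selection counts are
    independent binomial samples, each with a Beta(prior, prior) prior on
    the underlying variant frequency.
    """
    rng = np.random.default_rng(seed)
    # Posterior of a binomial frequency under a Beta prior is again a Beta.
    pre_freq = rng.beta(pre_count + prior,
                        pre_total - pre_count + prior, size=n_samples)
    post_freq = rng.beta(post_count + prior,
                         post_total - post_count + prior, size=n_samples)
    # Log enrichment: log ratio of post-selection to pre-selection frequency.
    log_enrich = np.log(post_freq) - np.log(pre_freq)
    return log_enrich.mean(), log_enrich.std()


# Two variants with the same 2x enrichment ratio but very different read depth.
low_mean, low_std = log_enrichment_posterior(10, 20, 1_000_000, 1_000_000)
high_mean, high_std = log_enrichment_posterior(1000, 2000, 1_000_000, 1_000_000)
print(f"low coverage:  mean={low_mean:.3f}, std={low_std:.3f}")
print(f"high coverage: mean={high_mean:.3f}, std={high_std:.3f}")
```

Both variants have the same point estimate of enrichment, but the low-coverage variant carries a much wider posterior. A model trained on point estimates alone would treat the two measurements as equally reliable, which is the kind of noise the abstract argues should be accounted for.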


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10996587
DOI: http://dx.doi.org/10.1101/2024.03.26.586797

Publication Analysis

Top Keywords

fitness landscapes: 12
protein fitness: 12
model performance: 12
landscapes noisy: 8
noisy high-throughput: 8
machine learning: 8
high-throughput experiments: 8
account experimental: 8
experimental noise: 8
model: 8

Similar Publications

The sterile insect technique (SIT) is a biological control technique based on mass-rearing, radiation-based sterilization that can induce fitness costs, and releases of the pest species targeted for population control. Sterile matings, between females and sterilized males, can reduce the overall population growth rate and cause a fall in population density. However, a proportion of irradiated males may escape sterilization, resulting in what is called residual fertility.


Paramount to human health, symbiotic bacteria in the gastrointestinal tract rely on the breakdown of complex polysaccharides to thrive in this sugar-deprived environment. Gut Bacteroides are metabolic generalists and deploy dozens of polysaccharide utilization loci (PULs) to forage diverse dietary and host-derived glycans. The expression of the multi-protein PUL complexes is tightly regulated at the transcriptional level.


Evolution takes multiple paths to evolvability when facing environmental change.

Proc Natl Acad Sci U S A

January 2025

Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109.

Life at all scales is surprisingly effective at exploiting new opportunities, as demonstrated by the rapid emergence of antimicrobial resistance and novel pathogens. How populations acquire this level of evolvability and the various ways it aids survival are major open questions with direct implications for human health. Here, we use digital evolution to show that changing environments facilitate the simultaneous evolution of high mutation rates and a distribution of mutational effects skewed toward beneficial phenotypes.


Inheritance of Material Wealth in a Natural Population.

Ecol Lett

December 2024

Division of Animal Ecology, Department of Ecology and Genetics (IEG), Uppsala University, Uppsala, Sweden.

Evolutionary adaptation occurs when individuals vary in access to fitness-relevant resources and these differences in 'material wealth' are heritable. It is typically assumed that the inheritance of material wealth reflects heritable variation in the phenotypic abilities needed to acquire material wealth. We scrutinise this assumption by investigating additional mechanisms underlying the inheritance of material wealth in collared flycatchers.


Plant Species Richness and the Root Economics Space Drive Soil Fungal Communities.

Ecol Lett

January 2025

Systematic Botany and Functional Biodiversity, Institute of Biology, Leipzig University, Leipzig, Germany.

Trait-based approaches have been increasingly used to relate plants to soil microbial communities. Using the recently described root economics space as an approach to explain the structure of soil-borne fungal communities, our study in a grassland diversity experiment reveals distinct root trait strategies at the plant community level. In addition to significant effects of plant species richness, we show that the collaboration and conservation gradient are strong drivers of the composition of the different guilds of soil fungi.

