How Good Are Statistical Models at Approximating Complex Fitness Landscapes?

Mol Biol Evol

Institute for Integrative Biology, ETH Zürich, Zürich, Switzerland.

Published: September 2016

Fitness landscapes determine the course of adaptation by constraining and shaping evolutionary trajectories. Knowledge of the structure of a fitness landscape can thus predict evolutionary outcomes. Empirical fitness landscapes, however, have so far only offered limited insight into real-world questions, as the high dimensionality of sequence spaces makes it impossible to exhaustively measure the fitness of all variants of biologically meaningful sequences. We must therefore resort to statistical descriptions of fitness landscapes that are based on a sparse sample of fitness measurements. It remains unclear, however, how much data are required for such statistical descriptions to be useful. Here, we assess the ability of regression models accounting for single and pairwise mutations to correctly approximate a complex quasi-empirical fitness landscape. We compare approximations based on various sampling regimes of an RNA landscape and find that the sampling regime strongly influences the quality of the regression. On the one hand, it is generally impossible to generate sufficient samples to achieve a good approximation of the complete fitness landscape, and on the other hand, systematic sampling schemes can only provide a good description of the immediate neighborhood of a sequence of interest. Nevertheless, we obtain a remarkably good and unbiased fit to the local landscape when using sequences from a population that has evolved under strong selection. Thus, current statistical methods can provide a good approximation to the landscape of naturally evolving populations.
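
As a rough illustration of what a regression with single and pairwise mutation terms looks like, the sketch below fits such a model by ordinary least squares to a sparse sample of a toy landscape. It is not the authors' code or their RNA landscape: the binary (0/1) sequence encoding, the six-site toy landscape with an added three-way interaction, the sample size of 40, and the function names `design_matrix`, `fit_pairwise_model`, and `predict` are all assumptions made for this example.

```python
import itertools
import numpy as np

def design_matrix(seqs):
    """Columns: intercept, one per site, one per pair of sites (0/1 sequences)."""
    seqs = np.asarray(seqs, dtype=float)
    n_seqs, n_sites = seqs.shape
    pairs = itertools.combinations(range(n_sites), 2)
    pair_cols = [seqs[:, i] * seqs[:, j] for i, j in pairs]
    return np.column_stack([np.ones(n_seqs), seqs] + pair_cols)

def fit_pairwise_model(seqs, fitness):
    """Least-squares estimate of intercept, single-site and pairwise effects."""
    X = design_matrix(seqs)
    coef, *_ = np.linalg.lstsq(X, np.asarray(fitness, dtype=float), rcond=None)
    return coef

def predict(coef, seqs):
    """Predicted fitness for each sequence under the fitted model."""
    return design_matrix(seqs) @ coef

# Hypothetical toy landscape over all 2^6 binary sequences (an assumption,
# standing in for a real landscape that could never be measured exhaustively).
rng = np.random.default_rng(0)
n_sites = 6
all_seqs = np.array(list(itertools.product([0, 1], repeat=n_sites)))
X_full = design_matrix(all_seqs)
true_coef = rng.normal(size=X_full.shape[1])
pairwise_fitness = X_full @ true_coef
# A three-way interaction that a pairwise model cannot represent.
rugged_fitness = pairwise_fitness + 2.0 * all_seqs[:, 0] * all_seqs[:, 1] * all_seqs[:, 2]

# Fit on a sparse random sample, then score on the sequences left out.
sample = rng.choice(len(all_seqs), size=40, replace=False)
held_out = np.setdiff1d(np.arange(len(all_seqs)), sample)
coef = fit_pairwise_model(all_seqs[sample], rugged_fitness[sample])
errors = predict(coef, all_seqs[held_out]) - rugged_fitness[held_out]
print(f"held-out RMSE: {np.sqrt(np.mean(errors ** 2)):.3f}")
```

Replacing the random `sample` with, for example, only the sequences within one or two mutations of a reference sequence gives a simple way to explore how the sampling regime changes the held-out error in this toy setting.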

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4989103
DOI: http://dx.doi.org/10.1093/molbev/msw097

Publication Analysis

Top Keywords

fitness landscapes: 12
fitness landscape: 12
fitness: 9
statistical descriptions: 8
good approximation: 8
provide good: 8
landscape: 6
good: 5
good statistical: 4
statistical models: 4

Similar Publications

In the Mediterranean basin, urban forests are widely recognized as essential landscape components, playing a key role in nature-based solutions by enhancing environmental quality and providing a range of ecosystem services. The selection of woody plant species for afforestation and reforestation should prioritize native species that align with the biogeographical and ecological characteristics of the planting sites. Among these, L.

Enzyme engineering is limited by the challenge of rapidly generating and using large datasets of sequence-function relationships for predictive design. To address this challenge, we develop a machine learning (ML)-guided platform that integrates cell-free DNA assembly, cell-free gene expression, and functional assays to rapidly map fitness landscapes across protein sequence space and optimize enzymes for multiple, distinct chemical reactions. We apply this platform to engineer amide synthetases by evaluating substrate preference for 1217 enzyme variants in 10,953 unique reactions.

Active learning-assisted directed evolution.

Nat Commun

January 2025

Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, USA.

Directed evolution (DE) is a powerful tool to optimize protein fitness for a specific application. However, DE can be inefficient when mutations exhibit non-additive, or epistatic, behavior. Here, we present Active Learning-assisted Directed Evolution (ALDE), an iterative machine learning-assisted DE workflow that leverages uncertainty quantification to explore the search space of proteins more efficiently than current DE methods.

Every protein progresses through a natural lifecycle from birth to maturation to death; this process is coordinated by the protein homeostasis system. Environmental or physiological conditions trigger pathways that maintain the homeostasis of the proteome. An open question is how these pathways are modulated to respond to the many stresses that an organism encounters during its lifetime.

Endurance sports have witnessed an increase in female participation, demanding a constant and evolving reassessment of the specific physiological and health implications of female athletes. In the present review, we analyze cardiovascular, hematological adaptations and anthropometry and hormonal fluctuations highlighting sex-specific differences in response to exercise, with estrogen playing a fundamental role in modulating body composition and metabolic processes. Nutritional aspects, in particular energy availability, macronutrient distribution and hydration, are fundamental in supporting training demands and menstrual function.