Impact of phylogeny on the inference of functional sectors from protein sequence data.

PLoS Comput Biol

Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.

Published: September 2024

Statistical analysis of multiple sequence alignments of homologous proteins has revealed groups of coevolving amino acids called sectors. These groups of amino-acid sites feature collective correlations in their amino-acid usage, and they are associated with functional properties. Modeling showed that nonlinear selection on an additive functional trait of a protein is generically expected to give rise to a functional sector. These modeling results motivated a principled method, called ICOD, which is designed to identify functional sectors, as well as mutational effects, from sequence data. However, a challenge for all methods aiming to identify sectors from multiple sequence alignments is that correlations in amino-acid usage can also arise from the mere fact that homologous sequences share common ancestry, i.e., from phylogeny. Here, we generate controlled synthetic data from a minimal model comprising both phylogeny and functional sectors. We use these data to dissect the impact of phylogeny on sector identification and on mutational effect inference by different methods. We find that ICOD is the most robust to phylogeny, but that conservation is also quite robust. Next, we consider natural multiple sequence alignments of protein families for which deep mutational scan experimental data are available. We show that in these natural data, conservation and ICOD best identify sites with strong functional roles, in agreement with our results on synthetic data. Importantly, these two methods have different premises, since they respectively focus on conservation and on correlations. Thus, their joint use can reveal complementary information.
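To make the distinction between conservation-based and correlation-based signals concrete, here is a minimal sketch on a toy alignment. It is not the paper's pipeline and not an implementation of ICOD: the alignment `TOY_MSA`, the entropy-based conservation score, and the use of mutual information as a pairwise correlation measure are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy alignment (not data from the paper).
# Rows are sequences, columns are amino-acid sites.
TOY_MSA = [
    "ACDAK",
    "ACEGK",
    "ACDAR",
    "ACEGR",
]


def column_entropy(column):
    """Shannon entropy (in bits) of the symbol distribution in one column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conservation_scores(msa, alphabet_size=20):
    """Per-site conservation, defined here as 1 - H / log2(alphabet_size).

    This is one common illustrative definition, assumed for this sketch.
    A score of 1.0 means the site is fully conserved.
    """
    max_entropy = math.log2(alphabet_size)
    return [1.0 - column_entropy(col) / max_entropy for col in zip(*msa)]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between sites i and j.

    Used here as a simple stand-in for a pairwise correlation measure.
    """
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    joint = [a + b for a, b in zip(col_i, col_j)]
    return column_entropy(col_i) + column_entropy(col_j) - column_entropy(joint)


if __name__ == "__main__":
    print("conservation:", conservation_scores(TOY_MSA))
    print("MI(site 2, site 3):", mutual_information(TOY_MSA, 2, 3))
    print("MI(site 2, site 4):", mutual_information(TOY_MSA, 2, 4))
```

In this toy alignment, sites 0 and 1 are fully conserved, sites 2 and 3 vary together, and site 4 varies independently of site 2. A conservation score cannot tell sites 2, 3 and 4 apart, whereas the pairwise measure does. Note that in real alignments shared ancestry can also produce such pairwise signals, which is the confounding effect the abstract describes.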


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11449291
DOI: http://dx.doi.org/10.1371/journal.pcbi.1012091


