Modeling Site Heterogeneity with Posterior Mean Site Frequency Profiles Accelerates Accurate Phylogenomic Estimation.

Syst Biol

Department of Biochemistry and Molecular Biology, 5850 College Street, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada.

Published: March 2018

Proteins have distinct structural and functional constraints at different sites that lead to site-specific preferences for particular amino acid residues as the sequences evolve. Heterogeneity in the amino acid substitution process between sites is not modeled by commonly used empirical amino acid exchange matrices. Such model misspecification can lead to artefacts in phylogenetic estimation such as long-branch attraction. Although sophisticated site-heterogeneous mixture models have been developed to address this problem in both Bayesian and maximum likelihood (ML) frameworks, their formidable computational time and memory usage severely limit their use in large phylogenomic analyses. Here we propose a posterior mean site frequency (PMSF) method as a rapid and efficient approximation to full empirical profile mixture models for ML analysis. The PMSF approach assigns to each site a conditional mean amino acid frequency profile, calculated from a mixture model fitted to the data using a preliminary guide tree. These PMSF profiles can then be used for in-depth tree-searching in place of the full mixture model. Compared with widely used empirical mixture models with $k$ classes, our implementation of PMSF in IQ-TREE (http://www.iqtree.org) speeds up the computation by approximately $k$/1.5-fold and requires a small fraction of the RAM. Furthermore, this speedup allows, for the first time, full nonparametric bootstrap analyses to be conducted under complex site-heterogeneous models on large concatenated data matrices. Our simulations and empirical data analyses demonstrate that PMSF can effectively ameliorate long-branch attraction artefacts. In some empirical and simulation settings, PMSF provided more accurate estimates of phylogenies than the mixture models from which the profiles derive.
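The conditional mean profile described in the abstract can be illustrated with a minimal sketch. It assumes that the per-site likelihood under each mixture class (evaluated on the guide tree), the class weights, and the class frequency profiles are already available. The function names and input layout are hypothetical, chosen for illustration only, and this is not IQ-TREE's implementation.

```python
def posterior_mean_site_frequencies(site_class_lik, class_weights, class_profiles):
    """Return one posterior mean frequency profile per site.

    site_class_lik[i][c]  : likelihood of site i under mixture class c
                            (assumed precomputed on a guide tree)
    class_weights[c]      : prior weight of class c in the fitted mixture
    class_profiles[c][a]  : frequency of state a (e.g. an amino acid) in class c
    """
    n_states = len(class_profiles[0])
    pmsf = []
    for lik_row in site_class_lik:
        # Joint weight of each class for this site: weight * likelihood.
        joint = [w * lik for w, lik in zip(class_weights, lik_row)]
        total = sum(joint)
        # Posterior probability that the site belongs to each class.
        posterior = [j / total for j in joint]
        # Posterior-weighted average of the class frequency profiles.
        profile = [
            sum(p * prof[a] for p, prof in zip(posterior, class_profiles))
            for a in range(n_states)
        ]
        pmsf.append(profile)
    return pmsf


def approximate_speedup(k):
    """Approximate fold speedup over a k-class mixture, per the abstract's k/1.5 figure."""
    return k / 1.5
```

Each returned profile is a weighted average of the class profiles, so it sums to 1 whenever the class profiles do. Tree search can then treat every site as having a single fixed frequency vector rather than evaluating all $k$ classes at every site, which is the intuition behind the reported time and memory savings.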

Source
http://dx.doi.org/10.1093/sysbio/syx068
