Discovering Fragile Clades and Causal Sequences in Phylogenomics by Evolutionary Sparse Learning.

Mol Biol Evol

Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA.

Published: July 2024

Phylogenomic analyses of long sequences, consisting of many genes and genomic segments, reconstruct organismal relationships with high statistical confidence. However, inferred relationships can be sensitive to the exclusion of just a few sequences. Currently, there is no direct way to identify fragile relationships and the individual gene sequences in species that are associated with them. Here, we introduce novel metrics for gene-species sequence concordance and clade probability derived from evolutionary sparse learning models. We validated these metrics using fungal, plant, and animal phylogenomic datasets, highlighting the ability of the new metrics to pinpoint fragile clades and the sequences responsible. The new approach does not require investigating alternative phylogenetic hypotheses, substitution models, or repeated analyses of data subsets. Our methodology offers a streamlined approach to evaluating major inferred clades and identifying sequences that may distort phylogenies reconstructed from large datasets.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11247346
DOI: http://dx.doi.org/10.1093/molbev/msae131

Similar Publications

We introduce the 12th version of the Molecular Evolutionary Genetics Analysis (MEGA) software. This latest version brings many significant improvements by reducing the computational time needed for selecting optimal substitution models and conducting bootstrap tests on phylogenies using maximum likelihood (ML) methods. These improvements are achieved by implementing heuristics that minimize likely unnecessary computations.

Stick and leaf insects (Phasmatodea) are a moderately diverse order comprising almost 3,500 extant species of large to very large, often impressively camouflaged, nocturnal herbivores. The order also stands out as one of the few insect orders that until lately lacked a robust higher-level phylogeny, and the relationships between many taxa, New World taxa in particular, remain unknown. The Diapheromerinae is one of the main lineages of the Occidophasmata and comprises the bulk of the diversity of New World stick insects.

The global spread of the Mpox virus represents a significant public health challenge, particularly for regions with fragile healthcare systems. Afghanistan, already grappling with severe healthcare deficits, faces the added threat of Mpox, a zoonotic disease caused by the monkeypox virus. The recent emergence of Clade Ib, a new strain associated with rising outbreak cases, exacerbates concerns as Afghanistan's healthcare infrastructure continues to deteriorate due to reduced international aid, healthcare worker shortages, and an ongoing socio-political crisis.
