Statistically based postprocessing of phylogenetic analysis by clustering.

Bioinformatics

Texas Institute for Computational and Applied Mathematics, University of Texas, ACES 6.412, Austin TX 78712, USA.

Published: October 2004

Motivation: Phylogenetic analyses often produce thousands of candidate trees. Biologists resolve the conflict by computing the consensus of these trees. Single-tree consensus methods, when used for postprocessing, can be unsatisfactory because of their inherent limitations.

Results: In this paper we present an alternative approach that applies clustering algorithms to the set of candidate trees. We propose bicriterion problems, in particular using the concept of information loss, and new consensus trees, called characteristic trees, that minimize the information loss. Our empirical study using four biological datasets shows that our approach provides a significant improvement in the information content, while adding only a small amount of complexity. Furthermore, the consensus trees we obtain for each of our large clusters are more resolved than the single-tree consensus trees. We also provide some initial progress on theoretical questions that arise in this context.
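The general idea of clustering candidate trees and then taking one consensus per cluster can be sketched as follows. This is only an illustration under stated assumptions, not the method of the paper: the abstract does not define information loss or characteristic trees, so the sketch substitutes a Robinson-Foulds-style distance on rooted clades, average-linkage agglomerative clustering, and the strict consensus. The nested-tuple tree encoding and all function names are hypothetical.

```python
from itertools import combinations

def clades(tree):
    """Non-trivial clades of a rooted tree given as nested tuples of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    root = leaves(tree)
    found.discard(root)  # the full leaf set is shared by every tree
    return frozenset(found)

def rf(a, b):
    """Robinson-Foulds-style distance: clades present in exactly one tree."""
    return len(a ^ b)

def agglomerate(clade_sets, k):
    """Average-linkage agglomerative clustering down to k clusters (assumed stand-in)."""
    clusters = [[i] for i in range(len(clade_sets))]

    def linkage(c1, c2):
        total = sum(rf(clade_sets[i], clade_sets[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters

def strict_consensus(clade_sets, members):
    """Clades shared by every tree in the given cluster."""
    return frozenset.intersection(*(clade_sets[i] for i in members))

# Four toy candidate trees on six taxa (made-up data for illustration only).
trees = [
    ((("A", "B"), "C"), (("D", "E"), "F")),
    (("A", ("B", "C")), (("D", "E"), "F")),
    ((("A", "D"), "B"), (("C", "E"), "F")),
    (("A", ("D", "B")), (("C", "E"), "F")),
]
clade_sets = [clades(t) for t in trees]
clusters = agglomerate(clade_sets, k=2)
overall = strict_consensus(clade_sets, range(len(trees)))
per_cluster = [strict_consensus(clade_sets, c) for c in clusters]

print("clusters:", clusters)
print("clades in single consensus:", len(overall))
print("clades in per-cluster consensus:", [len(c) for c in per_cluster])
```

On this toy input the single strict consensus retains no clades, while each of the two cluster consensus trees retains three, which mirrors the qualitative claim that per-cluster consensus trees are more resolved.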

Source
http://dx.doi.org/10.1093/bioinformatics/18.suppl_1.s285

Publication Analysis

Top Keywords

consensus trees: 16
candidate trees: 8
single-tree consensus: 8
trees: 7
consensus: 5
statistically based: 4
based postprocessing: 4
postprocessing phylogenetic: 4
phylogenetic analysis: 4
analysis clustering: 4

Similar Publications

Clinical Assessment of Drug Transporter Inhibition Using Biomarkers: Review of the Literature (2015-2024).

J Clin Pharmacol

January 2025

Drug Metabolism and Nonclinical Pharmacokinetics, Translational Medicine, Incyte, Wilmington, DE, USA.

As part of a narrative review of various publications describing the clinical use of urine- and plasma-based drug transporter biomarkers, it was determined that the utilization of coproporphyrin I, a hepatic organic anion transporting polypeptide (OATP) 1B1 and OATP1B3 biomarker, has been reported for 28 different drug-drug interaction (DDI) perpetrator drugs. Similarly, biomarkers for liver organic cation transporter 1 (isobutyryl-l-carnitine, N = 7 inhibitors), renal organic cation transporter 2 and multidrug and toxin extrusion proteins (N-methylnicotinamide, N = 13 inhibitors), renal organic anion transporter (OAT) 1 and 3 (pyridoxic acid, N = 7 inhibitors), and breast cancer resistance protein (riboflavin, N = 3 inhibitors) have also been described. Increased use of biomarkers has also been accompanied by modeling efforts to enable DDI predictions and development of multiplexed methods to facilitate their bioanalysis.

Economic evaluation of newborn screening for congenital cytomegalovirus infection: A systematic review.

Eur J Pediatr

January 2025

Division of Policy Evaluation, Department of Health Policy, Research Institute, National Center for Child Health and Development, 2-10-1 Okura, Setagaya-Ku, Tokyo, 157-8535, Japan.

Purpose: This systematic review analyzes economic evaluations of newborn screening for congenital cytomegalovirus (cCMV) infection to identify key factors influencing cost-effectiveness and differences in methodological approaches.

Methods: Following a pre-registered PROSPERO protocol (CRD42023441587), we conducted a comprehensive literature search across multiple databases on July 4, 2024. The review included both full economic evaluations (cost and outcomes) and partial economic evaluations (cost only).

Completing a molecular timetree of primates.

Front Bioinform

December 2024

Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, United States.

Primates, consisting of apes, monkeys, tarsiers, and lemurs, are among the most charismatic and well-studied animals on Earth, yet there is no taxonomically complete molecular timetree for the group. Combining the latest large-scale genomic primate phylogeny of 205 recognized species with the 400-species literature consensus tree available from TimeTree.org yields a phylogeny of just 405 primates, with 50 species still missing despite having molecular sequence data in the NCBI GenBank.

In Bayesian phylogenetic and phylodynamic studies it is common to summarise the posterior distribution of trees with a time-calibrated consensus phylogeny. While the maximum clade credibility (MCC) tree is often used for this purpose, we here show that a novel consensus tree method, the highest independent posterior subtree reconstruction (HIPSTR), yields clades with consistently higher support than the MCC tree. We also provide faster computational routines for estimating both consensus trees in an updated version of TreeAnnotator X, an open-source software program that summarizes the information from a sample of trees and returns many helpful statistics, such as the individual clade credibilities contained in the consensus tree.
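For orientation, the usual definition of an MCC tree can be sketched in a few lines: each clade's credibility is its frequency in the posterior sample, and the MCC tree is the sampled tree whose clade credibilities have the largest product. This is a minimal sketch under assumptions, not TreeAnnotator's code and not the HIPSTR method, which the abstract does not specify. Trees are represented directly as sets of clades, the toy sample is made up, and `mcc_index` is a hypothetical name.

```python
from collections import Counter
from math import log

def mcc_index(sampled):
    """Index of the sampled tree with the highest sum of log clade credibilities.

    Assumes every sampled tree has the same number of clades (for example,
    fully resolved trees on the same taxa), so scores are comparable.
    """
    n = len(sampled)
    counts = Counter(clade for tree in sampled for clade in tree)

    def score(tree):
        # Sum of logs equals the log of the product of clade credibilities.
        return sum(log(counts[clade] / n) for clade in tree)

    return max(range(n), key=lambda i: score(sampled[i]))

# Toy posterior sample on four taxa; each tree is a set of non-trivial clades.
AB, BC, ABC, ABD = (frozenset(s) for s in ("AB", "BC", "ABC", "ABD"))
sample = [
    frozenset({AB, ABC}),
    frozenset({AB, ABC}),
    frozenset({BC, ABC}),
    frozenset({AB, ABD}),
]
best = mcc_index(sample)
print("MCC tree is sample number", best)
```

Here the clades AB and ABC each appear in three of the four sampled trees, so the first tree, which contains both, scores highest.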

Despite the identification of several dozen genetic loci associated with ischemic stroke (IS), the genetic bases of this disease remain largely unexplored. In this research we present the results of genome-wide association studies (GWAS) based on classical statistical testing and machine learning algorithms (logistic regression, gradient boosting on decision trees, and tabular deep learning model TabNet). To build a consensus on the results obtained by different techniques, the Pareto-Optimal solution was proposed and applied.