An empirical test of the relationship between the bootstrap and likelihood ratio support in maximum likelihood phylogenetic analysis.

Cladistics

Departamento de Zoologia, Instituto de Biociências, Universidade de São Paulo, Tv. 14, 101 - Butantã, São Paulo, SP, 05508-090, Brazil.

Published: June 2022

AI Article Synopsis

Article Abstract

In maximum likelihood (ML), the support for a clade can be calculated directly as the likelihood ratio (LR) or log-likelihood difference (S, LLD) of the best trees with and without the clade of interest. However, bootstrap (BS) clade frequencies are more pervasive in ML phylogenetics and are almost universally interpreted as measuring support. In addition to theoretical arguments against that interpretation, BS has several undesirable attributes for a support measure. For example, it does not vary in proportion to optimality or identify clades that are rejected by the evidence and can be overestimated due to missing data. Nevertheless, if BS is a reliable predictor of S, then it might be an efficient indirect method of measuring support-an attractive possibility, given the speed of many BS implementations. To assess the relationship between S and BS, we analyzed 106 empirical datasets retrieved from TreeBASE. Also, to evaluate the degree to which S and BS are affected by the number of replicates during suboptimal tree searches for S and pseudoreplicates during BS estimation, we randomly selected 5 of the 106 datasets and analyzed them using variable numbers of replicates and pseudoreplicates, respectively. The correlation between S and BS was extremely weak in the datasets we analyzed. Increasing the number of replicates during tree search decreased the estimated values of S for most clades, but the magnitude of change was small. In contrast, although increasing pseudoreplicates affected BS values for only approximately 40% of clades, values both increased and decreased, and they did so at much greater magnitudes. Increasing replicates/pseudoreplicates affected the rank order of clades in each tree for both S and BS. Our findings show decisively that BS is not an efficient indirect method of measuring support and suggest that even quite superficial searches to calculate S provide better estimates of support.

Download full-text PDF

Source
http://dx.doi.org/10.1111/cla.12496DOI Listing

Publication Analysis

Top Keywords

likelihood ratio
8
maximum likelihood
8
measuring support
8
efficient indirect
8
indirect method
8
method measuring
8
number replicates
8
datasets analyzed
8
support
6
empirical test
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!