The impact of missing data on real morphological phylogenies: influence of the number and distribution of missing entries.

Cladistics

Instituto de Botánica Darwinion (CONICET, ANCEFN), Labardén 200, Casilla de Correo 22, B1642HYD San Isidro, Buenos Aires, Argentina.

Published: June 2010

Here we explore the effect of missing data in phylogenetic analyses using a large number of real morphological matrices. Different percentages and patterns of missing entries were added to each matrix, and their influence was evaluated by comparing the accuracy and error of the most parsimonious trees. The relationships between accuracy and error and different parameters (e.g. the number of taxa and characters, homoplasy, support) were also evaluated. Our findings, based on real matrices, agree with previous simulation studies: the negative effect increases with the percentage of missing entries and decreases with the addition of more characters. This indicates that the main problem is the lack of information, not the presence of missing data per se. Accuracy varies with the distribution pattern of missing entries; the worst case is when missing data are concentrated in a few taxa, while the best is when the missing entries are restricted to just a few characters. The results expand our knowledge of the missing data problem, corroborate many of the findings previously published using simulations, and could be useful for empirical or theoretical studies. © The Willi Hennig Society 2009.
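The abstract does not give the exact procedure used to add missing entries, so the following Python sketch is only an illustration of the general idea: replacing a chosen fraction of cells in a taxon-by-character matrix with `?`, under three distribution patterns (scattered at random, concentrated in a few taxa, or concentrated in a few characters). The function name `add_missing`, its parameters, and the use of `?` as the missing-data symbol are assumptions made for this example, not details taken from the study.

```python
import random


def add_missing(matrix, fraction, pattern="random", seed=0):
    """Return a copy of `matrix` with about `fraction` of its cells set to '?'.

    Hypothetical helper for illustration only (not the authors' procedure).
    matrix  : list of rows (one per taxon), each a list of character states
    pattern : 'random'     -> cells drawn uniformly from the whole matrix
              'taxa'       -> whole rows are filled first (few taxa affected)
              'characters' -> whole columns are filled first (few characters)
    """
    rng = random.Random(seed)
    n_taxa, n_chars = len(matrix), len(matrix[0])
    n_missing = round(fraction * n_taxa * n_chars)

    if pattern == "random":
        cells = [(t, c) for t in range(n_taxa) for c in range(n_chars)]
        rng.shuffle(cells)
    elif pattern == "taxa":
        taxa = list(range(n_taxa))
        rng.shuffle(taxa)  # pick the affected taxa in random order
        cells = [(t, c) for t in taxa for c in range(n_chars)]
    elif pattern == "characters":
        chars = list(range(n_chars))
        rng.shuffle(chars)  # pick the affected characters in random order
        cells = [(t, c) for c in chars for t in range(n_taxa)]
    else:
        raise ValueError(f"unknown pattern: {pattern}")

    result = [row[:] for row in matrix]  # leave the input matrix untouched
    for t, c in cells[:n_missing]:
        result[t][c] = "?"
    return result


# Toy 4-taxon, 5-character matrix (made-up states, not real data).
toy = [list("01011"), list("00110"), list("11010"), list("10101")]
degraded = add_missing(toy, 0.25, pattern="taxa", seed=1)
```

In a workflow like the one the abstract describes, each degraded matrix would then be analysed under parsimony and the resulting trees compared with those from the complete matrix; that comparison step is outside the scope of this sketch. Note that this simplified version can blank out an entire taxon, which a real experiment would probably want to cap.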


Source
http://dx.doi.org/10.1111/j.1096-0031.2009.00289.x

