Rarity and Incomplete Sampling in DNA-Based Species Delimitation.

Syst Biol

Department of Life Sciences, Natural History Museum, London SW7 5BD, UK; Department of Life Sciences, Silwood Park Campus, Imperial College London, Ascot SL7 5PY, UK.

Published: May 2016

AI Article Synopsis

  • In a study of southern African chafers, nearly half of the identified species were singletons, and conventional DNA analysis methods frequently yielded inaccurate species delimitations unless multiple related groups were combined.
  • The findings suggest that including additional lineages from outside the primary focus can enhance the effectiveness of species delimitation methods, though methodological performance can be affected by variations in effective population sizes across species.

Article Abstract

DNA-based species delimitation may be compromised by limited sampling effort and species rarity, including "singleton" representatives of species, which hampers estimates of intra- versus interspecies evolutionary processes. In a case study of southern African chafers (beetles in the family Scarabaeidae), many species and subclades were poorly represented and 48.5% of species were singletons. Using cox1 sequences from >500 specimens and ∼100 species, the Generalized Mixed Yule Coalescent (GMYC) analysis, as well as various other approaches for DNA-based species delimitation (Automatic Barcode Gap Discovery (ABGD), Poisson tree processes (PTP), Species Identifier, Statistical Parsimony), frequently produced poor results when only a narrow target group was analyzed, but performance improved when several subclades were combined. Hence, low sampling may be compensated for by "clade addition" of lineages outside of the focal group. Similar findings were obtained in a reanalysis of published data sets of taxonomically poorly known species assemblages of insects from Madagascar. The low performance of undersampled trees is not due to high proportions of singletons per se, as shown in simulations (with 13%, 40% and 52% singletons). However, the GMYC method was highly sensitive to variable effective population size (Ne), which was exacerbated by variable species abundances in the simulations. Hence, low sampling success and rarity of species affect the power of the GMYC method only if they reflect great differences in Ne among species. Potential negative effects of skewed species abundances and prevalence of singletons are ultimately an issue about the variation in Ne and the degree to which this is correlated with the census population size and sampling success. Clade addition beyond a limited study group can overcome poor sampling for the GMYC method, in particular under variable Ne. This effect was less pronounced for methods of species delimitation not based on coalescent models.
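
The singleton proportion quoted in the abstract (48.5% of species) is the share of delimited species that are represented by exactly one specimen. As a minimal sketch of that calculation, assuming each specimen already carries a species label, the following Python snippet counts specimens per species and reports the fraction of singletons. The function name `singleton_fraction` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def singleton_fraction(species_labels):
    """Return the fraction of species represented by exactly one specimen."""
    counts = Counter(species_labels)  # specimens per species
    if not counts:
        raise ValueError("no specimens given")
    singletons = sum(1 for n in counts.values() if n == 1)
    return singletons / len(counts)


# Hypothetical toy data: one species label per sequenced specimen.
specimens = ["sp_A", "sp_A", "sp_A", "sp_B", "sp_C", "sp_C", "sp_D"]

# sp_B and sp_D each occur once, so 2 of 4 species are singletons.
print(f"{singleton_fraction(specimens):.1%}")  # prints 50.0%
```

Note that this only summarizes sampling; as the abstract stresses, a high singleton proportion does not by itself predict poor delimitation performance.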

Source
http://dx.doi.org/10.1093/sysbio/syw002

Publication Analysis

Top Keywords

species delimitation: 16
[formula text]: 16
species: 15
dna-based species: 12
gmyc method: 12
low sampling: 8
population size: 8
species abundances: 8
sampling success: 8
sampling: 6
