We estimate that the global BOLD Systems database holds core DNA barcodes (rbcL + matK) for about 15% of land plant species and that comprehensive species coverage is still many decades away. Interim performance of the resource is compromised by variable sequence overlap and modest information content within each barcode. Our model predicts that the proportion of species-unique barcodes declines as the database grows and that 'false' species-unique barcodes remain >5% until the database is almost complete. We conclude that the current rbcL + matK barcode is unfit for purpose. Genome skimming and supplementary barcodes could improve diagnostic power but would slow new barcode acquisition. We therefore present two novel Next Generation Sequencing protocols (with freeware) capable of accurate, massively parallel de novo assembly of high-quality DNA barcodes of >1400 bp. We explore how these capabilities could enhance species diagnosis in the coming decades.
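
The idea of a 'false' species-unique barcode can be illustrated with a toy calculation. This is only a sketch and not the authors' model: it assumes that a barcode counts as 'false' species-unique when it looks unique in the database only because the other species sharing it have not yet been sampled. The function name `unique_barcode_stats`, the species labels and the barcode labels are all hypothetical.

```python
from collections import Counter


def unique_barcode_stats(barcode_of, sampled):
    """Return (apparent_unique, false_unique) as fractions of sampled species.

    barcode_of: dict mapping every species in the flora to its barcode.
    sampled:    list of species currently present in the reference database.
    """
    # How many species carry each barcode in the whole flora (the "truth").
    full_counts = Counter(barcode_of.values())
    # How many sampled species carry each barcode (what the database shows).
    db_counts = Counter(barcode_of[s] for s in sampled)

    # Species whose barcode appears only once in the database.
    apparent = [s for s in sampled if db_counts[barcode_of[s]] == 1]
    # Of those, species whose barcode is in fact shared with an unsampled species.
    false = [s for s in apparent if full_counts[barcode_of[s]] > 1]

    n = len(sampled)
    return len(apparent) / n, len(false) / n


# Hypothetical toy flora: A and B share a barcode, C and D are distinct.
barcode_of = {"A": "bc1", "B": "bc1", "C": "bc2", "D": "bc3"}

partial = unique_barcode_stats(barcode_of, ["A", "C"])             # half the flora sampled
complete = unique_barcode_stats(barcode_of, ["A", "B", "C", "D"])  # full coverage

print(partial)   # (1.0, 0.5)
print(complete)  # (0.5, 0.0)
```

In this toy case every barcode looks species-unique at 50% coverage, yet half of those are false. Once all four species are sampled the apparent proportion falls to the true value and no false cases remain, which matches the qualitative trend described above.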

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5388885
DOI: http://dx.doi.org/10.1038/srep46040