A reference-free approach to analyse RADseq data using standard next generation sequencing toolkits.

Rasmus Heller, Casia Nursyifa, Genís Garcia-Erill, Jordi Salmona, Lounes Chikhi, Jonas Meisner, Thorfinn Sand Korneliussen, Anders Albrechtsen

Mol Ecol Resour

Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen N, Denmark.

Published: May 2021

Genotyping-by-sequencing methods such as RADseq are popular for generating genomic and population-scale data sets from a diverse range of organisms. Many of these organisms lack a usable reference genome, restricting users to RADseq-specific software for processing. However, such software comes with limitations compared to generic next-generation sequencing (NGS) toolkits. Here, we describe and test a simple pipeline for reference-free RADseq data processing that blends de novo elements from STACKS with the full suite of state-of-the-art NGS tools. Specifically, we use the de novo RADseq assembly employed by STACKS to create a catalogue of RAD loci that serves as a reference for read mapping, variant calling and site filters. Using RADseq data from 28 zebra sequenced to ~8x depth-of-coverage, we evaluate our approach by comparing the site frequency spectra (SFS) to those from alternative pipelines. Most pipelines yielded similar SFS at 8x depth, but only a genotype-likelihood-based pipeline performed similarly at low sequencing depth (2-4x). We compared the RADseq SFS with medium-depth (~13x) shotgun sequencing of eight overlapping samples, revealing that the RADseq SFS was persistently slightly skewed towards rare and invariant alleles. Using simulations and human data, we confirm that this is expected when there is allelic dropout (AD) in the RADseq data. AD in the RADseq data caused a heterozygosity deficit of ~16%, which dropped to ~5% after filtering AD. Hence, AD was the most important source of bias in our RADseq data.
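To make the link between allelic dropout, the SFS and heterozygosity concrete, the following is a minimal Python sketch on toy data. It is not the pipeline described in the abstract: the function names, the genotype encoding (0, 1 or 2 copies of the derived allele per diploid sample) and the dropout model (an affected heterozygote is observed as homozygous for the reference allele) are all assumptions made for illustration.

```python
def site_frequency_spectrum(genotypes):
    """Unfolded SFS: entry k counts sites with k derived alleles in the sample.

    `genotypes` is a list of sites; each site is a list of per-sample
    derived-allele counts (0, 1 or 2), an assumed toy encoding.
    """
    n_chromosomes = 2 * len(genotypes[0])
    sfs = [0] * (n_chromosomes + 1)
    for site in genotypes:
        sfs[sum(site)] += 1
    return sfs


def heterozygosity(genotypes):
    """Fraction of all genotype calls that are heterozygous (coded 1)."""
    n_calls = sum(len(site) for site in genotypes)
    n_het = sum(1 for site in genotypes for g in site if g == 1)
    return n_het / n_calls


def apply_dropout(genotypes, dropped):
    """Return a copy in which heterozygotes at the given (site, sample)
    positions are observed as reference homozygotes.

    Assumed, simplified dropout model: the derived allele is not sequenced.
    """
    observed = [list(site) for site in genotypes]
    for site_idx, sample_idx in dropped:
        if observed[site_idx][sample_idx] == 1:
            observed[site_idx][sample_idx] = 0
    return observed


# Hypothetical data: 3 sites genotyped in 4 diploid samples.
true_genotypes = [
    [0, 1, 1, 2],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
]
# Hypothetical dropout events as (site index, sample index) pairs.
observed_genotypes = apply_dropout(true_genotypes, {(0, 1), (2, 0)})

true_sfs = site_frequency_spectrum(true_genotypes)
observed_sfs = site_frequency_spectrum(observed_genotypes)

# Relative loss of heterozygosity caused by dropout.
het_deficit = 1 - heterozygosity(observed_genotypes) / heterozygosity(true_genotypes)
```

In this toy case the affected sites move to lower derived-allele counts in `observed_sfs`, and `het_deficit` is positive, which is the direction of the bias the abstract reports. The size of the effect here depends entirely on the made-up data and says nothing about the ~16% figure.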

DOI: http://dx.doi.org/10.1111/1755-0998.13324

